% language=us runpath=texruns:manuals/musings

% musings-roadmap.tex, last modification: 2021-10-28 13:50

% \showfontkerns

\startcomponent musings-roadmap

\environment musings-style

\startchapter[title={\METATEX, a roadmap}]

% \startlines \setupalign[flushright]
% Hans Hagen
% Hasselt NL
% September 2018
% \stoplines

\startsection[title={Introduction}]

Here I will briefly wrap up the state of \LUATEX\ and \CONTEXT\ in fall 2018. I
made the first draft of this article as preparation for the \CONTEXT\ meeting
where we also discussed the future. I updated the text afterwards to match the
decisions made there. It's also a personal summary of thoughts and discussions
with team members about where to move next.

\stopsection

\startsection[title={The state of affairs}]

After a dozen years the development of \LUATEX\ has reached a state where adding
more functionality and|/|or opening up more of the internals does not make much
sense. Apart from fixes and maybe some minor extensions, version 1.10 is what you
get. Users can do enough in \LUA\ and there is not much to gain in convenience
and performance. Of course some of the code can and will be cleaned up, as we
still see the effects of going from \PASCAL\ to \CWEB\ to \CCODE. In the process
consistency is on the radar so we might occasionally add a helper. But we also
don't want to move too far away from the original code, which is for instance why
we keep names, keys and other properties found in original \TEX, which in turn
leads to some inconsistencies with extensions added over time. We have to accept
that.

Because \LUATEX\ development is closely related to \CONTEXT\ development,
especially \MKIV, we've also reached the moment that we can get rid of some older
code and assume the latest \LUATEX\ to be used. Because we do so much in \LUA\
the question is always to what extent the benefits outweigh the drawbacks. Just
in case you wonder why we use \LUA\ extensively, the main reason is that it is
easier and more efficient to manage data in this language and modern typesetting
needs much data. It also permits us to extend regular \TEX\ functionality. But,
one should not overrate the impact: we still let \TEX\ do what \TEX\ is best at!

Performance is quite important. It doesn't make sense to create a powerful
typesetting system where processing a page takes a second. We have discussed
performance before since one of the complaints about \LUATEX\ is that it is slow.
A simple, basic test is this:

\starttyping
\starttext
    \dorecurse{1000}{\input tufte \par}
\stoptext
\stoptyping

This involves loading a file 1000 times (and reporting that on the console, which
can influence runtime), typesetting paragraphs, splitting off pages and of course
loading fonts and saving to the \PDF\ file. When I run this on a modest machine,
I get these (relative) timings for the (about) 225 pages:

\starttabulate[|l|c|c|c|c|]
\BC \TEX\ engine used  \BC \PDFTEX \BC \LUATEX \BC \LUAJITTEX \BC \XETEX \NC \NR
\BC runtime in seconds \NC 2.0     \NC 3.9     \NC 3.0        \NC 8.4    \NC \NR
\stoptabulate

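Dividing the (about) 225 pages by these runtimes gives a rough feeling for the
throughput. The numbers below are plain arithmetic on the table above, not
separate measurements, so they inherit the rounding of the page count and of the
timings:

\starttabulate[|l|c|c|c|c|]
\BC \TEX\ engine used              \BC \PDFTEX \BC \LUATEX \BC \LUAJITTEX \BC \XETEX \NC \NR
\BC pages per second (approximate) \NC 113     \NC 58      \NC 75         \NC 27     \NC \NR
\BC runtime relative to \PDFTEX    \NC 1.00    \NC 1.95    \NC 1.50       \NC 4.20   \NC \NR
\stoptabulate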

Now, as expected the 8 bit \PDFTEX\ is the winner here but \LUATEX\ is not doing
that badly. I don't know why \XETEX\ is so much slower, maybe because its 64 bit
binary is less optimal. I once noticed that a 64 bit \PDFTEX\ performed worse on
such a test than \LUATEX, for which I always use 64 bit binaries.

If you consider that often much more is done than in this example, you can take
my word that \LUATEX\ quickly outpaces \PDFTEX\ on more complex tasks. In that
sense it is now our benchmark. It must be said that the \MKIV\ code is probably a
bit more efficient than the \MKII\ code but that doesn't matter much in this
simple test because hardly any macro magic happens here; it mostly tests basic
font processing, paragraph building and page construction. I don't think that I
can squeeze out more pages per second, at least not without users telling me
where they encounter bottlenecks that don't result from their style coding. It's
no problem to write inefficient macros (or styles) so normally a user should
first carefully check her|/|his own work. Using a more modern \CPU\ with proper
caching and an \SSD\ helps too.

So, to summarize, we can say that with version 1.10 \LUATEX\ is sort of finished.
Our mission is now to make \LUATEX\ robust and stable. Things can be added and
improved, but these are small and mostly consistency related.

\stopsection

\startsection[title={More in \LUA}]

Till now I have always managed to add functionality to \CONTEXT\ without
hampering performance too much. Of course the biggest challenge is always in
handling fonts and common features like color because that all happens in \LUA.
So, the question is, what if we delegate more of the core functionality to
\LUA ? I will discuss a few options because the \CONTEXT\ developers and users
need to agree on the path to follow. One question there is whether the possible
performance hits (which can be an inconvenience) are compensated by better and
easier typesetting.

Fonts, colors, special typesetting features like spaced kerning, protrusion,
expansion, but also dropped caps, line numbering, marginal notes, tables,
structure related things, floats and spacing are not open for much discussion.
All the things that happen in \LUA\ combined with macros are there and will stay.
But how about hyphenation, paragraph building and page building? And how about a
leaner and meaner, future safe engine?

Hyphenation is handled in the \TEX\ core. But in \CONTEXT\ one has been able to
use a \LUA\ based variant for years already. There is room for extensions and
improvements there. Interestingly, performance is more or less the same, so this
is an area where we might switch to the \LUA\ method eventually. It compares to
fonts, where node mode is more or less the standard and base mode the old way.

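A minimal sketch of how both choices look at the user end, assuming a recent
\MKIV. The method name \type {traditional} for the \LUA\ based hyphenator should
be checked against the installed version, and the feature set names
\type {nodedemo} and \type {basedemo} are made up for this example.

\starttyping
% assumption: "traditional" selects the Lua based hyphenator
\setuphyphenation[method=traditional]

% node mode: features are applied in Lua, base mode: by the engine
\definefontfeature[nodedemo][default][mode=node]
\definefontfeature[basedemo][default][mode=base]

\starttext
    {\definedfont[Serif*nodedemo]\input tufte \par}
    {\definedfont[Serif*basedemo]\input tufte \par}
\stoptext
\stoptyping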

Building the paragraphs in \LUA\ is also available in \MKIV, although it needs an
update. Again performance is not that bad, so when we add features not possible
(or hard to do) in regular \TEX, it might actually pay off to default to the
par builder written in \LUA.

The page builder is also doable in \LUA\ but so far I only played a bit with a
\LUA\ based variant. I might pick up that thread. However, if we switch to
\LUA\ there, it might come with a bit of a penalty, unless we combine it with some
other mechanisms which is not entirely trivial, as it would mean a deviation from
the way \TEX\ does it normally.

How about math? We could at some point do math rendering in \LUA\ but because the
core mechanism is the standard, it doesn't really make much sense. It would also
touch the soul of \TEX. But, I might give it a try, just for fun, so that I can
play with it a bit. It's typically something for cold and rainy days with some
music in the background.

We already use \LUA\ in the frontend: locating and reading files in \TEX,
\XML, \LUA\ and whatever input format. Normalization and manipulation are all
active and available. The backend also depends on \LUA, for instance for the
support of special \PDF\ features and exporting to \XML . The engine still
handles the page stream conversion, font inclusion and object management.

The inclusion of images is also handled by the engine, although in \CONTEXT\ we
can delegate \PDF\ inclusion to \LUA. Interestingly, this has no performance
hit.

With some juggling the page stream conversion can also be done in \LUA, and I
might move that code into the \CONTEXT\ distribution. Here we do have a
performance hit: about one second more runtime on the 14 seconds needed for the
300 page \LUATEX\ manual and just over half a second on an 11 second
\LUAJITTEX\ run. The manual has lots of tables, verbatim, indices and uses color
as well as a more than average number of fonts and much time is spent in \LUA. So
there is a price to pay there. I tried to speed that up but there is not much to
gain there.

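To put the page stream numbers in perspective: relative to the total runtime,
and reading the half second loosely as 0.5 (an assumption, the real value is a
bit higher), the overhead comes down to roughly

\startformula
\frac{1}{14} \approx 7\% \quad \hbox{(\LUATEX)}
\qquad
\frac{0.5}{11} \approx 4.5\% \quad \hbox{(\LUAJITTEX)}
\stopformula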

So, say that we default to \LUA\ based hyphenation, which enables some new
functionality, \LUA\ based par building, which permits some heuristics for corner
cases, and \LUA\ based page building, which might result in more control over
tricky cases. A total performance hit of some 5\% is probably acceptable,
especially because by that time I might have replaced my laptop and won't notice
the degradation. This still fits in the normal progress and doesn't really demand
a roadmap or wider acceptance. And of course we would still use the same
strategies as implemented in traditional \TEX\ as default anyway.

\stopsection

\startsection[title={A more drastic move}]

More fundamental is the question whether we delegate more backend activity to
\LUA\ code. If we decide to handle the page stream in \LUA, then the next
question is, why not also delegate object management and font inclusion to
\LUA ? Now, keep in mind that this is all very \CONTEXT\ specific! For more than
a decade we have delegated a lot to \LUA, and we also have a rather tight
control over this core functionality. This would mean that \CONTEXT\ doesn't
really need the backend code in the engine. \footnote {For generic packages like
TikZ we (can) provide some primitive emulators, which are rather trivial to
implement.}

That situation is actually not unique. For instance, for a while already we
haven't needed the \LUATEX\ font loader either, as loading the \OPENTYPE\ files
is done in \LUA. So, we could also get rid of the font loader code. Currently
some code is shared with the font inclusion in the backend but that can be
isolated.

You can see a \TEX\ engine as being made from several parts, but the core really
concerns only two processes: reading, storing and expanding macros on the one
hand, and converting a stream of characters into lines, paragraphs, pages etc.\
on the other. Fonts are mostly an abstraction: they are visible in so called
glyph nodes as font identifier (a number) and character code (also a number)
properties. The result, nowadays being \PDF, is also an abstraction: at some
point the engine converts the to be shipped out box into \PDF\ instructions, and
in our case, relatively simple ones. The backend registers which characters and
fonts are used and also includes the right resources. But, the backend is not
part of the core as such! It has been introduced in \PDFTEX\ and is a so called
extension.

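How a glyph node exposes these two numbers can be seen with a few lines of \LUA.
This is only a sketch: the use of box register 0 and the sample word are
arbitrary choices made for the illustration, and the output goes to the console.

\starttyping
\starttext

\setbox0\hbox{roadmap}

\startluacode
    -- walk over the glyph nodes in the box and report the two
    -- properties the core knows about: font id and character code
    local glyph = node.id("glyph")
    for n in node.traverse_id(glyph, tex.getbox(0).list) do
        print(n.font, n.char)
    end
\stopluacode

\stoptext
\stoptyping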

So, what does that all mean for a future version of \CONTEXT\ and \LUATEX ? It
means that we can decide to follow up with a \CONTEXT\ that does more in \LUA,
which means not hard coded in a binary, on the one hand, but that we can also
decide to strip the engine of non|-|core code. But, given that \LUATEX\ is also
used in other macro packages, this would mean a different engine. We cannot say
that \LUATEX\ is stable when we also experiment with core components.

We've seen folks picking up experimental versions assuming that these are
precursors to official code. So, in order to move on we need to avoid confusion:
we need to use another name. Choosing a name is always tricky but as Taco already
registered the \METATEX\ domain, and because in the \CONTEXT\ distribution you
will find references to \METATEX, we will use that name for the future engine.
Adding \LUA\ to that name makes sense but then the name would become too long.

The main difference between \METATEX\ and \LUATEX\ would be that the former has
no file lookup library, no hardcoded font loader, and no backend generator (but
possibly some helpers, and these need time to evolve). We're basically back where
\TEX\ started but instead of coding these extensions in \PASCAL\ or \CCODE\ we
use \LUA. We're also kind of back to when we first started experimenting with
\LUATEX\ in \CONTEXT\ where test, write and rewrite were going in parallel. But,
as said, we cannot impose that on a wide audience.

If we go for such a lean and mean follow up, then we can also do a more drastic
cleanup of obsolete code in \CONTEXT\ (dating from \ETEX, \PDFTEX, \ALEPH, etc.).
We then are sort of back to where it all started: we go back to the basics. This
might mean dropping some primitives (one can define them as dummies). Of course
we could generalize some of the \CONTEXT\ code to provide the kicked out
functionality but would that pay off? Probably not.

Just for the record: moving the handling of macros, registers, grouping, etc.\
to \LUA\ is not really an option as the performance hit would make a large system
like \CONTEXT\ sort of unusable: it's no option and not even considered (although
I must admit that I have some experimental \LUA\ based \TEX\ parser code around).

It is quite likely that building \METATEX\ from source will for the moment be an
option to the build script. But we can also decide to simplify that process,
which is possible because we only need one binary. But in general we can assume
that one can generate \METATEX\ and \LUATEX\ from the same source. A first step
probably is a further isolation of the backend code. The fontloader and file
handling code already can be made optional.

Given that we only need one binary (be it \LUATEX\ or \METATEX) and nowadays
only use \OPENTYPE\ fonts, one can even start thinking of a mini distribution,
possibly with a zipped resource tree, something we experimented with in the early
days of \LUATEX.

Another thought I have been playing with is a better separation between low level
and high level \CONTEXT\ commands, and whether the low level layer should be more
generic in nature (so that one can run specific packages on top of it instead of
the whole of \CONTEXT) but that might not be worth the trouble.

\stopsection

\startsection[title={Interlude}]

If we look at the future, it's good to also look at the past. Opening up \TEX\
the way we did has many advantages but also potential drawbacks. It works quite
well in \CONTEXT\ because we ship an integrated package. I don't think that there
are many users who kick in their own callbacks. It is possible but completely up
to the user to make sure things work out well. Performance hits, interference,
crashes: those who interfere with the internals can sort that out themselves. I'm
not sure how well that works out in other macro packages but it is a time bomb if
users start doing that. Of course the documented interfaces to use \LUA\ in
\CONTEXT\ are supported. So far I think we have not yet been bitten in the tail.
We keep this aspect out of the discussion.

Another important aspect is stability of the engine. Sometimes we get suggestions
for changes or patches that work for a specific case but for sure will have side
effects on \CONTEXT. Just as we don't test \LATEX\ side effects, \LATEX\ users
don't check \CONTEXT. And we're not even talking of users who expect their code
to keep working. A tight control over the source is important but we will not be
around for ever. This means that at some point \LUATEX\ should not be changed any
more, even when we observe side effects we want to get rid of, because these side
effects can be in use. This is another argument for a stripped down engine. The
less there is to mess with, the less the mess.

\stopsection

\startsection[title={Audience}]

So how about \CONTEXT\ itself? Of course we can make it better. We can add more
examples and more documentation. We can try to improve support. The main question
for us (as developers) is who our audience actually is. From the mails coming to
the \CONTEXT\ support list it looks like a rather diverse group of users.

At \TEX\ meetings there are often discussions about promoting \TEX. I can agree
on the fact that even for simple documents it makes a lot of sense to use \TEX,
but who will take the first hurdles? How many people really produce a lot of
documents? And how many need \TEX\ after maybe a short period of (enforced) usage
at the university?

It's not trivial to recognize the possibilities and power of the
\LUATEX|-|\CONTEXT\ combination. We never got any serious requests for support
from large organizations. In fact, we do use this combination in a few projects
for educational publishers, but there it's actually the authors and editors doing
the work. It's seldom company policy to use tools that efficiently automate
typesetting. I dare to say that publishers are not really an audience at all:
they normally delegate the task. They might accept \TEX\ documents but have them
rekeyed or adapted far|-|far|-|away and as cheaply as possible. Thinking of it,
Don Knuth's main reason for writing \TEX\ in the first place was the ability to
control the look and feel and quality. It was developments at typesetters and
publishers that triggered development of \TEX . It was user demand. And the
success of \TEX\ was largely due to the unique personality and competence of the
author.

System integrators qualify as audience but I fear that \TEX\ is not considered
hip and modern. It doesn't seem to matter if you can demonstrate that it can do a
wonderful job efficiently and relatively cheaply. Also the fact that an
installation can be very stable in the long run is of no importance. Maybe that
audience (market place) is all about \quotation {The more we have to program and
update regularly, the merrier}. Marketing \TEX\ is difficult.

Those who render multiple products, maintain manuals, or have to render many
documents automatically qualify as audience. But often company policies,
preferred suppliers, so called standard tools etc.\ are used as arguments against
\TEX. It's a missed opportunity.

One needs a certain mindset to recognize the potential and the question is, how
do we reach that audience. Drawing a roadmap for that is not easy but worth
discussing. We're open for suggestions.

% \footnote {It's kind of interesting that recently the \TEX\ User Group announced
% its presence on Facebook and Twitter. Apart from wondering how that gets updated,
% one can also wonder how many potential (or even current) users go there, given
% that these platforms are subjected to rise and fall. I'm on neither of them and
% don't plan to. Kids (our future users) that I know already said goodbye to them.
% We'll see how that works out.}

\stopsection

\startsection[title={Conclusion}]

At the \CONTEXT\ user meeting those present agreed that moving forward this way
makes sense. This means that we will explore a lean and mean \METATEX\ alongside
\LUATEX. There is no rush and it's all volunteer work so we will take our time
for this. It boils down to some reshuffling of code so that we can remove the
built|-|in font loader, file handling, and probably also \SYNCTEX\ because we can
emulate that. Then the backend with its font inclusion code will be cleaned up a
bit (we even discussed only supporting modern wide fonts). It's no big deal to
adapt \CONTEXT\ to this (so it can and will support both \LUATEX\ and \METATEX).
Eventually the backend might go away but now we're talking years ahead. By then
we can also explore the option to make \METATEX\ start out as a \LUA\ function
call (the main control loop) and become reentrant. There will probably not be
many changes to the opened up \TEX\ kernel, but we might extend the \METAPOST\
part a bit (some of that was discussed at the meeting) especially because it is a
nice tool to visualize big data.

As with \LUATEX\ development we will go in small steps so that we keep a working
system. Of course \LUATEX\ is always there as stable fallback. The experiments
will mostly happen in the experimental branch and binaries will be generated
using the compile farm on the \CONTEXT\ garden, just as happens now. This also
limits testing and exploring to the \CONTEXT\ community so that there are no side
effects for mainstream \LUATEX\ usage.

Nowadays, instead of roadmaps, we tend to use navigational gadgets that adapt
themselves to the situation. On the road by car this can mean a detour and when
walking around it can be going to suggested points of interest. During the
excursion at the meeting, we noticed that after the drivers (navigators)
synchronized their gadgets with Jano, the routes that were followed differed a
bit. We saw cars in front of us going in a different direction and cars behind us
arriving from a different direction. So, even when we talk about roadmaps, our
route can be adapted to the situation.

Now here is something to think about. If you look at the \TEX\ community you will
notice that it's an aging community. User groups seem to lose members, although
the \CONTEXT\ group is currently still growing. Fortunately we see a new
generation taking interest and the \CONTEXT\ users are a pleasant mix and it
makes me stay around. I see it as an \quote {old timer's} responsibility to have
\TEX\ and its environment in a healthy state by the time I retire from it
(although I have no plans in that direction). In parallel to the upcoming
development I think we will also see a change in \TEX\ use and usage. This aspect
was also discussed at the meeting and for sure will get a follow up on the
mailing lists and future meetings. It might as well influence the decisions we
make in the upcoming years. So far \TEX\ has never failed us in its flexibility
and capacity to adapt, so let's end on that positive note.

\stopsection

\stopchapter

\stopcomponent
