% ontarget-eventually.tex (last modification: 2024-01-16 10:21)
% language=us runpath=texruns:manuals/ontarget

\startcomponent ontarget-eventually

\environment ontarget-style

\startchapter[title={Eventually 1.0}]

\startsection[title=Reflection]

This is just a short reflection on how we came to version 1.0 of \LUAMETATEX.
Much has already been said in articles and history documents. There is nothing
in here that is new but I just occasionally like to wrap up the current state.
At the time of writing, which happens to be the \CONTEXT\ 2021 meeting, we're
somewhere between 0.9 and 1.0 and as usual it reflects a current state of mind.

\stopsection

\startsection[title=Introduction]

The development of \LUAMETATEX\ took a bit more time than I had planned when I
started with it. I presume that it also relates to the way the \TEX\ program is
looked at: a finished program that converges to a bugless state. But, with
version 1.0 nearby it makes sense to reflect on the process. Before I go into
details I want to remark that when I wrote \CONTEXT\ I looked at this program
from the macro end. I had no real reason to look into the code, and figuring out
what happens in a black box is a challenge (and a kind of game) in itself. At the
time I started using \TEX\ I had done my share of complex and relatively large
scale programming in \PASCAL\ and \MODULA\ so it's not that I was afraid of
languages. It was before the Internet took off and, not being in academia and
not being connected, one had to figure things out anyway. I did have Don's 5
volume \TEX\ series but stuck to the \TEX\ book. Being on \MSDOS\ I couldn't
compile the program anyway, definitely not without the source at hand. I did
read the first chapters of the \METAFONT\ book, but apart from being intrigued
by it, it was not until I ran into \METAPOST\ that my knowledge of that language
took off. Of course I had browsed \TEX\ the program but not in a systematic way.

I was involved with \PDFTEX\ development but stayed at my end of the line: needs,
applications, testing and suggestions. With \LUATEX\ that line got crossed,
triggered by the \LUA\ interfaces, but while I focused on the \TEX\ end, Taco
did the \CCODE, and we had pleasant and intense daily discussions on how to move
forward. I could not get away any longer with the abstraction but had to deal
with nodes and such, which was okay as we were hitting the boundaries of
convenient programming solutions in \CONTEXT.

When we started our \LUATEX\ journey the \TEX\ follow|-|up most widely used,
\PDFTEX, did have some \ETEX\ extensions but in retrospect only a few of those
were of relevance to us, like the concept of \type {\protected} macros \footnote
{In \CONTEXT\ we always had a protection mechanism and from the \LUATEX\ source I
learned that the macro based solution was basically the same as the one used in
the engine.} and the larger set of registers. And the \ETEX\ project, in spite of
occasional discussions, never became a continuous effort. The \NTS\ project that
was related to \ETEX\ and had as objective an extensible successor produced a
\JAVA\ implementation but that one was never useful (for starters, its
performance was such that it could not be used) and I didn't really look forward
to spending time on \JAVA\ anyway. Taco and I played with an extended \ETEX\ but
lack of time made that one end up in the archive.

There were some programmatic additions to \PDFTEX\ but its main attributes were
protrusion, expansion and a \PDF\ backend (\THANH's thesis subject). Features
like position tracking were handy but basically just a built|-|in variant of a
concept we already had come up with at the \DVI\ level (using a postprocessing
script that later became \type {dvipos}). There was \OMEGA\ with a directional
model but this engine was always more of an academic project, not a production
system. \footnote {\ALEPH\ was more reliable but never took off, if only because
\PDFTEX\ had a backend.} It was \XETEX\ that moved the \TEX\ world into the
\UNICODE\ domain and opened the engine up to new font technologies. Although
\UTF8\ was already doable in earlier engines (which is why \CONTEXT\ used it
already for some internals), native support was way more convenient.

It was clear that if we wanted to move on we had to make more fundamental steps,
but in such a way that it still fit in with what people expect from \TEX. While
it started as a playground by embedding the \LUA\ interpreter, it quickly became
clear that we could open up the internals in fundamental ways, thereby also
getting around the discussion about to what extent \TEX\ could and should be
extended: that discussion could be and was postponed by the opening up. Because
we already foresaw some of the possibilities it was decided to freeze \CONTEXT\
for the older engines. It was around the first \CONTEXT\ meeting that the \MKII\
and \MKIV\ tags showed up, around the same time that \LUATEX\ became usable. More
than a decade later, when \LUATEX\ basically had become frozen, at another
meeting it was decided to move on with \LUAMETATEX: the \LUATEX\ project was
pretty much a \CONTEXT\ project and that follow up would be even more driven by
\CONTEXT\ users and usage. But how does it all feel 15 years later? I'll try to
summarize that below. It will also explain why I got more audacious in extending
the \LUATEX\ engine into what is now \LUAMETATEX. This also relates to the fact
that at some point I realized that progress just demands making decisions, and it
happens that we can make these in the perspective of \CONTEXT\ without side
effects for other \TEX\ usage. It is also fun to experiment.

\stopsection

\startsection[title=Extending necessary parts]

The \PDFTEX\ program, having a backend built in, already supports the usage of
wide \TRUETYPE\ fonts but it was \XETEX\ that first made it possible to use them
directly in the frontend. But that happened within the concept of traditional
\TEX, especially when it comes to math. There are some extra primitives to deal
with scripts and languages but (and this is personal) I decided that these didn't
really fit in the way \CONTEXT\ looks at things so \MKII\ doesn't support
anything beyond the fonts. The \XETEX\ program first was available on Apple
computers and font support was closely related to its technology as well as
technologies that relate to where the program originates. Later other operating
systems became supported too.

We decided in \LUATEX\ to delegate \quote {everything fonts} to \LUA, for a good
reason: we didn't want to be platform dependent. And using libraries has the
danger of periodically enforced fundamental changes because in these times
software politics and fashion have short cycles. The fact that \XETEX\ later
changed the font engine proved that this was a good decision. At some point
\LATEX\ decided to use a special version of \LUATEX\ that uses a font library as
alternative, which is fine, but that also introduces a dependency (and frequent
updating of the binary). The \LUATEX\ engine has a slim variant of the
\FONTFORGE\ library built in for reading various font formats and its backend
can embed subsets of \OPENTYPE, \TYPEONE\ and traditional bitmap fonts. At some
point \CONTEXT\ switched to its own \LUA\ based font file interpreter and
experimented with a \LUA\ based backend that later became exclusive for
\LUAMETATEX. It became clear that we could do with less code in the engine and
thereby fewer dependencies.

In this perspective it is also good to notice that the \LUATEX\ engine has no
real concept of \UNICODE: it just expects \UTF8\ and that's it. All internals
provide enough granularity to support \UNICODE. The rest has to come from the
macro package, as we know that each one does it its own way. There are no
dependencies on \UNICODE\ libraries. You only have to look at what ends up on
your system when you install a program that just juggles bytes to notice that by
including one library a whole lot gets drawn in, most of which is not relevant to
the program and we don't want that. It might start small but who knows where
one ends up. If we want users to be able to compile the program, we don't want
to end up in dependency hell.

The \LUATEX\ project was, apart from curiosity and potential usage in \CONTEXT,
initially also driven by the Oriental \TEX\ project that aimed at high quality
bidirectional typesetting. There the focus was on fonts as well as processing
paragraphs. That triggered all kinds of opening up of internals and once
\CONTEXT\ started swapping (and adding) mechanisms using \LUA\ more came to
fruition. In the end it took a decade to reach version 1.0 and we could have
stopped there knowing that we're quite prepared for the future.

Although the whole \TEX\ concept didn't change, there were some fundamental
changes. From the documentation by Don Knuth it becomes clear that interpreting
is closely interwoven with typesetting: the so called main interpretation loop
calls out to font processing, ligature building, hyphenation, kerning, breaking
lines, processing pages, etc. In \LUATEX\ these steps became more independent
simply because the processing of fonts (via \LUA) came down to feeding a linked
list of nodes to a callback function. That list should be hyphenated if needed (a
now separated step) and if needed the traditional font processing could be
applied (ligature building and kerning). But, although one can say that we
already got away from the way \TEX\ works internally, most documentation of the
original program still applied, simply because the fundamental approach was the
same. We didn't feel too guilty about it and I don't think anyone objected. By
the way, the same is true for the math subsystem: we had to adapt it to
\OPENTYPE\ parameters and formula construction and although that was inspired by
\TEX\ it definitely was different, even to the extent that the math fonts that
evolved in the community are now a strange hybrid of old and new.

\stopsection

\startsection[title=Getting around the frozen machinery]

So why did the \LUAMETATEX\ project start at all? There has been plenty written
on how \LUATEX\ evolved and the same is true for \LUAMETATEX\ so I'm not going to
repeat that here. It is enough to know that the demand for a stable and frozen
\LUATEX\ by users other than \CONTEXT\ simply doesn't go well with further
experiments and we still had plenty of ideas. Because at some point Taco had no
time I was already responsible for quite some additions to the \LUATEX\ program
so it was no big deal to switch to an even more extensive mix of working with
\quotation {\TEX\ the macro language} and \quotation {\TEX\ the program}.

The first priorities were with some basic cleanup: remove unused font code, get
rid of some ever changing libraries and remove the backend related code. I could
do that because I already had a \LUA\ driven backend in \MKIV\ (which was removed
later on) and font handling was already all done in \LUA. The idea was to go lean
and mean, and indeed, even with all kinds of extensions, the binary is much
smaller than its predecessor, which is nice because it is also a \LUA\ engine.
Simplifying the build so that users can easily compile it themselves was also of
high priority because I considered the rather large and complex setup as a time
bomb. And I also had my doubts if we could prevent the \LUATEX\ engine from
evolving over time in a way that made it less usable for \CONTEXT.

But, interestingly, all this extending and pruning didn't feel like I was
violating the concept of a long term stable engine. In fact, original \TEX\ has
no backend either, just a simple binary serialization of output (\DVI). And by
removing some font related frontend code we actually came closer to the original.
I suppose that these decisions slowly made me aware of the fact that there was no
reason to not consider more drastic extensions. After all, wasn't the \ETEX\
project also about extending? \footnote {Although none of the ideas that Taco and
I discussed on our numerous trips to meetings all over the world ever made it
into that engine.}

When we look at \LUAMETATEX\ 1.0 we still see the expected machinery there but
many subsystems have been extended. Once I made the decision that it's now or
never, each subsystem got evaluated against my long term wish list and usage in
\CONTEXT. Now, let's be clear: I basically can do all I want in \LUATEX\ but that
doesn't mean it's always a pretty solution. And to make the \CONTEXT\ code base
easier to understand for users, even if it is already rather consistent and set
up to be readable, is one of my objectives. I spend a lot of time on readability:
I cannot stand a bad looking source and over time the look and feel is also
determined by the way the \CONTEXT\ interfaces and related syntax highlighting
evolved, especially the \TEX, \METAPOST, \LUA\ mix. This is why \LUAMETATEX\ has
some extensions to the macro language.

So, while some might argue that \quotation {It can already be done.} I decided to
ignore that argument when the actual solutions came too close to \quotation {See
how well I can do this using dirty tricks!}. If we can do better, without harming
the system, let's do it: \LUA\ did it, \CCODE\ did it and even Don Knuth switched
from \PASCAL\ to \CCODE. If we want we can put all the extensions under the
\quotation {\TEX\ is meant to be extended} umbrella, as long as we call it
differently, which is what we do. But I admit that one has to (emotionally) cross
a boundary of feeling comfortable with fundamental additions to a program like
\TEX. But I've been around long enough to not feel guilty about it.

So in the end that means that for instance marks were extended, inserts got more
options, glyphs and boxes have way more properties, (the result and handling of)
paragraphs can be better controlled, page breaking got hooks (and might be
extended), local boxes got redone, adjustments were extended, the math machinery
has been completely opened up, hyphenation became more powerful, the font
mechanism got more control and new scaling features, alignments got some
extensions, we can do more with boxes, etc. But often I still first had to
convince myself that it's okay to do so. After all, none of this had happened
before and to my knowledge also has not been considered in ways that resulted in
an implementation (but I might be wrong here). It helps that I can test out
experiments in production versions of \LMTX\ and that users are quite willing to
test.

\stopsection

\startsection[title=Extending the macro language]

In the previous section some mechanisms were mentioned, but before \TEX\ even
ends up there macros and primitives come into play. The \LUATEX\ engine already
has some handy extras, like ways to prepend and append tokens and a limited so
called \quote {local control} mechanism (think of nested main loops). There are
some new look ahead and expansion related primitives and csname related tricks.
There are a few more conditionals too. Details can be found in the manual and
articles.

In \LUAMETATEX\ some more got added and some of these mechanisms could be
improved and the reason again is that I aim at readable code. Most programming
languages for instance have conditionals with some kind of continuation (like
\type {elseif}) and so I added that to \TEX\ too: \type {\orelse}. Actually,
there are even more new conditionals than in \LUATEX. Yes, we don't really need
these, especially because in \LUAMETATEX\ we can now extend the primitive
language via \LUA, but I wanted to improve readability deep down in \CONTEXT. It
also reduces the clutter when logging, although logging itself has been quite a
bit overhauled. There is less need for (often not that natural) intermediate
layers when we can do it properly in primitive \TEX\ lingua.

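As a small sketch of what such a continuation looks like in practice, compare
the traditional nesting with the \type {\orelse} variant. The use of \type
{\scratchcounter} and the three branches are assumptions made for this
illustration; they are not taken from the \CONTEXT\ sources.

\starttyping
% traditional nesting: every extra branch needs its own \fi
\ifnum\scratchcounter<0
    negative
\else
    \ifnum\scratchcounter=0
        zero
    \else
        positive
    \fi
\fi

% with \orelse the branches stay at one level and share a single \fi
\ifnum\scratchcounter<0
    negative
\orelse\ifnum\scratchcounter=0
    zero
\else
    positive
\fi
\stoptyping

Both fragments select the same branch, but the second one stays flat when more
branches get added.
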
More fundamental was extending the way \TEX\ deals with macro arguments. Although
the extensions to parsing them use specifiers that make them upward compatible,
I admit that even I have to consult a list of possibilities every now and then,
but in the end they make things better (performance wise, with less code). As a
side effect the macro machinery could be optimized a bit (expansion as well as
the save stacking).

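A minimal sketch of what such specifiers can look like is given here. It
assumes the \type {\tolerant} prefix and the \type {#*} specifier as described
in the \LUAMETATEX\ manual; the macro name \type {\demo} is made up for this
example and only two of the available specifiers are shown.

\starttyping
% \tolerant : quit scanning when a delimiter is not found instead of failing
% #*        : skip optional spaces before looking for the next argument
\tolerant\def\demo[#1]#*[#2]{first: (#1), second: (#2)}

\demo[a][b]  % both arguments are given
\demo[a] [b] % the space between the brackets is skipped
\demo[a]     % the second argument is absent and ends up empty
\stoptyping

In a traditional engine the same behavior typically takes a chain of look
ahead helper macros, which is the kind of intermediate layer that can now be
avoided.
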
There are a few more ways to store integers and dimensions (these fit in nicely),
there are additions to grouping, some primitives have more keywords and
therefore scanners have been extended, and the \ETEX\ expression handlers have
alternative variants.

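For instance, assuming the \type {\integerdef} and \type {\dimensiondef}
primitives as documented in the \LUAMETATEX\ manual, values can be stored under
a name of their own. The names \type {\MyColumns} and \type {\MyGutter} are
invented for this sketch.

\starttyping
\integerdef  \MyColumns 3
\dimensiondef\MyGutter  12pt

% the stored values can be used where a number or dimension is expected
\ifnum\MyColumns>2 many \else few \fi columns

\scratchdimen\dimexpr\MyGutter*\MyColumns\relax

\the\MyColumns\space columns with a gutter of \the\MyGutter
\stoptyping
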
Although this is a sensitive aspect of \TEX\ when it comes to compatibility, at
some point I decided that it made no sense to not expose more details about
nodes, input, and nesting states. The grouping and input related stacks had been
optimized in the meantime so reporting in that area was already not compatible.
Improving logging is an ongoing effort and I don't really lose sleep over it not
being compatible, as long as it gets better. There is now also some tracing for
marks, inserts, math and alignments.

\stopsection

\startsection[title=Refactoring the code base]

This is again an emotionally laden decision: what to touch and what to keep. For
sure we keep the original comments but that doesn't make it literate. We started
out with a \CCODE\ base that came from converted \PASCAL\ \WEB.

The input machinery is a bit different due to the fact that \LUA\ can (and often
has to) kick in. In \LUAMETATEX\ it's even more different because even more goes
via \LUA. We cannot even run the engine without a basic set of callbacks
assigned: if you don't like that, use \LUATEX. Does this violate the \TEX\
concept? Not really, because system dependencies are explicitly mentioned as such
in the source code. We have to adapt to the way an operating system sees files
anyway (eight bit, \UTF8, \UTF16).

We still have many global variables (a practical Knuth thing I guess) but now
they are grouped into structures so that we can more clearly see where they
belong. This involved quite a bit of shuffling and editing but I got there. In
\LUAMETATEX\ all constants (coded in macros) became enumerations, and all hard
coded values too, which was quite a bit of work as well. Probably no one will
notice or realize that, but starting from an existing code base is more work than
starting from scratch, which is what I always did so far. When possible we use
case statements. Most macros became (inline) functions. Complex functions got
better variable names. All functions are in name spaces. This was (and is) a
stepwise process that takes lots of time, especially because \CONTEXT\ users
expect a reasonably stable system and changes like that are prone to errors.

Talking of errors, the error and reporting system has been overhauled, so for
instance we now have a dedicated string formatter. This all happened in several
steps: normalization, consistency, abstraction, formatters, etc. Keep in mind
that we not only have the original messages but also new ones. And we have \TEX,
\LUA\ and \METAPOST\ communicating with the user. Where in \LUATEX\ we have to
conform more to the traditional engine, because that is what other macro packages
rely on, in \CONTEXT\ we have more freedom, so we can make it better and more
detailed. Of course it could all be controlled by configurations but at some
point I decided to kick out variables doing that because it made no sense to
complicate the code base.

Memory management has been overhauled (more dynamic) as has dumping to the (more
efficient) format file. With what is mentioned in the previous paragraphs we can
safely say that in the meantime back porting to \LUATEX\ (which I had in mind)
makes no sense any longer. There is occasionally some pressure to let \LUATEX\ do
the same as other engines (new common features) and that doesn't always fit into
the model. There is no need for \LUAMETATEX\ to follow up on that because often
we already have plenty of possibilities. There is of course still work to do, for
instance I still have to make some variable names in functions more verbose but
that is not fundamental. I also have to go over the documentation in the code. I
might make some interfaces more consistent anyway, so that also would demand
adaptations. And of course the documentation in general always lags behind.

So far I only mentioned dealing with \TEX, but keep in mind that in \LUAMETATEX\
we also have an upgraded \METAPOST: only a \LUA\ backend (from that we can
produce \PDF\ or other output), no font code, a couple of extensions, more
callbacks, \IO\ via \LUA. Scanners make extending the language possible and
injectors make for efficient piping back to \METAPOST. Such extensions are also
possible in \TEX\ and the \LUAMETATEX\ scanning interfaces have been improved and
extended too. We have extra callbacks (but some were dropped), more helpers (most
noticeably in the node namespace), libraries that improve dealing with binary
files, a reworked token library (which in turn led to a reorganization of
command codes in the \TEX\ engine), a few more extensions to \LUA\ file handling
and string manipulations. We got decimal math, complex math, new compression
libraries, better (\LUA) memory management, a few optional library interfaces,
etc. Fortunately that all didn't bloat the binary.

So, because in the meantime \LUAMETATEX\ is quite different from \LUATEX, we can
consider the last one to be a prototype for the real deal.

\stopsection

\startsection[title=Simplifying the build]

This was one of the first things I did. It was a curious process of removing more
and more of the original build (all kinds of dependencies) which is not entirely
trivial because of the way the \LUATEX\ build is set up. I admit that I did try
to stay within the regular source build concept but after a while I realized that
this made no sense so we (Mojca was involved in that) made the move to \CMAKE.
Shortly after that I started using Visual Studio as editing environment (which
saves time and is rather convenient) and native compilation under \MSWINDOWS\
became possible without any special measures (in fact, setting up the build for
\ARM\ processors was more work).

A side effect is that right from the start we could provide binaries for various
platforms via the compile farm on the \CONTEXT\ garden maintained by Mojca, who
also does daily \TEX\ Live builds there. On my machine I use the Windows
Subsystem for Linux for cross compilation but we can also do native builds. And,
with my laptop being a robust 2013 old timer I force myself to make sure that
\LUAMETATEX\ keeps performing well.

\stopsection

\startsection[title=Because it just makes sense]

So, in the end \LUAMETATEX\ is likely the engine most different from the Knuthian
original but from the above one can conclude that this was a gradual process
where I got more audacious over time. In the end the only thing that matters (and
I believe that Don Knuth agrees with this) is that you like writing the code,
feel confident that the code is all right, explore the possibilities, try to
improve the quality and understanding and that successive rewrites can reduce
obscurity. And in my opinion we didn't lose the \TEX\ look and feel and still can
operate well within the established boundaries of the \TEX\ ecosystem. The fact
that most \CONTEXT\ users in the meantime use \LUAMETATEX\ and the related \LMTX\
variant is an indication that they are okay with it, and that is what matters
most.

\stopsection

\stopchapter

\stopcomponent