% language=us runpath=texruns:manuals/musings

\startcomponent musings-history

\environment musings-style

\startchapter[title={All those \TEX's}]

\startlines \setupalign[flushright]
Hans Hagen
Hasselt NL
February 2020
\stoplines

% \startsection[title=Introduction]
% \stopsection

This is about \TEX, the program that is used as part of the large suite of
resources that make up what we call a \quote {\TEX\ distribution}, which is used
to typeset documents. There are many flavors of this program and all end with
\type {tex}. But not everything in a distribution that ends with these three
characters is a typesetting program. For instance, \type {latex} launches the
macro package \LATEX, code that feeds the program \type {tex} to do something
useful. Other formats are Plain (no \type {tex} appended) or \CONTEXT\ (\type
{tex} in the middle). Just take a look at the binary path of the \TEX\
distribution to get an idea. When you see \type {pdftex} it is the program, when
you see \type {pdflatex} it is the macro package \LATEX\ using the \PDFTEX\
program. You won't find this for \CONTEXT\ as we don't use that model of mixing
program names and macro package names.

Here I will discuss the programs, not the macro packages that use them. When you
look at a complete \TEXLIVE\ installation, you will see many \TEX\ binaries. (I
will use the verbatim names to indicate that we're talking of programs.) Of
course there is the original \type {tex}. Then there is its extended, also
official, version \type {etex}, which is mostly known for adding some more
primitives and more registers. There can be \type {aleph}, which is a stable
variant of \type {omega} meant for handling more complex scripts. When \PDF\
became popular the \type {pdftex} program popped up: this was the first \TEX\
engine that had a backend built in. Before that you always had to run an
additional program to convert the native \DVI\ output of \TEX\ into for instance
\POSTSCRIPT. Much later, \type {xetex} showed up, which, like \OMEGA, dealt with
more complex scripts, but using recent font technologies. Eventually we saw \type
{luatex} enter the landscape, an engine that opened up the internals with the
\LUA\ script subsystem; it was basically a follow up on \type {pdftex} and \type
{aleph}.

The previous paragraph mentions a lot of variants and there are plenty more. For
\CJK\ and especially Japanese there are \type {ptex}, \type {eptex}, \type
{uptex} and \type {euptex}. Parallel to \type {luatex} we have \type {luajittex}
and \type {luahbtex}. As a follow up on the (presumed stable) \type {luatex} the
\CONTEXT\ community now develops \type {luametatex}. A not yet mentioned side
track is \NTS\ (New Typesetting System), a rewrite of good old \TEX\ in \JAVA,
which in the end didn't take off and was never really used.

There are even more \TEX's and they came and went. There was \type {enctex} which
added encoding support, there were \type {emtex} and \type {hugeemtex} that
didn't add functionality but made more possible by removing some limits on memory
and such; these were quite important. Then there were vendors of \TEX\ systems
that came up with variants (some had extra capabilities), like \type {microtex},
\type {pctex}, \type {yandytex} and \type {vtex} but they never became part of
the public effort.

For sure there are more, and I know this because not so long ago, when I cleaned
up some of my archives, I found \type {eetex} (extended \ETEX), and suddenly
remembered that Taco Hoekwater and I indeed had experimented with some extensions
that we had in mind but that never made it into \ETEX. I had completely forgotten
about it, probably because we moved on to \LUATEX. It is the reason why I wrap
this up here.

In parallel there have been some developments in the graphic counterparts.
Knuth's \type {metafont} program got a \LUA\ enhanced cousin \type {mflua} while
\type {metapost} (aka \type {mpost} or \type {mp}) became a library that is
embedded in \LUATEX\ (and gets a follow up in \LUAMETATEX). I will not discuss
these here.

If we look back at all this, we need to keep in mind that originally \TEX\ was
made by Don Knuth for typesetting his books. These are in English (although over
time, due to references, he needed to handle scripts other than Latin, be it
just snippets and not whole paragraphs). Much development of successors was the
result of demands with respect to scripts other than Latin and languages other
than English. Given the fact that (at least in my country) English seems to
become more dominant (kids use it, universities switch to it) one can wonder if
at some point the traditional engine can serve us just as well.

The original \type {tex} program was actually extended once: support for mixed
usage of multiple languages became possible. But apart from that, the standard
program has been pretty stable in terms of functionality. Of course, the parts
that make up the extension interface have seen changes but that was foreseeable.
For instance, the file system hooks into the \KPSE\ library and one can execute
programs via the \type {\write} command. Virtual font technology was also an
extension, but that didn't require a change in the program: it only involved
postprocessing the \DVI\ files.

The first major \quote {upgrade} was \ETEX. For quite a while extensions were
discussed but at some point the first version became available. For me, once
\PDFTEX\ incorporated these extensions, it became the default. So what did it
bring? First of all we got more than 256 registers (counters, dimensions, etc.).
Then there are some extra primitives, for instance \type {\protected} that
permits the definition of unexpandable macros (although before that one could
simulate it at the cost of some overhead) and convenient ways to test the
existence of a macro with \type {\ifdefined} and \type {\ifcsname}. Although not
strictly needed, one could use \type {\dimexpr} for expressions. A probably
seldom used extension was the (paragraph bound) right to left typesetting. That
is actually a smaller extension than one might imagine: we just signal where
the direction changes and the backend deals with the reverse flushing. It was
mostly about convenience.

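As a rough illustration of these primitives (a sketch only: the names \type
{\MyMacro} and \type {\MyWidth} are made up for this example and an engine with
the \ETEX\ extensions enabled is assumed), one could write:

\starttyping
% a protected macro is not expanded inside an \edef or a \write
\protected\def\MyMacro{some text}

% test for existence without comparing against an undefined macro
\ifdefined\MyMacro
    \message{MyMacro is defined}%
\fi

% the same test by name; unlike \csname this does not turn an
% unknown control sequence into \relax
\ifcsname MyMacro\endcsname
    \message{MyMacro is still defined}%
\fi

% an expression instead of a chain of \advance operations
\newdimen\MyWidth
\MyWidth=\dimexpr\hsize-2\parindent\relax
\stoptyping
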
The \OMEGA\ project (later followed up by \ALEPH) didn't provide the additional
programming related primitives but made the use of wide fonts possible. It did
extend the number of registers, just by bumping the limits. As a consequence it
was much more demanding with respect to memory. The first time I heard of \ETEX\
and \OMEGA\ was at the 1995 euro\TEX\ meeting organized by the \NTG\ and I was
sort of surprised by the sometimes emotional clash between the supporters of
these two variants. Actually it was the first time I became aware of \TEX\
politics in general, but that is another story. It was also the time that I
realized that practical discussions could be obscured by nitpicking about
using the right terminology (token, node, primitive, expansion, gut, stomach,
etc.) and that one could best keep silent about some issues.

The \PDFTEX\ follow up had quite some impact: as mentioned it had a backend built
in, but it also permitted hyperlinks and such by means of additional primitives.
It added a couple more, for instance for generating random numbers. But it
actually was a research project: the frontend was extended with so called
character protrusion (which lets glyphs hang into the margin) and expansion (a
way to make the output look better by scaling shapes horizontally). Both these
extensions were integrated in the paragraph builder and thereby extend core
code. Adding some primitives to the macro processor is one thing, adapting a
very fundamental property of the typesetting machinery is something else. Users
could get excited: \TEX\ renders a text even better (of course hardly anyone
notices this, even \TEX\ users, as experiments proved).

In the end \OMEGA\ never took off, probably because there was never a really
stable version and because at some point \XETEX\ showed up. This variant was
first only available on Apple computers because it depends on third party
libraries. Later, ports to other systems showed up. Using libraries is not
specific to \XETEX. For instance \PDFTEX\ uses them for embedding images. But, as
that is actually a (backend) extension it is not critical. Using libraries in the
frontend is more tricky as it adds a dependency and the whole idea about \TEX\
was that it is independent. The fact that after a while \XETEX\ switched
libraries is an indication of this dependency. But, if a user can live with that,
it's okay. The same is true for (possibly changing) fonts provided by the
operating system. Not all users care too strongly about long term compatibility.
In fact, most users work on a document, and once finished store some \PDF\ copy
some place and then move on and forget about it.

It must be noted that where \ETEX\ has some limited right to left support,
\OMEGA\ supports more. That has some more impact on all kinds of calculations in
the machinery because when one goes vertical the width is swapped with the
height|/|depth and therefore the progression is calculated differently.

Naturally, in order to deal with scripts other than Latin, \XETEX\ did add some
primitives. I must admit that I never looked into those, as \CONTEXT\ only added
support for wide fonts. Maybe these extensions were natural for \LATEX, but I
never saw a reason to adapt the \CONTEXT\ machinery to it, also because some
\PDFTEX\ features were lacking in \XETEX\ that \CONTEXT\ assumed to be present
(for the kind of usage it is meant for). But we can safely say that the impact of
\XETEX\ was that the \TEX\ community became aware that there were new font
technologies that were taking over the existing ones used till now. One thing
that is worth noticing is that \XETEX\ is still pretty much a traditional \TEX\
engine: it does for instance \OPENTYPE\ math in a traditional \TEX\ way. This is
understandable once one realizes that the \OPENTYPE\ math standard was kind of
fuzzy for quite a while. A consequence is that for instance the \OPENTYPE\ math
fonts produced by the \GUST\ foundation are a kind of hybrid. Later versions
adopted some more \PDFTEX\ features like expansion and protrusion.

I skip the Japanese \TEX\ engines because they serve a very specific audience and
provide features for scripts that don't hyphenate but use specific spacing and
line breaks by injecting glues and penalties. One should keep in mind that before
\UNICODE\ all kinds of encodings were used for these scripts and the 256
character limitation of traditional \TEX\ was not suited for that. Add to that
demands for vertical typesetting and it will be clear that a specialized engine
makes sense. It actually fits perfectly in the original idea that one could
extend \TEX\ for any purpose. It is a typical example of where one can argue that
users should switch to for instance \XETEX\ or \LUATEX, but these were not
available at the time, and there is no reason to ditch a good working system just
because some new (yet unproven) alternative shows up a while later.

We now arrive at \LUATEX. It started as an experiment in 2005 where a \LUA\
interpreter was added to \PDFTEX. One could pipe data into the \TEX\ machinery
and query some properties, like the values of registers. At some point the
project sped up because Idris Hamid got involved. He was one of the few \CONTEXT\
users who used \OMEGA\ (which it actually did support to some extent) but he was
not satisfied with the results. His oriental \TEX\ project helped pushing the
\LUATEX\ project forward. The idea was that by opening up the internals of \TEX\
we could do things with fonts and paragraph building that were not possible
before. The alternative, \XETEX, was not suitable for him as it was too bound to
what the libraries provide (rendering then depends on what library gets used and
what is possible at what time). But, dealing with scripts and fonts is just one
aspect of \LUATEX. For instance more primitives were added and the math machinery
got an additional \OPENTYPE\ code path. Memory constraints were lifted and all
became \UNICODE\ internally. Each stage in the typesetting process can be
intercepted, overloaded, extended.

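The register query mentioned above can be sketched as follows. This is a minimal
example that assumes a \LUATEX\ based engine; the choice of scratch register 255
and the printed text are arbitrary.

\starttyping
\count255=42

\directlua {
    local n = tex.count[255]                    % query a register
    tex.print("The counter holds " .. n .. ".") % pipe text back into TeX
}
\stoptyping

The comments use the percent sign because \TEX\ strips them before \LUA\ sees
the code; a \LUA\ comment would swallow the rest of the (single line) chunk.
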
Where the \ETEX\ and \OMEGA\ extensions were the result of many years of
discussion, \PDFTEX, \XETEX\ and \LUATEX\ originate in practical demands.
Very small development teams that made fast decisions made that possible.

Let's give some more examples of extensions in \LUATEX. Because \PDFTEX\ is the
starting point there is protrusion and expansion, but these mechanisms have been
promoted to core functionality. The same is true for embedding images and content
reuse: these are now core features. This makes it possible to implement them more
naturally and efficiently. All the backend related functionality (literal \PDF,
hyperlinks, etc.) is now collected in a few extension primitives and the code is
better isolated. This took a bit of effort but is in my opinion better. Support
for directions comes from \OMEGA\ and after consulting with its authors it was
decided that only four made sense. Here we also promoted the directionality to
core features instead of extensions. Because we wanted to serve \OMEGA\ users
too, extended \TFM\ fonts can be read (not that there are many of them), which
fits nicely into the whole machinery going 32~instead of 8~bits. Instead of the
\ETEX\ register model, where register numbers larger than 255 were implemented
differently, we adopted the \OMEGA\ model of just bumping 256 to 65536 (and of
course, 16K would have been sufficient too but the additional memory it uses can
be neglected compared to what other programs use and|/|or what resources users
carry on their machines).

The modus operandi for extending \TEX\ is to take the original literate \WEB\
sources and define change files. The \PDFTEX\ program already deviated from that
by using a monolithic source. But still \PASCAL\ is used for the body of core
code. It gets translated to \CCODE\ before being compiled. In the \LUATEX\
project Taco Hoekwater took that converted code and laid the foundation for what
became the original \LUATEX\ code base.

Some extensions relate to the fact that we have \LUA\ and have access to \TEX's
internal node lists for manipulations. An example is the concept of attributes.
By setting an attribute to a value, the current nodes (glyphs, kerns, glue,
penalties, boxes, etc.) get these as properties and one can query them at the
\LUA\ end. This basically permits variables to travel with nodes and act
accordingly. One can for instance implement color support this way. Instead of
injecting literal or special nodes that themselves can interfere we now can have
information that does not interfere at all (apart from maybe some performance
hit). I think that conceptually this is pretty nice.

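A minimal sketch of this idea, written for plain \LUATEX\ (\CONTEXT\ manages
attributes and callbacks through its own interfaces, so this is not how it is
done there). The attribute number 100 and the reporting in the log are arbitrary
choices for the sake of the example.

\starttyping
% Lua end: report the marked glyphs just before a paragraph gets broken
\directlua {
    callback.register("pre_linebreak_filter", function(head)
        for n in node.traverse_id(node.id("glyph"), head) do
            if node.has_attribute(n, 100) == 1 then
                texio.write_nl("marked glyph: " .. n.char)
            end
        end
        return true % the list itself is left untouched
    end)
}

% TeX end: nodes created inside the group carry attribute 100 with value 1
Some normal text, {\attribute100=1 some marked text,} and normal text again.
\stoptyping
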
At the \LUA\ end one has access to the \TEX\ internals but one can also use
specific token scanners to fetch information from the input streams. In principle
one can create new primitives this way. It is always a chicken|-|egg question
what works better but the possibility is there. There are many such conceptual
additions in \LUATEX, which for sure makes it the most \quote {aggressive}
extension of \TEX\ so far. One reason for these experiments and extensions is
that \LUA\ is such a nice and suitable language for this purpose.

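As a small sketch of such a pseudo primitive (the name \type {\MyDouble} is made
up, and the token library of \LUATEX\ is assumed): the macro below uses a token
scanner to pick up the integer that follows it in the input and feeds the
doubled value back to \TEX.

\starttyping
\def\MyDouble{\directlua{
    local n = token.scan_int()
    tex.print(tostring(2 * n))
}}

\MyDouble 21
\stoptyping
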
Of course a fundamental part of \LUATEX\ is the embedded \METAPOST\ library. For
sure the fact that \CONTEXT\ integrates \METAPOST\ has been the main reason for
that.

The \CONTEXT\ macro package is well adapted to \LUATEX\ and the fact that its
users are always willing to update made the development of \LUATEX\ possible.
However, we are now at a stage where other macro packages use it, so \LUATEX\ has
entered a state where nothing more gets added. The \LATEX\ macro package now
also supports \LUATEX, although it uses a variant that falls back on a library to
deal with fonts (like \XETEX\ does).

With \LUATEX\ being frozen (of course bugs will be fixed), further exploration
and development is now moved to \LUAMETATEX, again in the perspective of
\CONTEXT. I will not go into details apart from saying that it is a lightweight
version of \LUATEX. More is delegated to \LUA, which already happened in
\CONTEXT\ anyway, but also some extra primitives were added, mostly to enable
writing nicer looking code. However, a major aspect is that this program uses a
lean and mean code base, is supposed to compile out of the box, and that sources
will be an integral part of the \CONTEXT\ code base, so that users are always in
sync.

So, to summarize: we started with \type {tex} and moved on to \type {etex} and
\type {pdftex}. At some point \type {omega} and \type {xetex} filled the
\UNICODE\ and script gaps, but it now looks like \type {luatex} is becoming
popular. Although \type {luatex} is the reference implementation, \LATEX\
exclusively uses \type {luahbtex}, while \CONTEXT\ has a version that targets
\type {luametatex}. In parallel, the \type {[e][u][p]tex} engines fill the
specific needs of Japanese users. In most cases, good old \type {tex} and less
old \type {etex} are just shortcuts to \type {pdftex} which is compatible but has
the \PDF\ backend on board. That 8 bit engine is not only faster than the more
recent engines, but also suits a large audience quite well, simply because
for articles, theses, etc. (written in a Latin script, most often English) it
fits the bill well.

I deliberately didn't mention names and years as well as detailed pros and cons.
A user should have the freedom to choose what suits best. I'm not sure how well
\TEX\ would have evolved or will evolve in these days of polarized views on
operating systems, changing popularity of languages, and many (also open source)
projects being set up to eventually be monetized. We live in a time where so
called influencers play a role, where experience and age often matter less than
being fancy or able to target audiences. Where something called a standard today
is replaced quickly by a new one tomorrow. Where stability and long term usage of
a program is only a valid argument for a few. Where one can read claims that one
should use this or that because it is today's fashion instead of the older thing
that was actually the only way to achieve something at all a while ago. Where
a presence on facebook, twitter, instagram, whatsapp, stack exchange is also an
indication of being around at all. Where hits, likes, badges, bounties all play a
role in competing and self promotion. Where today's standards are tomorrow's
drawbacks. Where even in the \TEX\ community politics seem to creep in. Maybe you
can best not tell what is your favorite \TEX\ engine because what is hip today
makes you look out of place tomorrow.

\stopchapter

\stopcomponent