% language=us

\usemodule[art-01,abr-02] \setupbodyfont[11pt]

\starttext

\startchapter[title=\LUATEX\ going stable]

\startsection[title=Introduction]

We're closing in on version 1.0 of \LUATEX\ and at the time of this writing (mid
April 2016) we're at version 0.95. Over the last decade we have reported regularly
on progress in user group journals, \CONTEXT\ related documents and the \LUATEX\
manual, so it makes no sense to repeat ourselves.

So where do we stand now? I will not go into details about what is available in
\LUATEX; for that you can consult the manual. Instead I will stick to the larger
picture.

\stopsection

\startsection[title=What is it]

First of all, as the name suggests, \LUATEX\ has the \LUA\ scripting engine on
board. Currently we're still at version 5.2 and the main reason for not going to
5.3 is that it has a different implementation of numbers and we cannot foresee
the side effects. We will test this when we move on to \LUATEX\ version 2.0.

The second part of the name indicates that we have some kind of \TEX\ and we
think we managed to remain largely compatible with the traditional engine. We
took most of \ETEX, much of \PDFTEX\ and some from \ALEPH\ (\OMEGA). On top of
that we added a few new primitives and extended others.

If you look at the building blocks of \TEX, you can roughly recognize these:

\startitemize
\startitem
    an input parser (tokenizer) that includes macro expansion; its workings are
    well described, of course in the \TEX\ book, and more than three decades of
    availability have made \TEX's behaviour rather well documented
\stopitem
\startitem
    a list builder that links basic elements like characters (tagged with font
    information), rules, boxes, glue and kerns together in a doubly linked
    list of so called nodes (and noads in intermediate math lists)
\stopitem
\startitem
    a language subsystem that is responsible for hyphenating words using so called
    patterns and exceptions
\stopitem
\startitem
    a font subsystem that provides information about glyph properties, and that
    also makes it possible to construct math symbols from snippets; it also makes
    sure that the backend knows what to embed
\stopitem
\startitem
    a paragraph builder that breaks a long list into lines and a page builder
    that splits off chunks that can be wrapped into pages; this is all done within
    given constraints using a model of rewards and penalties
\stopitem
\startitem
    a first class math renderer that set the standard and has inspired modern
    math font technology
\stopitem
\startitem
    mechanisms for dealing with floating data, marking page related info, wrapping
    stuff in boxes, adding glue, penalties and special information
\stopitem
\startitem
    a backend that is responsible for wrapping everything typeset in a format that
    can be printed and viewed
\stopitem
\stopitemize

So far we're still talking of a rather generic variant of \TEX\ with \LUA\ as
extension language. Next we zoom in on some details.

\stopsection

\startsection[title=Where it differs]

Given experiences with discussing extensions to the engine and given the fact
that there is never really an agreement about what makes sense or not, the
decision was made to not extend the engine any more than really needed but to
provide hooks to do that in \LUA. Time has proven that this is a feasible
approach. On the one hand we remain as faithful as possible to the original,
and on the other hand we can deal with today's and near future demands.

Tokenization still happens as before but we can also write input parsers
ourselves. You can intercept the raw input when it gets read from file, but you
can also create scanners that you can sort of plug into the parser. Both are a
compromise between convenience and speed but powerful enough. At the input end we
can now group catcode changes (catcodes are properties of characters that control
how they are interpreted) into tables so that switching between regimes is fast.
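
Both ideas can be sketched as follows. This is only an illustration and assumes
plain \LUATEX\ with all primitives enabled; in \CONTEXT\ the callbacks are managed
by the macro package. The placeholder \type{<<name>>} and the table number 11 are
arbitrary choices made for this example.

\starttyping
% plain LuaTeX sketch, assumed to be run as: luatex example.tex
\directlua {
  % TeX comments are used here because line ends are lost inside \directlua
  callback.register("process_input_buffer", function(line)
    % return the adjusted line; the parentheses drop the second result of gsub
    return (string.gsub(line, "<<name>>", "LuaTeX"))
  end)
}

Hello <<name>>!

% store a regime in which _ and ^ are ordinary characters
\begingroup
  \catcode`\_=12 \catcode`\^=12
  \savecatcodetable 11
\endgroup

% switching to that regime is a single assignment
{\tt \catcodetable 11 a_b^c}

\bye
\stoptyping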

You can influence in great detail how data gets read from files because the \IO\
subsystem is opened up. In fact, you have the full power of \LUA\ available when
doing so. At the same time you can print back from \LUA\ into the input stream.
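
A minimal sketch of reading a file in \LUA\ and printing the result back, again
assuming plain \LUATEX. It counts the lines of its own source, which assumes that
the file sits in the current directory under the job name.

\starttyping
\directlua {
  % plain Lua file access; this assumes <jobname>.tex is in the current directory
  local count = 0
  for line in io.lines(tex.jobname .. ".tex") do
    count = count + 1
  end
  % the string is scanned by TeX as if it came from the input file
  tex.print("This source file has " .. count .. " lines.")
}

\bye
\stoptyping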

The input that makes it into \TEX, whether or not intercepted and manipulated
beforehand, has to be in \UTF8. What comes out to the terminal and log is also
\UTF8, and internally all code paths work with wide characters. Some memory
constraints have been lifted, and character related commands accept large
numbers. This comes at a price: the \LUATEX\ engine can be several times slower
than the 8|-|bit \PDFTEX. Of course in practice performance is mostly determined
by the efficiency of the macro package, so it might actually be faster in
situations that would stress its ancestors.

Node lists travel through \TEX\ and can be intercepted at many points. That way
you can add additional manipulations. You can for instance rely on \TEX\ for
hyphenation, ligature building and kerning but you can also plug in alternatives.
For this purpose these stages are clearly separated and less integrated (deep
down) than in traditional \TEX. There are helpers for accessing node lists and
individual nodes, and you can box those lists too (this is called packing). You
can adapt, create and destroy node lists at will, as long as you make sure you
feed back into \TEX\ something that makes sense.
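
As an illustration, the following sketch (plain \LUATEX\ assumed, not how a macro
package would organize it) intercepts the list that is about to be broken into
lines and only inspects it. The callback name \type{pre_linebreak_filter} and the
helpers come from the \LUATEX\ manual; the message text is made up.

\starttyping
\directlua {
  local glyph_id = node.id("glyph")
  callback.register("pre_linebreak_filter", function(head, groupcode)
    local count = 0
    % visits the glyph nodes at the top level of the list only
    for n in node.traverse_id(glyph_id, head) do
      count = count + 1
    end
    texio.write_nl("glyphs in this list: " .. count)
    % true tells TeX that the list is unchanged
    return true
  end)
}

Some text that will be broken into lines.\par

\bye
\stoptyping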

In order to control (or communicate with) nodes from the \TEX\ end, an attribute
mechanism was added that makes it possible to bind properties to nodes when they
get added to lists. At the \TEX\ end you can set an attribute that then gets
assigned to the currently injected nodes, while at the \LUA\ end you can query
the node for these attributes and their values.
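
A small sketch of that round trip, assuming plain \LUATEX. The attribute number
200 and the name \type{\markattr} are hypothetical and only serve the example.

\starttyping
\attributedef\markattr=200 % 200 is an arbitrary attribute number

\directlua {
  local glyph_id = node.id("glyph")
  callback.register("pre_linebreak_filter", function(head)
    local marked = 0
    for n in node.traverse_id(glyph_id, head) do
      % has_attribute returns the value, or nil when the attribute is unset
      if node.has_attribute(n, 200) == 1 then
        marked = marked + 1
      end
    end
    texio.write_nl("marked glyphs: " .. marked)
    return true
  end)
}

Plain text, {\markattr=1 marked text} and plain text again.\par

\bye
\stoptyping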

The language subsystem is re|-|implemented and behaves mostly the same as in the
original \TEX\ program. It has a few extensions and permits runtime loading of
patterns. In addition to language support we also have basic script support, that
is: directional information is now part of the stream and, unlike \ALEPH, which
wraps this into extension whatsits, \LUATEX\ has directional nodes as core
nodes.
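
Extending a language at runtime might look like the sketch below, which assumes
plain \LUATEX\ and uses the \type{lang} library as described in the manual. The
exception itself is just an example word.

\starttyping
\directlua {
  % an object that links to the currently active language
  local l = lang.new(tex.language)
  % add an exception at runtime; the hyphens mark the allowed break points
  lang.hyphenation(l, "type-set-ting")
}

\hsize=2cm
typesetting typesetting typesetting\par

\bye
\stoptyping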

The font subsystem is opened up in such a way that you can pass your own fonts to
the core. You can even construct virtual fonts. This open approach makes it
possible to support \OPENTYPE\ fonts and whatever format will show up in the
future. Of course the backend needs to embed the right data in the result file
but by then the hard work is already done. This approach fits into the always
present wish of users (and package writers) to be able to implement whatever
crazy thought one comes up with.

The paragraph builder is a somewhat cleaned up variant of the \PDFTEX\ one,
combined with directional and boundary support from \ALEPH. The protrusion and
expansion mechanisms have been redone in such a way that the front- and backend
code is better separated and somewhat more efficient now. As one can intercept
the paragraph builder, additional functionality can be injected before, after or
at some stages in the process.

Of course we have kept the math engine but, because we now need to support
\OPENTYPE\ math, alternative code paths have been added to deal with the kind of
information that such fonts provide. We also took the opportunity to open up the
math machinery a bit so that one can control rendering of some more complex
elements and set the spacing between elements. Because \TEX\ users are quite
traditional we had to stop somewhere, simply because legacy code has to be dealt
with.

Most of the auxiliary mechanisms mentioned can be accessed via the node lists, for
instance you can locate inserts and marks in them. The backend related whatsit
nodes can be recognized as well. At any time one can query and set \TEX\
registers and intercept boxed material. Of course some knowledge of the inner
workings of \TEX\ helps here.
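
Querying and setting registers from \LUA\ can be sketched like this, assuming
plain \LUATEX. The scratch registers \type{\count255} and \type{\box0} are used
only because they need no allocation.

\starttyping
\count255=41
\setbox0\hbox{boxed material}

\directlua {
  % registers can be read and written from Lua
  tex.count[255] = tex.count[255] + 1
  texio.write_nl("count255 = " .. tex.count[255])
  % a box register gives access to the node that holds the boxed list
  local b = tex.box[0]
  texio.write_nl("box0 width in scaled points = " .. b.width)
}

The counter now holds \the\count255.

\bye
\stoptyping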

The backend code is as much as possible separated from the frontend code (but
there is still some work to do there). As in \PDFTEX\ you can of course inject
arbitrary \PDF\ code and make feature rich documents. This flexibility keeps
\TEX\ current.

\stopsection

\startsection[title=Extras]

Is that all? No, apart from some minor extensions that might help to make
programming \TEX\ somewhat easier, there are a few more fundamental additions.

Images and reusable content (boxes) are now part of the core instead of them
being wrapped into backend specific whatsits, although of course the backend has
to provide support for it. This is more natural in the frontend (and user
interface) and also more consistent in the engine itself. All backend
functionality is now collected in three primitives that take arguments. This
permits a cleaner separation between front- and backend.
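
The three primitives in question are \type{\pdfextension}, \type{\pdfvariable}
and \type{\pdffeedback}. The sketch below shows one use of each; it assumes plain
\LUATEX\ and the chosen settings are arbitrary.

\starttyping
\outputmode=1 % produce PDF

% a backend setting
\pdfvariable compresslevel=0

% literal PDF code: switch the fill colour to gray and back to black
\leavevmode
\pdfextension literal {0.5 g}%
Gray text
\pdfextension literal {0 g}%
and black text.\par

% information fed back from the backend
\message{\pdffeedback creationdate}

\bye
\stoptyping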

Then there is the \METAPOST\ library, a feature already present for many years
now. It provides \TEX\ with some graphic capabilities that, given the origin,
fit nicely into the whole. The \LUATEX\ and \MPLIB\ projects started at about the
same time and right from the start it was our plan to combine both.

One of the extras is of course \LUA. It not only permits us to interface to the
internals of \TEX, but it also provides the user with a way to manipulate data.
Even if you never use \LUA\ to access internals, it might still be found useful
for occasionally doing things that are hard to accomplish using the macro
language.

In addition to stock \LUA\ we include the \LPEG\ library, an image reading
library (related to the backend) including read access to \PDF\ files via the
poppler library, parsing of \PDF\ content streams, zip compression, access
to the file system, the ability to run commands and socket support. Some of this
might become external libraries at some point, as we want to keep the expected
core functionality lean and mean. A nice extra is that we provide \LUAJITTEX, a
compatible variant that has a faster \LUA\ virtual machine on board.
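
As a taste of the bundled \LPEG\ library, here is a small sketch that parses a
comma separated list and prints the result back to \TEX. It assumes plain
\LUATEX; the grammar and the input string are invented for the example.

\starttyping
\directlua {
  % a small grammar for a comma separated list of numbers
  local number = lpeg.C(lpeg.R("09")^1)
  local list   = lpeg.Ct(number * (lpeg.P(",") * number)^0)
  % Ct collects the captured numbers in a table
  local fields = lpeg.match(list, "10,20,30")
  tex.print("Parsed fields: " .. table.concat(fields, " and "))
}

\bye
\stoptyping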

\stopsection

\startsection[title=Follow up]

The interfaces that we have now have, to a large extent, evolved into what we had
in mind. We started with simple experiments: just \LUA\ plus a bit of access to
registers. Then the Oriental \TEX\ project (with Idris Samawi Hamid) made it
possible to speed up development, and the conversion to \CCODE\ and the opening
up took off. After that we gradually moved forward.

That doesn't mean that we're done yet. The \LUATEX\ 1.0 engine will not change
much. We might add a few things, and for sure we will keep working on the code
base. The move from \PASCAL\ to \CCODE\ \WEB\ (an impressive job by itself), as
well as merging functionality of engines (kind of a challenge when you want to
remain compatible), opening up via \LUA\ (whose possibilities surprised even us),
and experimenting (\CONTEXT\ users paid the price for that) took quite some time,
also because we played with proofs of concept. It helped that we used the engine
exclusively for real typesetting related work ourselves.

We will continue to clean up and document the source and stepwise improve the
manual. If you followed the development of \CONTEXT, you will have noticed that
\MKIV\ relies heavily on the \LUA\ interface so stability is important
(although we can relatively easily adapt to future developments as we did in the
past). However, the fact that other packages support \LUATEX\ means that we also
need to keep the 1.0 engine stable. Our challenge is to provide stability on the
one hand, but not limit ourselves too much on the other. We'll keep you posted on
what comes next.

\blank

Hans, Hartmut, Luigi, Taco

\stopsection

\stopchapter

\stoptext