% language=us

\environment mk-environment

\startcomponent mk-halfway

\chapter{Halfway}

\subject{introduction}

We are about halfway into the \LUATEX\ project now. At the time of
writing this document we are only a few days away from version
0.40 (the Bacho\TEX\ and \TEX Live version) and around Euro\TEX\
2009 we will release version 0.50. Starting with version 0.30
(which we released around the time of the Korean \TEX\ User
Group meeting) all one-decimal releases are supported and usable
for (controlled) production work. We have always stated that all
interfaces may change until they are documented to be stable, and
we expect to document the first stable parts in version 0.50.
Currently we plan to release version 1.00 sometime in 2012, 30
years after \TEX82, with 0.60 and 0.70 in 2010, 0.80 and 0.90 in
2011. But of course it might turn out differently.

In this update we assume that the reader knows what \LUATEX\ is and
what it does.

\subject{design principles}

We started this project because we wanted an extensible engine.
We chose \LUA\ as the glue language. We do not regret this choice as it
permitted us to open up \TEX's internals reasonably well. There have been
a few extensions to \TEX\ itself, and there will be a few more, but none
of them are fundamental in the sense that they influence
typesetting. Extending \TEX\ in that area is up to the macro package
writer, who can use the \LUA\ language combined with \TEX\ macros. In a
similar fashion we made some decisions about \LUA\ libraries that are
included. What we have now is what you will get. Future versions of
\LUATEX\ will have the ability to load additional libraries but these
will not be part of the core distribution. There is simply too much
choice and we do not want to enter endless discussions about what is
best. More flexibility would also add a burden on maintenance that we
do not want. Portability has always been a virtue of \TEX\ and we want
to keep it that way.

\subject{lua scripting}

Before 0.40 there could be multiple instances of the \LUA\ interpreter
active at the same time, but we have now decided to limit the number of
instances to just one. The reason is simple: sharing all functionality
among multiple \LUA\ interpreter instances does more harm than good and
\LUA\ has enough possibilities to create namespaces anyway. The new
limit also simplifies the internal source code, which is a good
thing. While the \type {\directlua} command is now sort of frozen, we
might extend the functionality of \type {\latelua}, especially in
relation to what is possible in the backend. Both commands still
accept a number but this now refers to an index in a user||definable
name table that will be shown when an error occurs.
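
As a minimal sketch of the name table, assuming a plain \LUATEX\ run: the
slot number 1 and the name \type {halfway-demo} are arbitrary choices made
for this illustration, and the log message is made up as well.

\starttyping
% give slot 1 a readable name; it is reported when a chunk fails
\directlua { lua.name[1] = "halfway-demo" }

% run a chunk under that name
\directlua 1 { tex.print("Hello from slot 1") }

% \latelua delays execution until the page is shipped out
\latelua { texio.write_nl("log", "this page is being shipped out") }
\stoptyping

Comments inside \type {\directlua} use the percent sign here because
\TEX\ strips them before \LUA\ sees the code.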

\subject {input and output}

The current \LUATEX\ release permits multiple instances of \KPSE\
which can be handy if you mix, for instance, a macro package and
\MPLIB, as both have their own \quote{progname} (and engine) namespace.
However, right from the start it has been possible to bring most input
under \LUA\ control and one can overload the usual \KPSE\
mechanisms. This is what we do in \CONTEXT\ (and probably only there).
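
A separate \KPSE\ instance could be set up roughly as follows. This is
only a sketch for a plain \LUATEX\ run: the program name \type {mpost},
the variable name \type {finder} and the file that is looked up are
assumptions chosen for the example.

\starttyping
\directlua {
    % a private instance, next to the one the engine uses itself
    local finder = kpse.new("luatex", "mpost")
    local found  = finder:find_file("plain.mp")
    texio.write_nl("log", "plain.mp resolves to: " .. tostring(found))
}
\stoptyping

In \CONTEXT\ such a lookup is normally not needed because file
resolving is done in \LUA.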

Logging, etc., is also under \LUA\ control. There is no support for
writing to \TEX's opened output channels except for the log and the
terminal. We are investigating limited write control to numbered
channels but this has a very low priority.

Reading from zip files and sockets has been available
for a while now.

Among the first things that have been implemented is a mechanism for
managing category codes (\type{\catcode}) although this is not really
needed for practical usage as we aim at full compatibility. It just
makes printing back to \TEX\ from \LUA\ a bit more comfortable.
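
The following sketch shows what this comfort amounts to. The table
number 121 is an arbitrary choice for this example (a macro package
may allocate such numbers itself) and we assume a plain \LUATEX\ run.

\starttyping
% table 121 gets the initex defaults, so _ & and ^ are ordinary characters
\initcatcodetable 121

\directlua {
    % the string is fed back to TeX under regime 121,
    % independent of the catcodes that are active right now
    tex.sprint(121, "x_1 & y^2")
}
\stoptyping

The first argument selects the catcode table that is used when
\TEX\ reads the printed string.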

\subject {interface to tex}

Registers can always be accessed from \LUA\ by number and (when
defined at the \TEX\ end) also by name. When writing to a register
grouping is honored. Most internal registers can be accessed
(mostly read-only). Box registers can be manipulated but users
need to be aware of potential memory management issues.

There will be provisions to use the primitives related to setting
codes (lowercase codes and such). Some of this functionality will be
available in version 0.50.
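
A small sketch of both access methods follows. The register name
\type {\democount} is made up for this example and the values are
arbitrary.

\starttyping
\newcount\democount
\democount = 7

\directlua {
    tex.count[0] = 42                                 % access by number
    tex.count.democount = tex.count.democount + 1     % access by name
    tex.print(tex.count[0] .. " and " .. tex.count.democount)
}
\stoptyping

Both assignments obey grouping, so they are undone when the current
\TEX\ group ends.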

\subject {fonts}

The internal font model has been extended to the full \UNICODE\
range. There are readers for \OPENTYPE, \TYPEONE, and traditional
\TEX\ fonts. Users can create virtual fonts on the fly and have
complete control over what goes into \TEX. Font specific features
can either be mapped onto the traditional ligature and kerning
mechanisms or be implemented in \LUA.

We use code from \FONTFORGE\ that has been stripped to get a
smaller code base. Using the \FONTFORGE\ code has the advantage
that we get a similar view on the fonts in \LUATEX\ as in this
editor which makes debugging easier and developing fonts more
convenient.

The interface is already rather stable but some of the keys in loaded
tables might change. Almost all of the font interface will be stable
in version 0.50.
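
To give an idea of the control that is meant here, the next sketch
takes over font definition in a plain \LUATEX\ run and only deals
with traditional \TEX\ fonts. The log message and the variable names
are ours. In \CONTEXT\ the callbacks are managed by the macro package,
so this is not how one would do it there.

\starttyping
\directlua {
    callback.register("define_font", function(name, size, id)
        % load the metrics; the returned table can be changed at will
        local data  = font.read_tfm(name, size)
        local count = 0
        for _ in pairs(data.characters) do
            count = count + 1
        end
        texio.write_nl("log", name .. ": " .. count .. " characters")
        return data
    end)
}

\font\demofont = cmr10 at 12pt
\stoptyping

Whatever table the function returns is what \TEX\ will use as the
font, which is also the hook for building virtual fonts.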

\subject {tokens}

It is possible to intercept tokenization. Once intercepted, a token
table can be manipulated before being piped back into \LUATEX. We
still support \OMEGA's translation processes but that might become
obsolete at some point.

Future versions of \LUATEX\ might use \LUA's so-called \quote {user data}
concept but the interface will mostly be the same. Therefore this
subsystem will not be frozen yet in version 0.50.

\subject {nodes}

Users have access to the node lists in various stages. This interface
has already been quite stable for some time but some cleanup might
still take place. Currently the node memory maintenance is still
explicit, but eventually we will make releasing unused nodes automatic.

We have plans for keeping more extensive information within
a paragraph (initial whatsit) so that one can build alternative
paragraph builders in \LUA. There will be a vertical packer (in
addition to the horizontal packer) and we will open up the page
builder (inserts etc.). The basic interface will be stable in version
0.50.
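
As an illustration of node list access, the sketch below inspects
each paragraph just before it is broken into lines. It assumes a
plain \LUATEX\ run (\CONTEXT\ manages callbacks itself) and it only
looks at the top level of the list, so glyphs inside nested boxes are
not counted.

\starttyping
\directlua {
    local glyph_id = node.id("glyph")

    callback.register("pre_linebreak_filter", function(head, groupcode)
        local count = 0
        for n in node.traverse_id(glyph_id, head) do
            count = count + 1
        end
        texio.write_nl("log", "paragraph with " .. count .. " glyph nodes")
        return true   % true means: keep the list as it is
    end)
}
\stoptyping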

\subject {attributes}

This new kid on the block is now available for most subsystems but
we might change some of its default behaviour. As of 0.40 you can
also use negative values for attributes. The original idea of
using negative values for special purposes has been abandoned as
we are considering a secondary, limited (faster and more efficient)
variant. The basic principles will be stable around version 0.50,
but we reserve the freedom to change some aspects of attributes
until we reach version 1.00.
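
The following sketch shows how an attribute set at the \TEX\ end can
be picked up in \LUA. The attribute number 100 is an arbitrary choice
for this example (a macro package may reserve numbers for itself) and
a plain \LUATEX\ run is assumed.

\starttyping
\directlua {
    local glyph_id = node.id("glyph")

    callback.register("pre_linebreak_filter", function(head)
        local marked = 0
        for n in node.traverse_id(glyph_id, head) do
            % has_attribute returns the value, or nil when unset
            if node.has_attribute(n, 100) then
                marked = marked + 1
            end
        end
        texio.write_nl("log", marked .. " marked glyphs")
        return true
    end)
}

{\attribute 100 = 1 marked} unmarked \par
\stoptyping

The attribute value travels with the nodes that were created while
it was set, so it is still there after the group has ended.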

\subject {hyphenation}

In \LUATEX\ we have clearly separated hyphenation, ligature
building and kerning. Pattern management and hyphenation have been
reimplemented from scratch but use the same principles as
traditional \TEX. Patterns can be loaded at run time and exceptions
are quite efficient now. There are a few extensions, like embedded
discretionaries in exceptions and pre- as well as posthyphens.

On the agenda is fixing some \quote{hyphenchar} related issues and future
releases might deal with compound words as well. There are some
known limitations that we hope to have solved in version 0.50.
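
Loading patterns and exceptions at run time could look as follows.
The patterns and the exception words are toy values invented for this
sketch, not a usable language definition.

\starttyping
\directlua {
    local l = lang.new()                      % a fresh language object
    lang.patterns(l, "1ba 1na")               % toy patterns
    lang.hyphenation(l, "lua-tex man-u-script")   % exceptions
    lang.prehyphenchar(l, 45)                 % a hyphen at the end of the line
    lang.posthyphenchar(l, 0)                 % nothing at the start of the next
    texio.write_nl("log", "new language has id " .. lang.id(l))
}
\stoptyping

The reported number is the value that \type {\language} has to be
set to in order to activate this language.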

\subject {images}

Image handling is part of the backend. This part of the \PDFTEX\
code has been rewritten and can now be controlled from \LUA. There
are already a few more options than in \PDFTEX\ (simple
transformations). The image code will also be integrated in the
virtual font handler.
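
A minimal sketch of image inclusion from \LUA\ follows. The file name
\type {demo.pdf} is hypothetical and stands for any image that can be
found in the current directory.

\starttyping
\directlua {
    % scanning reads the file and fills in the natural dimensions
    local image = img.scan { filename = "demo.pdf" }
    texio.write_nl("log",
        "size in sp: " .. image.width .. " x " .. image.height)
    % writing puts the image in the current list
    img.write(image)
}
\stoptyping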

\subject {paragraph building}

The paragraph builder has been rewritten in \CCODE\ (soon to be
converted back to \CWEB). There is a callback related to the builder
so it is possible to replace the default line breaker with one written
in \LUA.

There are no further short|-|term revisions on the agenda, apart from
writing an advanced (third order) Arabic routine for the Oriental
\TEX\ project.

Future releases may provide a bit more control over \type{\parshape}s
and multiple paragraph shapes.

\subject {metapost}

The closely related \MPLIB\ project has resulted in a \METAPOST\
library that is included in \LUATEX. There can be multiple
instances active at the same time and \METAPOST\ processing is
very fast. Conversion to \PDF\ is to be done with \LUA.

On the to-do list is a bit more interoperability (pre- and
postscript tables) and this will make it into release 0.50
(maybe even in version 0.40 already).
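
Running a \METAPOST\ instance could be sketched as follows. The
options passed to \type {mplib.new} have changed between library
versions, so take the \type {ini_version} key and the explicit
loading of the plain macros as assumptions that may need adapting.
The function name \type {finder} is ours.

\starttyping
\directlua {
    local function finder(name, mode, ftype)
        if mode == "w" then
            return name
        end
        return kpse.find_file(name, ftype)
    end

    local mp = mplib.new { find_file = finder, ini_version = true }
    mp:execute("input plain;")
    local result = mp:execute(
        "beginfig(1); draw fullcircle scaled 1cm; endfig;")

    if result and result.fig then
        for _, figure in ipairs(result.fig) do
            local count = 0
            for _ in ipairs(figure:objects()) do
                count = count + 1
            end
            texio.write_nl("log", "figure with " .. count .. " objects")
        end
    end
    mp:finish()
}
\stoptyping

The objects that come back are plain \LUA\ data, which is why turning
them into \PDF\ code is left to the macro package.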

\subject {mathematics}

Version 0.50 will have a stable version of \UNICODE\
math support. Math is backward compatible but provides solutions
for dealing with \OPENTYPE\ math fonts. We provide math lists in
their intermediate form (noads) so that it is possible to
manipulate math in great detail.

The relevant math parameters are reorganized according to what
\OPENTYPE\ math provides (we use the Cambria font as our reference). Parameters
are grouped by style. Future versions of \LUATEX\ will build upon
this base to provide a simple mechanism for switching style sets
and font families in-formula.

There are new primitives for placing accents (top and bottom
variants and extensible characters), creating radicals, and making
delimiters. Math characters are permitted in text mode.

There will be an additional alignment mechanism analogous to
what \MATHML\ provides. Expect more.
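
Access to the noad list can be sketched as follows, assuming a plain
\LUATEX\ run. The function only counts the simple noads at the top
level of a formula and then hands the list to the built-in converter,
so the typeset result is not changed.

\starttyping
\directlua {
    local noad_id = node.id("noad")

    callback.register("mlist_to_hlist",
        function(head, display_type, need_penalties)
            local count = 0
            for n in node.traverse_id(noad_id, head) do
                count = count + 1
            end
            texio.write_nl("log",
                display_type .. " formula with " .. count .. " simple noads")
            % let the engine do the actual conversion
            return node.mlist_to_hlist(head, display_type, need_penalties)
        end)
}

$a + b = c$
\stoptyping

Any manipulation of the formula would take place between the
counting loop and the call to the converter.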

\subject {page building}

Not much work has been done on opening up the page builder
although we do have access to the intermediate lists. Opening it up
is unlikely to happen before 0.50.

\subject {going cweb}

After releasing version 0.50 around Euro\TEX\ 2009 there will be a
period of relative silence. Apart from bug fixes and (private)
experiments there will be no release for a while. At the time of the
0.50 release the \LUATEX\ source code will probably be entirely in
plain C. After that is done, we will concentrate hard on
consolidating and upgrading the code base back into \CWEB.

\subject {cleanup}

Cleanup of code is a continuous process. Cleanup is needed because
we deal with a merge of traditional \TEX, \ETEX\ extensions,
\PDFTEX\ functionality and some \OMEGA\ (\ALEPH) code.

Compatibility is a prerequisite, with the exception of logging and
rather special ligature reconstruction code.

We also use the opportunity to slowly move away from all the global
variables that are used in the \PASCAL\ version.

\subject {alignments}

We do have some ideas about opening up alignments, but it has a
low priority and it will not happen before the 0.50 release.

\subject {error handling}

Once all code is converted to \CWEB, we will look into error
handling and recovery. It does not have a high priority as it is
easier to deal with after the conversion to \CWEB.

\subject {backend}

The backend code will be rewritten stepwise. The image related
code has already been redone, and currently everything related to
positioning and directions is being redesigned and made more consistent.
Some bugs in the \ALEPH\ code (inherited from \OMEGA) have been
removed and we are trying to come up with a consistent way of dealing
with directions. Conceptually this is somewhat messy because much
directionality is delegated to the backend.

We are experimenting with positioning (preroll) and better literal
injection. Currently we still use the somewhat fuzzy \PDFTEX\ methods
that evolved over time (direct, page and normal injection) but we
will come up with a clearer model.

Accuracy of the output (\PDF) will be improved and character
extension (hz) will be done more efficiently. Experimental code
seems to work okay. This will become available from release 0.40
and onwards and further cleanup will take place when the \CWEB\
code is there, as much of the \PDF\ backend code is already \CCODE.
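
For readers who have not met them, the three injection methods
inherited from \PDFTEX\ look as follows. This is a sketch for a plain
\LUATEX\ or \PDFTEX\ run of this generation, and the \PDF\ operators
are just examples.

\starttyping
% normal: the origin is moved to the current position first
\pdfliteral        {q 0.5 w 0 0 m 20 0 l S Q}

% page: coordinates are relative to the lower left corner of the page
\pdfliteral page   {q 0.5 w 72 72 m 144 72 l S Q}

% direct: the code is inserted as it is, without repositioning
\pdfliteral direct {1 0 0 rg} red text \pdfliteral direct {0 g}
\stoptyping

The differences between these modes are subtle, which is one reason
why a clearer model is on the agenda.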

\subject{context mkiv}

When we started with \LUATEX\ we decided to use a branch of
\CONTEXT\ for testing as it involves quite drastic changes, many
rewrites, a tight connection with binary versions, etc.

As a result we have had two versions of \CONTEXT\ for some time now: \MKII\
and \MKIV, where the former targets \PDFTEX\ and \XETEX, and
the latter exclusively uses \LUATEX. Although the user interface
is downward compatible the code base starts to diverge more and
more. Therefore at the last \CONTEXT\ meeting it was decided to
freeze the current version of \MKII\ and only apply bug fixes
and an occasional simple extension.

This policy change opened the road to rather drastic splitting of the
code, also because full compatibility between \MKII\ and \MKIV\ is not
required. Around \LUATEX\ version 0.40 the new, currently still
experimental, document structure related code will be merged into the
regular \MKIV\ version. This might have some impact as it opens up new
possibilities.

\subject {the future}

In the future, \MKIV\ will try to create (more) clearly separated
layers of functionality so that it will become possible to make
subsets of \CONTEXT\ for special purposes. This is done under the name
\METATEX. Think of layering like:

\startitemize[packed]
\item \IO, catcodes, callback management, helpers
\item input regimes, characters, filtering
\item nodes, attributes and noads
\item user interface
\item languages, scripts, fonts and math
\item spacing, par building and page construction
\item \XML, graphics, \METAPOST, job management, and structure (huge impact)
\item modules, styles, specific features
\item tools
\stopitemize

\subject{fonts}

At this moment \MKIV\ is already quite capable of dealing with
\OPENTYPE\ fonts. The driving force behind this is the Oriental
\TEX\ project which brings along some very complex and feature
rich Arabic font technology. Much time has gone into reverse
engineering the specification and the way these fonts
behave in Uniscribe (which we use as our reference for Arabic).

Dealing with the huge \CJK\ fonts is less a font issue and more
a matter of node list processing. Around the annual meeting of
the Korean User Group we got much of the machinery working, thanks
to discussions on the spot and on the mailing list.

\subject {math}

Between \LUATEX\ versions 0.30 and 0.40 the math machinery was opened
up (stage one). In order to test this new functionality, \MKIV's math
subsystem (that was then already partially \UNICODE\ aware) had to be
adapted.

First of all \UNICODE\ permits us to use only one math family and so
\MKIV\ now does that. The implementation uses Microsoft's Cambria
Math font as a benchmark. It creates virtual fonts from the other (old
and new) math fonts so they appear to match up to Cambria
Math. Because the \TEX\ Gyre math project is not yet up to speed \MKIV\
currently uses virtual variants of these fonts that are created at
run time. The missing pieces in for instance Latin Modern and friends
are compensated for by means of virtual characters.

Because it is now possible to parse the intermediate noad lists \MKIV\ can
do some manipulations before the formula is typeset. This is for
instance used for alphabet remapping, forcing sizes, and spacing
around punctuation.

Although \MKIV\ already supports most of the math that users expect
there is still room for improvement once there is even more control
over the machinery. This is possible because \MKIV\ is not bound to
downward compatibility.

As with all other \LUATEX\ related \MKIV\ code, it is expected that we
will have to rewrite most of the current code a few times as we
proceed, so \MKIV\ math support is not yet stable either. We can take
such drastic measures because \MKIV\ is still experimental and because
users are willing to do frequent synchronous updating of macros and
engine. In the process we hope to get away from all ad||hoc boxing,
kerning and similar solutions for creating constructs, by using
the new accent, delimiter, and radical primitives.

\subject {tracing and testing}

Whenever possible we add tracing and visualization features to
\CONTEXT\ because the progress reports and articles need them. Recent
extensions concerned tracing math and tracing \OPENTYPE\ processing.

The \OPENTYPE\ tracing options are a great help in stepwise
reaching the goals of the Oriental \TEX\ project. This project
gave the \LUATEX\ project its initial boost and aims at high
quality right|-|to|-|left typesetting. In the process complex (test)
fonts are made which, combined with the tracing mentioned, help us
to reveal the secrets of \OPENTYPE.

\stopcomponent