hybrid-codebase.tex /size: 59 Kb    last modification: 2023-12-21 09:43
1% language=us
2
3\startcomponent hybrid-lexing
4
5\environment hybrid-environment
6
7\startchapter[title={Updating the code base}]
8
9\startsection [title={Introduction}]
10
11After much experimenting with new code in \MKIV\ a new stage in \CONTEXT\
12development was entered in the last quarter of 2011. This was triggered by
13several more or less independent developments. I will discuss some of them here
14since they are a nice illustration of how \CONTEXT\ evolves. This chapter was
15published in TUGboat 103; thanks to Karl Berry and Barbara Beeton for making it
16better.
17
18\stopsection
19
20\startsection [title={Interfacing}]
21
22Wolfgang Schuster, Aditya Mahajan and I were experimenting with an abstraction
23layer for module writers. In fact this layer itself was a variant of some new
24mechanisms used in the \MKIV\ structure related code. That code was among the
25first to be adapted as it is accompanied by much \LUA\ code and has been
26performing rather well for some years now.
27
28In \CONTEXT\ most of the user interface is rather similar and module writers are
29supposed to follow the same route as the core of \CONTEXT. For those who have
30looked in the source the following code might look familiar:
31
32\starttyping
33\unexpanded\def\mysetupcommand
34  {\dosingleempty\domysetupcommand}
35
36\def\domysetupcommand[#1]%
37  {..........
38   \getparameters[\??my][#1]%
39   ..........
40   ..........}
41\stoptyping
42
43This implements the command \type {\mysetupcommand} that is used as
44follows:
45
46\starttyping
47\mysetupcommand[color=red,style=bold,...]
48\stoptyping
49
50The above definition uses three rather low|-|level interfacing commands. The
51\type {\unexpanded} makes sure that the command does not expand in unexpected
52ways in cases where expansion is less desirable. (Aside: The \CONTEXT\ \type
53{\unexpanded} prefix has a long history and originally resulted in the indirect
54definition of a macro. That way the macro could be part of testing (expanded)
55equivalence. When \ETEX\ functionality showed up we could use \type {\protected}
56but we stuck to the name \type {\unexpanded}. So, currently \CONTEXT's \type
57{\unexpanded} is equivalent to \ETEX's \type {\protected}. Furthermore, in
58\CONTEXT\ \type {\expanded} is not the same as the \ETEX\ primitive. In order to
59use the primitives you need to use their \type {\normal...} synonyms.) The \type
60{\dosingleempty} makes sure that one argument gets seen by injecting a dummy when
61needed. At some point the \type {\getparameters} command will store the values of
62keys in a namespace that is determined by \type {\??my}. The namespace used here
63is actually one of the internal namespaces which can be deduced from the double
64question marks. Module namespaces have four question marks.
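
The effect of the \type {\unexpanded} prefix can be seen with a small test. This
is only a sketch: the macro names \type {\mycommand} and \type {\mytest} are made
up for the occasion, and we assume \MKIV, where the prefix maps onto the \ETEX\
primitive.

\starttyping
\unexpanded\def\mycommand{something}

% an \edef normally expands the macros in its body, but a protected
% macro is left alone, so \mytest contains the token \mycommand
\edef\mytest{\mycommand}

\meaning\mytest
\stoptyping

Without the prefix, \type {\mytest} would contain the text \type {something}
instead.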
65
66There is some magic involved in storing the values. For instance, keys are
67translated from the interface language into the internal language which happens
68to be English. This translation is needed because a new command is generated:
69
70\starttyping
71\def\@@mycolor{red}
72\def\@@mystyle{bold}
73\stoptyping
74
75and such a command can be used internally because in so|-|called unprotected mode
76\type {@?!} are valid in names. The Dutch equivalent is:
77
78\starttyping
79\mijnsetupcommando[kleur=rood,letter=vet]
80\stoptyping
81
82and here the \type {kleur} has to be converted into \type {color} before the
83macro is constructed. Of course values themselves can stay as they are as long as
84checking them uses the internal symbolic names that have the language specific
85meaning.
86
87\starttyping
88\c!style{color}
89\k!style{kleur}
90\v!bold {vet}
91\stoptyping
92
93Internally assignments are done with the \type {\c!} variant, translation of the
94key is done using the \type {\k!} alternative and values are prefixed by \type
95{\v!}.
96
97It will be clear that for the English user interface no translation is needed and
98as a result that interface is somewhat faster. There we only need
99
100\starttyping
101\c!style{color}
102\v!bold {bold}
103\stoptyping
104
105Users never see these prefixed versions, unless they want to define an
106internationalized style, in which case the form
107
108\starttyping
109\mysetupcommand[\c!style=\v!bold]
110\stoptyping
111
112has to be used, as it will adapt itself to the user interface. This leaves the
113\type {\??my} that in fact expands to \type {\@@my}. This is the namespace prefix.
114
115Is this the whole story? Of course it isn't, as in \CONTEXT\ we often have a
116generic instance from which we can clone specific alternatives; in practice, the
117\type {\@@mycolor} variant is used in a few cases only. In that case a setup
118command can look like:
119
120\starttyping
121\mysetupcommand[myinstance][style=bold]
122\stoptyping
123
124And access to the parameters is done with:
125
126\starttyping
127\getvalue{\??my myinstance\c!color}
128\stoptyping
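
A setup command that accepts such an optional instance could be sketched as
follows. This is not code taken from the core: branching on \type
{\ifsecondargument} and appending the instance name to the namespace are
assumptions made for the sake of illustration.

\starttyping
\unexpanded\def\mysetupcommand
  {\dodoubleempty\domysetupcommand}

\def\domysetupcommand[#1][#2]%
  {\ifsecondargument
     % two arguments: #1 is the instance, #2 has the assignments
     \getparameters[\??my#1][#2]%
   \else
     % one argument: the assignments apply to the generic case
     \getparameters[\??my][#1]%
   \fi}
\stoptyping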
129
130So far the description holds for \MKII\ as well as \MKIV, but in \MKIV\ we are
131moving to a variant of this. At the cost of a bit more runtime and helper macros,
132we can get cleaner low|-|level code. The magic word here is \type
133{commandhandler}. At some point the new \MKIV\ code started using an extra
134abstraction layer, but the code needed looked rather repetitive despite subtle
135differences. Then Wolfgang suggested that we should wrap part of that
136functionality in a definition macro that could be used to define module setup and
137definition code in one go, thereby providing a level of abstraction that hides
138some nasty details. The main reason why code could look cleaner is that the
139experimental core code provided a nicer inheritance model for derived instances
140and Wolfgang's letter module uses that extensively. After doing some performance
141tests with the code we decided that indeed such an initializer made sense. Of
142course, after that we played with it, some more tricks were added, and eventually
143I decided to replace the similar code in the core as well, that is: use the
144installer instead of defining helpers locally.
145
146So, how does one install a new setup mechanism? We stick to the core code and
147leave modules aside for the moment.
148
149\starttyping
150\definesystemvariable{my}
151
152\installcommandhandler \??my {whatever} \??my
153\stoptyping
154
155After this command we have available some new helper commands of which only a few
156are mentioned here (after all, this mechanism is still somewhat experimental):
157
158\starttyping
159\setupwhatever[key=value]
160\setupwhatever[instance][key=value]
161\stoptyping
162
163Now a value is fetched using a helper:
164
165\starttyping
166\namedwhateverparameter{instance}{key}
167\stoptyping
168
169However, more interesting is this one:
170
171\starttyping
172\whateverparameter{key}
173\stoptyping
174
175For this to work, we need to set the instance:
176
177\starttyping
178\def\currentwhatever{instance}
179\stoptyping
180
181Such a current state macro already was used in many places, so it fits into the
182existing code quite well. In addition to \type {\setupwhatever} and friends,
183another command becomes available:
184
185\starttyping
186\definewhatever[instance]
187\definewhatever[instance][key=value]
188\stoptyping
189
190Again, this is not so much a revolution as we can define such a command easily
191with helpers, but it pairs nicely with the setup command. One of the goodies is
192that it provides the following feature for free:
193
194\starttyping
195\definewhatever[instance][otherinstance]
196\definewhatever[instance][otherinstance][key=value]
197\stoptyping
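
The inheritance that comes with the three argument variant can be illustrated
as follows. The instance names \type {parent} and \type {child} are arbitrary,
and the comments describe what one may expect, not the output of an actual run.

\starttyping
\definewhatever[parent][color=red]
\definewhatever[child][parent][style=bold]

\def\currentwhatever{child}

\whateverparameter{style} % set for child itself: bold
\whateverparameter{color} % not set for child, so taken from parent: red
\stoptyping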
198
199In some cases this creates more overhead than needed because not all commands
200have instances. On the other hand, some commands that didn't have instances yet,
201now suddenly have them. For cases where this is not needed, we provide simple
202variants of commandhandlers.
203
204Additional commands can be hooked into a setup or definition so that for instance
205the current situation can be updated or extra commands can be defined for this
206instance, such as \type {\start...} and \type {\stop...} commands.
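
As an illustration, the following sketch hooks into the definition so that each
new instance gets its own \type {\start...} and \type {\stop...} pair. We assume
here that the installer provides a token list named \type {\everydefinewhatever}
and that \type {\startwhatever} and \type {\stopwhatever} are defined elsewhere;
the latter two are hypothetical.

\starttyping
\appendtoks
    % \currentwhatever holds the name of the instance being defined
    \setuevalue{\e!start\currentwhatever}%
      {\noexpand\startwhatever[\currentwhatever]}%
    \setuevalue{\e!stop\currentwhatever}%
      {\noexpand\stopwhatever}%
\to \everydefinewhatever
\stoptyping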
207
208It should be stressed that the installer itself is not that special in the sense
209that we could do without it, but it saves some coding. More important is that we
210no longer have the \type {@@} prefixed containers but use \type
211{\whateverparameter} commands instead. This is definitely slower than the direct
212macro, but as we often deal with instances, it's not that much slower than \type
213{\getvalue} and critical components are rather well speed|-|optimized anyway.
214
215There is, however, a slowdown due to the way inheritance is implemented. That is
216how this started out: using a different (but mostly compatible) inheritance
217model. In the \MKII\ approach (which is okay in itself) inheritance happens by
218letting values point to the parent value. In the new model we have a more dynamic
219chain. It saves us macros but can expand quite wildly depending on the depth of
220inheritance. For instance, in sectioning there can easily be five or more levels
221of inheritance. So, there we get slower processing. The same is true for \type
222{\framed} which is a rather critical command, but there it is nicely compensated
223by less copying. My personal impression is that due to the way \CONTEXT\ is set
224up, the new mechanism is actually more efficient on an average job. Also, because
225many constructs also depend on the \type {\framed} command, that one can easily
226be part of the chain, which again speeds things up a bit. In any case, the new
227mechanisms use much less hash space.
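
The difference between the two models can be sketched with a few lines of macro
code. This is a simplified illustration and not the implementation used in the
core: the \type {demo:} prefix, the \type {parent} entry and the macro \type
{\demoparameter} are all made up.

\starttyping
\setvalue{demo:parent:color}{red}
\setvalue{demo:parent:style}{bold}

% static model: the value of the child points to that of the parent,
% so each key of each instance takes an entry in the hash
\setvalue{demo:child:color}{\getvalue{demo:parent:color}}

% dynamic model: only the name of the parent is stored and a missing
% key is resolved by walking the chain, one step per level
\setvalue{demo:child:parent}{parent}

\def\demoparameter#1#2% instance key
  {\ifcsname demo:#1:#2\endcsname
     \csname demo:#1:#2\endcsname
   \else\ifcsname demo:#1:parent\endcsname
     \expandafter\demoparameter\expandafter
       {\csname demo:#1:parent\endcsname}{#2}%
   \fi\fi}

\demoparameter{child}{style} % not set for child, found one level up
\stoptyping

The lookup is fully expandable, but each level of inheritance adds another
round of tests, which is where the extra runtime comes from.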
228
229Some mechanisms still look too complex, especially when they hook into others.
230Multiple inheritance is not trivial to deal with, not only because the meaning of
231keys can clash, but also because supporting it would demand quite complex fully
232expandable resolvers. So for the moment we stay away from it. In case you wonder
233why we cannot delegate more to \LUA: it's close to impossible to deal with \TEX's
234grouping in efficient ways at the \LUA\ end, and without grouping available \TEX\
235becomes less useful.
236
237Back to the namespace. We already had a special one for modules but after many
238years of \CONTEXT\ development, we started to run out of two character
239combinations and many of them had no relation to what name they spaced. As the
240code base is being overhauled anyway, it makes sense to also provide a new core
241namespace mechanism. Again, this is nothing revolutionary but it reads much more
242nicely.
243
244\starttyping
245\installcorenamespace {whatever}
246
247\installcommandhandler \??whatever {whatever} \??whatever
248\stoptyping
249
250This time deep down no \type {@@} is used, but rather something more obscure. In
251any case, no one will use the meaning of the namespace variables, as all access
252to parameters happens indirectly. And of course there is no speed penalty
253involved; in fact, we are more efficient. One reason is that we often used the
254prefix as follows:
255
256\starttyping
257\setvalue{\??my:option:bla}{foo}
258\stoptyping
259
260and now we just say:
261
262\starttyping
263\installcorenamespace {whateveroption}
264
265\setvalue{\??whateveroption bla}{foo}
266\stoptyping
267
268The commandhandler does such assignments slightly differently as it has to prevent
269clashes between instances and keywords. A nice example of such a clash is this:
270
271\starttyping
272\setvalue{\??whateveroption sectionnumber}{yes}
273\stoptyping
274
275In sectioning we have instances named \type {section}, but we also have keys
276named \type {number} and \type {sectionnumber}. So, we end up with
277something like this:
278
279\starttyping
280\setvalue{\??whateveroption section:sectionnumber}{yes}
281\setvalue{\??whateveroption section:number}{yes}
282\setvalue{\??whateveroption :number}{yes}
283\stoptyping
284
285When I decided to replace code similar to that generated by the installer, a new
286rewrite stage was entered. One reason for explaining this here is that adapting
287the core code introduces instabilities. As most users use the beta version of
288\MKIV, some tolerance and flexibility is needed, and it might help to know why
289something suddenly fails.
290
291In itself using the commandhandler is not that problematic, but wherever I decide
292to use it, I also clean up the related code and that is where the typos creep in.
293Fortunately Wolfgang keeps an eye on the changes so problems that users report on
294the mailing lists are nailed down relatively fast. Anyway, the rewrite itself is
295triggered by another event but that one is discussed in the next section.
296
297We don't backport (low|-|level) improvements and speedups to \MKII, because for
298what we need \TEX\ for, we consider \PDFTEX\ and \XETEX\ rather obsolete. Recent
299tests show that at the time of this writing a \LUATEX\ \MKIV\ run is often
300faster than a comparable \PDFTEX\ \MKII\ run (using \UTF-8 and complex font
301setups). When compared to a \XETEX\ \MKII\ run, a \LUATEX\ \MKIV\ run is often
302faster, but it's hard to compare, as we have advanced functionality in \MKIV\
303that is not (or differently) available in \MKII.
304
305\stopsection
306
307\startsection [title={Lexing}]
308
309The editor that I use, called \SCITE, has recently been extended with an extra
310external lexer module that makes more advanced syntax highlighting possible,
311using the \LUA\ \LPEG\ library. It is no secret that the user interface of
312\CONTEXT\ is also determined by the way structure, definitions and setups can be
313highlighted in an editor. \footnote {It all started with \type {wdt}, \type
314{texedit} and \type {texwork}, editors and environments written by myself in
315\MODULA2 and later in \PERL\ Tk, but that was in a previous century.} When I
316changed to \SCITE\ I made sure that we had proper highlighting there.
317
318At \PRAGMA\ one of the leading principles has always been: if the document source
319looks bad, mistakes are more easily made and the rendering will also be affected.
320Or phrased differently: if we cannot make the source look nice, the content is
321probably not structured that well either. The same is true for \TEX\ source,
322although to a large extent there one must deal with the specific properties of
323the language.
324
325So, syntax highlighting, or more impressively: lexing, has always been part of
326the development of \CONTEXT\ and for instance the pretty printers of verbatim
327provide similar features. For a long time we assumed line|-|based lexing, mostly
328for reasons of speed. And surprisingly, that works out quite well with \TEX. We
329used a simple color scheme suitable for everyday usage, with not too intrusive
330coloring. Of course we made sure that we had runtime spell checking integrated,
331and that the different user interfaces were served well.
332
333But then came the \LPEG\ lexer. Suddenly we could do much more advanced
334highlighting. Once I started playing with it, a new color scheme was set up and
335more sophisticated lexing was applied. Just to mention a few properties:
336
337\startitemize[packed]
338\startitem
339    We distinguish between several classes of macro names: primitives, helpers,
340    interfacing, and user macros.
341\stopitem
342\startitem
343    In addition we highlight constant values and special registers differently.
344\stopitem
345\startitem
346    Conditional constructs can be recognized and are treated as in any
347    regular language (keep in mind that users can define their own).
348\stopitem
349\startitem
350    Embedded \METAPOST\ code is lexed independently using a lexer that knows the
351    language's primitives, helpers, user macros, constants and of course specific
352    syntax and drawing operators. Related commands at the \TEX\ end (for defining
353    and processing graphics) are also dealt with.
354\stopitem
355\startitem
356    Embedded \LUA\ is lexed independently using a lexer that not only deals with the
357    language but also knows a bit about how it is used in \CONTEXT. Of course the
358    macros that trigger \LUA\ code are handled.
359\stopitem
360\startitem
361    Metastructure and metadata related macros are colored in a fashion similar to
362    constants (after all, in a document one will not see any constants, so there is
363    no color clash).
364\stopitem
365\startitem
366    Some special and often invisible characters get a special background color so
367    that we can see when there are for instance non|-|breakable spaces
368    sitting there.
369\stopitem
370\startitem
371    Real|-|time spell checking is part of the deal and can optionally be turned on.
372    There we distinguish between unknown words, known but potentially misspelled
373    words, and known words.
374\stopitem
375\stopitemize
376
377Of course we also made lexers for \METAPOST, \LUA, \XML, \PDF\ and text documents
378so that we have a consistent look and feel.
379
380When writing the new lexer code, and testing it on sources, I automatically
381started adapting the source to the new lexing where possible. Actually, as
382cleaning up code is somewhat boring, the new lexer is adding some fun to it. I'm
383not so sure if I would have started a similar overhaul so easily otherwise,
384especially because the rewrite now also includes speedup and cleanup. At least it
385helps to recognize less desirable left|-|overs of \MKII\ code.
386
387\stopsection
388
389\startsection [title={Hiding}]
390
391It is interesting to notice that users seldom define commands that clash with
392low|-|level commands. This is of course a side effect of the fact that one seldom needs
393to define a command, but nevertheless. Low|-|level commands were protected by
394prefixing them by one or more (combinations of) \type {do}, \type {re} and \type
395{no}'s. This habit is a direct effect of the early days of writing macros. For
396\TEX\ it does not matter how long a name is, as internally it becomes a pointer
397anyway, but memory consumption of editors, loading time of a format, string space
398and similar factors determined the way one codes in \TEX\ for quite a while.
399Nowadays there are hardly any limits and the stress that \CONTEXT\ puts on the
400\TEX\ engine is even less than in \MKII\ as we delegate many tasks to \LUA.
401Memory comes cheap, editors can deal with large amounts of data (keep in mind that
402the larger the file gets, the more lexing power can be needed), and screens are
403wide enough not to lose part of long names in the edges.
404
405Another development has been that in \LUATEX\ we have lots of registers so that
406we no longer have to share temporary variables and such. The rewrite is a good
407moment to get rid of that restriction.
408
409This all means that at some point it was decided to start using longer command
410names internally and permit \type {_} in names. As I was never a fan of using
411\type {@} for this, underscore made sense. We have been discussing the use of
412colons, which is also nice, but has the disadvantage that colons are also used in
413the source, for instance to create a sub|-|namespace. When we have replaced all
414old namespaces, colons might show up in command names, so another renaming
415roundup can happen.
416
417One reason for mentioning this is that users get to see these names as part of
418error messages. An example of a name is:
419
420\starttyping
421\page_layouts_this_or_that
422\stoptyping
423
424The first part of the name is the category of macros and in most cases is the
425same as the first part of the filename. The second part is a namespace. The rest
426of the name can differ but we're approaching some consistency in this.
427
428In addition we have prefixed names, where prefixes are used as consistently as
429possible:
430
431\starttabulate[|l|l|]
432\NC \type {t_} \NC token register \NC \NR
433\NC \type {d_} \NC dimension register \NC \NR
434\NC \type {s_} \NC skip register \NC \NR
435\NC \type {u_} \NC muskip register \NC \NR
436\NC \type {c_} \NC counter register, constant or conditional \NC \NR
437\NC \type {m_} \NC (temporary) macro \NC \NR
438\NC \type {p_} \NC (temporary) parameter expansion (value of key) \NC \NR
439\NC \type {f_} \NC fractions \NC \NR
440\stoptabulate
441
442This is not that different from other prefixing in \CONTEXT\ apart from the fact
443that from now on those variables (registers) are no longer accessible in a
444regular run. We might decide on another scheme but renaming can easily be
445scripted. In the process some of the old prefixes are being removed. The main
446reason for changing to this naming scheme is that it is more convenient to grep
447for them.
448
449In the process most traditional \type {\if}s get replaced by \quote
450{conditionals}.  The same is true for \type {\chardef}s that store states;
451these become \quote {constants}.
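
In practice this looks as follows. The names are invented for this example and,
as the underscore is only a letter in unprotected mode, such code has to be
wrapped accordingly.

\starttyping
\unprotect

\newconditional\c_whatever_enabled % instead of a \newif
\newconstant   \c_whatever_state   % instead of a \chardef
\newdimen      \d_whatever_width

\settrue    \c_whatever_enabled
\setconstant\c_whatever_state\plusone
\d_whatever_width 10pt

\ifconditional\c_whatever_enabled
    \ifcase\c_whatever_state
        % state 0
    \or
        % state 1
    \fi
\fi

\protect
\stoptyping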
452
453\stopsection
454
455\startsection[title=Status]
456
457We always try to keep the user interface constant, so most functionality and
458control stays stable. However, now that most users use \MKIV, commands that no
459longer make sense are removed. An interesting observation is that some users
460report that low|-|level macros or registers are no longer accessible. Fortunately
461that is no big deal as we point them to the official ways to deal with matters.
462It is also a good opportunity for users to clean up accumulated hackery.
463
464The systematic (file by file) cleanup started in the second half of 2011 and as
465of January 2012 one third of the core (\TEX) modules have to be cleaned up and
466the planning is to get most of that done as soon as possible. However, some
467modules will be rewritten (or replaced) and that takes more time. In any case we
468hope that rather soon most of the code is stable enough that we can start working
469on new mechanisms and features. Before that a cleanup of the \LUA\ code is
470planned.
471
472Although in many cases there are no fundamental changes in the user interface and
473functionality, I will wrap up some issues that are currently being dealt with.
474This is just a snapshot of what is happening currently and as a consequence it
475describes what users can run into due to newly introduced bugs.
476
477The core modules of \CONTEXT\ are loosely organized in groups. Over time there
478has been some reorganization and in \MKIV\ some code has been moved into new
479categories. The alphabetical order does not reflect the loading order or
480dependency tree as categories are loaded intermixed. Therefore the order below is
481somewhat arbitrary and does not express importance. Each category has multiple
482files.
483
484\startsubsubject[title={anch: anchoring and positioning}]
485
486More than a decade ago we started experimenting with position tracking. The
487ability to store positional information and use that in a second pass permits for
488instance adding backgrounds. As this code interacts nicely with (runtime)
489\METAPOST\ it has always been quite powerful and flexible on the one hand, but at
490the same time it was demanding in terms of runtime and resources. However, were
491it not for this feature, we would probably not be using \TEX\ at all, as
492backgrounds and special relative positioning are needed in nearly all our
493projects.
494
495In \MKIV\ this mechanism had already been ported to a hybrid form, but recently
496much of the code has been overhauled and its \MKII\ artifacts stripped. As a
497consequence the overhead in terms of memory probably has increased but the impact
498on runtime has been considerably reduced. It will probably take some time to
499become stable if only because the glue to \METAPOST\ has changed. There are some
500new goodies, like backgrounds behind parshapes, something that probably no one
501uses and is always somewhat tricky but it was not too hard to support. Also,
502local background support has been improved which means that it's easier to get
503them in more column|-|based layouts, several table mechanisms, floats and such.
504This was always possible but is now more automatic and hopefully more intuitive.
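
A typical application of position tracking at the user level is a background
behind a stretch of text. The following is a minimal sketch: the name \type
{demo} and the chosen settings are just an example, and because positions are
involved, more than one run is needed.

\starttyping
\definetextbackground
  [demo]
  [background=color,
   backgroundcolor=yellow,
   frame=off]

\starttextbackground[demo]
    This paragraph gets a colored background that follows the text.
\stoptextbackground
\stoptyping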
505
506\stopsubsubject
507
508\startsubsubject[title={attr: attributes}]
509
510We use attributes (properties of nodes) a lot. The framework for this had been
511laid early in \MKIV\ development, so not much has changed here. Of course the
512code gets cleaner and hopefully better as it is putting quite a load on the
513processing. Each new feature depending on attributes adds some extra overhead
514even if we make sure that mechanisms only kick in when they are used. This is due
515to the fact that attributes are linked lists and although unique lists are
516shared, they travel with each node. On the other hand, the cleanup (and
517de|-|\MKII|-|ing) of code leads to better performance so on the average no user
518will notice this.
519
520\stopsubsubject
521
522\startsubsubject[title={back: backend code generation}]
523
524This category wraps backend issues in an abstract way that is similar to the
525special drivers in \MKII. So far we have only three backends: \PDF, \XML, and
526\XHTML. Such code is always in a state of maintenance, if only because backends
527evolve.
528
529\stopsubsubject
530
531\startsubsubject[title={bibl: bibliographies}]
532
533For a while now, bibliographies have not been an add|-|on but part of the core.
534There are two variants: traditional \BIBTEX\ support derived from a module by
535Taco Hoekwater but using \MKIV\ features (the module hooks into core code), and a
536variant that delegates most work to \LUA\ by creating an in|-|memory \XML\ tree
537that gets manipulated. At some point I will extend the second variant. Going the
538\XML\ route also connects better with developments such as Jean|-|Michel
539Hufflen's Ml\BIBTEX.
540
541\stopsubsubject
542
543\startsubsubject[title={blob: typesetting in \LUA}]
544
545Currently we only ship a few helpers but eventually this will become a framework
546for typesetting raw text in \LUA. This might be handy for some projects that we
547have where the only input is \XML, but I'm not that sure if it will produce nice
548results and if the code will look better. On the other hand, there are some cases
549where in a regular \TEX\ run some basic typesetting in \LUA\ might make sense. Of
550course I also need an occasional pet project so this might qualify as one.
551
552\stopsubsubject
553
554\startsubsubject[title={buff: buffers and verbatim}]
555
556Traditionally buffers and verbatim have always been relatives as they share code.
557The code was among the first to be adapted to \LUATEX. There is not that much to
558gain in adapting it further. Maybe I will provide more lexers for
559pretty|-|printing some day.
560
561\stopsubsubject
562
563\startsubsubject[title={catc: catcodes}]
564
565Catcodes are a rather \TEX|-|specific feature and we have organized them in
566catcode regimes. The most important recent change has been that some of the
567characters with a special meaning in \TEX\ (like ampersand, underscore,
568superscript, etc.) are no longer special except in cases that matter. This
569somewhat incompatible change surprisingly didn't lead to many problems. Some code
570that is specific for the \MKII\ \XML\ processor has been removed as we no longer
571assume that it is present in \MKIV.
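
For the user this means that input like the following needs no escaping in text
mode, while math mode keeps its familiar behaviour. This is a minimal example
that assumes the default catcode regime of \MKIV.

\starttyping
\starttext
    We process R&D reports with names like annual_report and
    mention a^b in the running text, but $x_i^2$ is still math.
\stoptext
\stoptyping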
572
573\stopsubsubject
574
575\startsubsubject[title={char: characters}]
576
577This important category deals with characters and their properties. Already from
578the beginning of \MKIV\ character properties have been (re)organized in \LUA\
579tables and therefore much code deals with them. The code is rather stable but
580occasionally the tables are updated as they depend on developments in \UNICODE.
581In order to share as much data as possible and prevent duplicates there are
582several inheritance mechanisms in place but their overhead is negligible.
583
584\stopsubsubject
585
586\startsubsubject[title={chem: chemistry}]
587
588The external module that deals with typesetting chemistry was transformed
589into a \MKIV\ core module some time ago. Not much has changed in this department
590but some enhancements are pending.
591
592\stopsubsubject
593
594\startsubsubject[title={cldf: \CONTEXT\ \LUA\ documents}]
595
596These modules are mostly \LUA\ code; they form the interface into \CONTEXT\ and
597provide ways to code complete documents in \LUA. This is one of those
598categories that is visited every now and then to be adapted to improvements in
599other core code or in \LUATEX. This is one of my favourite categories as it
600exposes most of \CONTEXT\ at the \LUA\ end which permits writing solutions in
601\LUA\ while still using the full power of \CONTEXT. A dedicated manual is on its
602way.
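
A small example gives an idea of what this looks like. It is only a sketch of
the approach and assumes the \type {context} namespace that \MKIV\ provides at
the \LUA\ end; the content is of course arbitrary.

\starttyping
\startluacode
    -- each call results in the macro with the same name being applied
    context.startitemize { "packed" }
    for i=1,3 do
        context.startitem()
        context("item number %s",i)
        context.stopitem()
    end
    context.stopitemize()
\stopluacode
\stoptyping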
603
604\stopsubsubject
605
606\startsubsubject[title={colo: colors and transparencies}]
607
608This is rather old code, and apart from some cleanup not much has been changed
609here. Some macros that were seldom used have been removed. One issue that is
610still pending is a better interface to \METAPOST\ as it has different color
611models and we have adapted code at that end. This has a rather low priority
612because in practice it is no real problem.
613
614\stopsubsubject
615
616\startsubsubject[title={cont: runtime code}]
617
618These modules contain code that is loaded at runtime, such as filename remapping,
619patches, etc. It does not make much sense to improve these.
620
621\stopsubsubject
622
623\startsubsubject[title={core: all kinds of core code}]
624
625Housekeeping is the main target of these modules. There are still some
626typesetting|-|related components here but these will move to other categories.
627This code is cleaned up when there is a need for it. Think of managing files,
628document project structure, module loading, environments, multipass data, etc.
629
630\stopsubsubject
631
632\startsubsubject[title={data: file and data management}]
633
634This category hosts only \LUA\ code and hasn't been touched for a while. Here we
635deal with locating files, caching, accessing remote data, resources,
636environments, and the like.
637
638\stopsubsubject
639
640\startsubsubject[title={enco: encodings}]
641
642Because (font) encodings are gone, there is only one file in this category and
643that one deals with weird (composed or otherwise special) symbols. It also
644provides a few traditional \TEX\ macros that users expect to be present, for
645instance to put accents over characters.
646
647\stopsubsubject
648
649\startsubsubject[title={file: files}]
650
651There is some overlap between this category and core modules. Loading files is
652always somewhat special in \TEX\ as there is the \TEX\ directory structure to
653deal with. Sometimes you want to use files in the so|-|called tree, but other
654times you don't. This category provides some management code for (selective)
655loading of document files, modules and resources. Most of the code works with
656accompanying \LUA\ code and has not been touched for years, apart from some
657weeding and low|-|level renaming. The project structure code has mostly been
658moved to \LUA\ and this mechanism is now more restrictive in the sense that one
659cannot misuse products and components in unpredictable ways. This change permits
660better automatic loading of cross references in related documents.
661
662\stopsubsubject
663
664\startsubsubject[title={font: fonts}]
665
666Without proper font support a macro package is rather useless. Of course we do
667support the popular font formats but nowadays that's mostly delegated to \LUA\
668code. What remains at the \TEX\ end is code that loads and triggers a combination
669of fonts efficiently. Of course in the process text and math each need to get the
670proper amount of attention.
671
672There is no longer shared code between \MKII\ and \MKIV. Both already had rather
673different low|-|level solutions, but recently with \MKIV\ we went a step further.
674Of course it made sense to kick out commands that were only used for \PDFTEX\
675\TYPEONE\ and \XETEX\ \OPENTYPE\ support but more important was the decision to
676change the way design sizes are supported.
677
In \CONTEXT\ we have basic font definition and loading code and that hasn't
conceptually changed much over the years. In addition to that we have so|-|called
bodyfont environments and these have been made a bit more powerful in recent
\MKIV. Then there are typefaces, which are abstract combinations of fonts and
defining them happens in typescripts. This layered approach is rather flexible,
and was greatly needed when we had all those font encodings (to be used in all
kinds of combinations within one document). In \MKIV, however, we already had
fewer typescripts as font encodings are gone (also for \TYPEONE\ fonts). Still,
there remained a rather large blob of definition code dealing with Latin Modern;
large because it comes in design sizes.
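
To make the layering a bit more concrete, here is a minimal sketch of a typeface
that is defined on top of typescripts and then used as bodyfont. The name \type
{mainface} is made up for this example; the \type {modern} typescripts are the
Latin Modern ones.

\starttyping
% mainface is a hypothetical typeface name
\definetypeface [mainface] [rm] [serif] [modern] [default]
\definetypeface [mainface] [ss] [sans]  [modern] [default]
\definetypeface [mainface] [tt] [mono]  [modern] [default]
\definetypeface [mainface] [mm] [math]  [modern] [default]

\setupbodyfont [mainface,10pt]
\stoptyping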
688
689As we always fall back on Latin Modern, and because we don't preload fonts, there
690is some overhead involved in resolving design size related issues and
691definitions. But, it happens that this is the only font that ships with many
692files related to different design sizes. In practice no user will change the
defaults. So, although the regular font mechanism still provides flexible ways to
define font file combinations per bodyfont size, resolving to the best
matching size now happens automatically via a so|-|called \LUA\ font goodie file,
which brings down the number of definitions considerably. The consequence is that
697\CONTEXT\ starts up faster, not only in the case of Latin Modern being used, but
698also when other designs are in play. The main reason for this is that we don't
699have to parse those large typescripts anymore, as the presets were always part of
700the core set of typescripts. At the same time loading a specific predefined set
701has been automated and optimized. Of course on a run of 30 seconds this is not
702that noticeable, but it is on a 5 second run or when testing something in the
703editor that takes less than a second. It also makes a difference in automated
704workflows; for instance at \PRAGMA\ we run unattended typesetting flows that need
705to run as fast as possible. Also, in virtual machines using network shares, the
706fewer files consulted the better.
707
708Because math support was already based on \OPENTYPE, where \CONTEXT\ turns
709\TYPEONE\ fonts into \OPENTYPE\ at runtime, nothing fundamental has changed here,
710apart from some speedups (at the cost of some extra memory). Where the overhead
711of math font switching in \MKII\ is definitely a factor, in \MKIV\ it is close to
712negligible, even if we mix regular, bold, and bidirectional math, which we have
713done for a while.
714
715The low|-|level code has been simplified a bit further by making a better
716distinction between the larger sizes (\type {a} up to \type {d}) and smaller
717sizes (\type {x} and \type {xx}). These now operate independently of each other
718(i.e.\ one can now have a smaller relative \type {x} size of a larger one). This
719goes at the cost of more resources but it is worth the effort.
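
A small sketch of these size switches in use follows. It assumes that \type
{\tx} selects the \type {x} size relative to whatever size is current, which is
how I read the description above; the text itself is just filler.

\starttyping
\starttext
    normal
    {\tfa one step up}   {\tfb two steps up}
    {\tfx one step down} {\tfxx two steps down}
    % assumption: \tx is relative to the enclosing (larger) size
    {\tfb large {\tx and smaller relative to large}}
\stoptext
\stoptyping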
720
721By splitting up the large basic font module into smaller ones, I hope that it can
722be maintained more easily although someone familiar with the older code will only
723recognize bits and pieces. This is partly due to the fact that font code is
724highly optimized.
725
726\stopsubsubject
727
728\startsubsubject[title={grph: graphic (and widget) inclusion}]
729
730Graphics inclusion is always work in progress as new formats have to be dealt
731with or users want additional conversions to be done. This code will be cleaned
732up later this year. The plug|-|in mechanisms will be extended (examples of
733existing plug|-|ins are automatic converters and barcode generation).
734
735\stopsubsubject
736
737\startsubsubject[title={hand: special font handling}]
738
739As we treat protrusion and hz as features of a font, there is not much left in
740this category apart from some fine|-|tuning. So, not much has happened here and
741eventually the left|-|overs in this category might be merged with the font
742modules.
743
744\stopsubsubject
745
746\startsubsubject[title={java: \JAVASCRIPT\ in \PDF}]
747
This code was already cleaned up a while ago, when moving to \MKIV, but we
occasionally need to check and patch it due to issues with \JAVASCRIPT\ engines in
viewers.
751
752\stopsubsubject
753
754\startsubsubject[title={lang: languages and labels}]
755
Not much has changed in this department, apart from additional labels. The
way inheritance works in languages differs too much from other inheritance code
so we keep what we have here. Label definitions have been moved to \LUA\ tables;
from these the labels at the \TEX\ end are defined, and they can then be
overloaded locally. Of course the basic interface has not changed as this is
typically code that users will use in styles.
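
Overloading a label locally could look like the following sketch. The label
text and the reference name \type {fig:demo} are chosen for this example only.

\starttyping
\mainlanguage[en]

% replace the predefined English label for figures
\setuplabeltext[en][figure={Fig.~}]

\starttext
    \placefigure
      [here] [fig:demo] % hypothetical reference
      {A placeholder.}
      {\framed[width=4cm,height=2cm]{}}
\stoptext
\stoptyping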
762
763\stopsubsubject
764
765\startsubsubject[title={luat: housekeeping}]
766
767This is mostly \LUA\ code needed to get the basic components and libraries in
768place. While the \type {data} category implements the connection to the outside
769world, this category runs on top of that and feeds the \TEX\ machinery. For
770instance conversion of \MKVI\ files happens here. These files are seldom touched
771but might need an update some time (read: prune obsolete code).
772
773\stopsubsubject
774
775\startsubsubject[title={lpdf: \PDF\ backend}]
776
777Here we implement all kinds of \PDF\ backend features. Most are abstracted via
778the backend interface. So, for instance, colors are done with a high level
779command that goes via the backend interface to the \type {lpdf} code. In fact,
780there is more such code than in (for instance) the \MKII\ special drivers, but
781readability comes at a price. This category is always work in progress as
782insights evolve and users demand more.
783
784\stopsubsubject
785
786\startsubsubject[title={lxml: \XML\ and lpath}]
787
788As this category is used by some power users we cannot change too much here,
789apart from speedups and extensions. It's also the bit of code we use frequently
790at \PRAGMA, and as we often have to deal with rather crappy \XML\ I expect to
791move some more helpers into the code. The latest greatest trickery related to
792proper typesetting can be seen in the documents made by Thomas Schmitz. I wonder
793if I'd still have fun doing our projects if I hadn't, in an early stage of \MKIV,
794written the \XML\ parser and expression parser used for filtering.
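
For those who have never seen this interface, here is a minimal sketch of how
\XML\ gets mapped onto setups. The buffer content and the \type {xml:demo:*}
names are invented for this example; the pattern passed to \type {\xmlsetsetup}
is an lpath expression.

\starttyping
\startbuffer[demo]
<document>
    <title>Some title</title>
    <p>A first paragraph.</p>
    <p>A second paragraph.</p>
</document>
\stopbuffer

% couple elements to setups with the same name (hypothetical namespace)
\startxmlsetups xml:demo:base
    \xmlsetsetup{#1}{document|title|p}{xml:demo:*}
\stopxmlsetups

\xmlregistersetup{xml:demo:base}

\startxmlsetups xml:demo:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:title
    {\bf \xmlflush{#1}}\blank
\stopxmlsetups

\startxmlsetups xml:demo:p
    \xmlflush{#1}\par
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{demo}{}
\stoptext
\stoptyping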
795
796\stopsubsubject
797
798\startsubsubject[title={math: mathematics}]
799
800Math deserves its own category but compared to \MKII\ there is much less code,
801thanks to \UNICODE. Since we support \TYPEONE\ as virtual \OPENTYPE\ nothing
802special is needed there (and eventually there will be proper fonts anyway). When
803rewriting code I try to stay away from hacks, which is sometimes possible by
804using \LUA\ but it comes with a slight speed penalty. Much of the \UNICODE\
805math|-|related font code is already rather old but occasionally we add new
806features. For instance, because \OPENTYPE\ has no italic correction we provide an
807alternative (mostly automated) solution.
808
On the agenda is more structural math encoding (maybe like OpenMath) but tagging
is already part of the code so we get a reasonable export. Not that someone is
waiting for it, but it's there for those who want it. Most math|-|related
character properties are part of the character database which gets extended on
demand. Of course we keep \MATHML\ up|-|to|-|date because we need it in a few
projects.
815
816We're not in a hurry here but this is something where Aditya and I have to redo
817some of the code that provides \AMS|-|like math commands (but as we have them
818configurable some work is needed to keep compatibility). In the process it's
819interesting to run into probably never|-|used code, so we just remove those
820artifacts.
821
822\stopsubsubject
823
824\startsubsubject[title={meta: metapost interfacing}]
825
This and the next category deal with \METAPOST. This first category is quite old
but already adapted to the new situation. Sometimes we add extra functionality
but over the last few years the situation has become rather stable with the
exception of backgrounds, because these have been overhauled completely.
830
831\stopsubsubject
832
833\startsubsubject[title={mlib: metapost library}]
834
835Apart from some obscure macros that provide the interface between front- and
836backend this is mostly \LUA\ code that controls the embedded \METAPOST\ library.
837So, here we deal with extensions (color, shading, images, text, etc.) as well as
838runtime management because sometimes two runs are needed to get a graphic right.
839Some time ago, the \MKII|-|like extension interface was dropped in favor of one
840more natural to the library and \METAPOST~2. As this code is used on a daily
841basis it is quite well debugged and the performance is pretty good too.
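
A minimal sketch of what a user sees of all this is given below. It uses the
color and text extensions mentioned above; the shape, size and color are
arbitrary.

\starttyping
\starttext
    \startMPcode
        % a filled circle with typeset text on top of it
        fill fullcircle scaled 3cm withcolor .625red ;
        draw textext("\bf text") withcolor white ;
    \stopMPcode
\stoptext
\stoptyping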
842
843\stopsubsubject
844
845\startsubsubject[title={mult: multi|-|lingual user interface}]
846
Even if most users use the English user interface, we keep the other ones around
as they're part of the trademark. Commands, keys, constants, messages and the
like are now managed with \LUA\ tables. Also, some of the tricky remapping code
has been stripped because the setup definition files are dealt with. These are
\XML\ files that describe the user interface; they get typeset and shipped with
\CONTEXT.
853
854These files are being adapted. First of all the commandhandler code is defined
855here. As we use a new namespace model now, most of these namespaces are defined
856in the files where they are used. This is possible because they are more verbose
857so conflicts are less likely (also, some checking is done to prevent reuse).
858Originally the namespace prefixes were defined in this category but eventually
859all that code will be gone. This is a typical example where 15|-|year|-|old
860constraints are no longer an issue and better code can be used.
861
862\stopsubsubject
863
864\startsubsubject[title={node: nodes}]
865
866This is a somewhat strange category as all typeset material in \TEX\ becomes
867nodes so this deals with everything. One reason for this category is that new
868functionality often starts here and is sometimes shared between several
869mechanisms. So, for the moment we keep this category. Think of special kerning,
870insert management, low|-|level referencing (layer between user code and backend
871code) and all kinds of rule and displacement features. Some of this functionality
872is described in previously published documents.
873
874\stopsubsubject
875
876\startsubsubject[title={norm: normalize primitives}]
877
878We used to initialize the primitives here (because \LUATEX\ starts out blank).
879But after moving that code this category only has one definition left and that
880one will go too. In \MKII\ these files are still used (and actually generated by
881\MKIV).
882
883\stopsubsubject
884
885\startsubsubject[title={pack: wrapping content in packages}]
886
887This is quite an important category as in \CONTEXT\ lots of things get packed.
888The best example is \type {\framed} and this macro has been maximally optimized,
889which is not that trivial since much can be configured. The code has been adapted
890to work well with the new commandhandler code and in future versions it might use
891the commandhandler directly. This is however not that trivial because hooking a
892setup of a command into \type {\framed} can conflict with the two commands using
893keys for different matters.
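
To give an idea of how much can be configured, here is a sketch of a single
\type {\framed} call. The values are arbitrary and only a few of the available
keys are shown.

\starttyping
\framed
  [width=6cm,
   height=2cm,
   align=middle,
   framecolor=darkred,
   background=color,
   backgroundcolor=lightgray]
  {Some framed text}
\stoptyping

Each of these keys has to be resolved for every frame, which is why optimizing
this macro pays off.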
894
895Layers are also in this category and they probably will be further optimized.
896Reimplementing reusable objects is on the horizon, but for that we need a more
897abstract \LUA\ interface, so that will come first. This has a low priority
898because it all works well. This category also hosts some helpers for the page
899builder but the builder itself has a separate category.
900
901\stopsubsubject
902
903\startsubsubject[title={page: pages and output routines}]
904
905Here we have an old category: output routines (trying to make a page), page
906building, page imposition and shipout, single and multi column handling, very
907special page construction, line numbering, and of course setting up pages and
908layouts. All this code is being redone stepwise and stripped of old hacks. This
909is a cumbersome process as these are core components where side effects are
910sometimes hard to trace because mechanisms (and user demands) can interfere.
911Expect some changes for the good here.
912
913\stopsubsubject
914
915\startsubsubject[title={phys: physics}]
916
917As we have a category for chemistry it made sense to have one for physics and
918here is where the unit module's code ended up. So, from now on units are
919integrated into the core. We took the opportunity to rewrite most of it from
920scratch, providing a bit more control.
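
A sketch of the kind of input this supports follows. It assumes the variant
where units are spelled out as words; the quantities are arbitrary.

\starttyping
\starttext
    % assumption: units are given as a sequence of words
    The speed limit is \unit{50 kilo meter per hour} and the
    margin is about \unit{15 milli meter}.
\stoptext
\stoptyping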
921
922\stopsubsubject
923
924\startsubsubject[title={prop: properties}]
925
926The best|-|known property in \TEX\ is a font and color is a close second. Both
927have their own category of files. In \MKII\ additional properties like backend
928layers and special rendering of text were supported in this category but in
929\MKIV\ properties as a generic feature are gone and replaced by more specific
930implementations in the \type {attr} namespace. We do issue a warning when any of
931the old methods are used.
932
933\stopsubsubject
934
935\startsubsubject[title={regi: input encodings}]
936
We still support input encoding regimes but hardly any \TEX\ code is involved
now. Only when users demand more functionality does this code get extended. For
instance, recently a user wanted a conversion function for going from \UTF-8 to an
encoding that another program wanted to see.
941
942\stopsubsubject
943
944\startsubsubject[title={scrn: interactivity and widgets}]
945
946All modules in this category have been overhauled. On the one hand we lifted some
947constraints, for instance the delayed initialization of fields no longer makes
948sense as we have a more dynamic variable resolver now (which is somewhat slower
949but still acceptable). On the other hand some nice but hard to maintain features
950have been simplified (not that anyone will notice as they were rather special).
951The reason for this is that vaguely documented \PDF\ features tend to change over
952time which does not help portability. Of course there have also been some
953extensions, and it is actually less hassle (but still no fun) to deal with such
954messy backend related code in \LUA.
955
956\stopsubsubject
957
958\startsubsubject[title={scrp: script|-|specific tweaks}]
959
960These are script|-|specific \LUA\ files that help with getting better results for
961scripts like \CJK. Occasionally I look at them but how they evolve depends on
962usage. I have some very experimental files that are not in the distribution.
963
964\stopsubsubject
965
966\startsubsubject[title={sort: sorting}]
967
968As sorting is delegated to \LUA\ there is not much \TEX\ code here. The \LUA\
969code occasionally gets improved if only because users have demands. For instance,
970sorting Korean was an interesting exercise, as was dealing with multiple
971languages in one index. Because sorting can happen on a combination of \UNICODE,
972case, shape, components, etc.\ the sorting mechanism is one of the more complex
973subsystems.
974
975\stopsubsubject
976
977\startsubsubject[title={spac: spacing}]
978
979This important set of modules is responsible for vertical spacing, strut
980management, justification, grid snapping, and all else that relates to spacing
981and alignments. Already in an early stage vertical spacing was mostly delegated
982to \LUA\ so there we're only talking of cleaning up now. Although \unknown\ I'm
983still not satisfied with the vertical spacing solution because it is somewhat
984demanding and an awkward mix of \TEX\ and \LUA\ which is mostly due to the fact
985that we cannot evaluate \TEX\ code in \LUA.
986
987Horizontal spacing can be quite demanding when it comes down to configuration:
988think of a table with 1000 cells where each cell has to be set up (justification,
989tolerance, spacing, protrusion, etc.). Recently a more drastic optimization has
990been done which permits even more options but at the same time is much more
991efficient, although not in terms of memory.
992
993Other code, for instance spread|-|related status information, special spacing
994characters, interline spacing and linewise typesetting all falls into this
995category and there is probably room for improvement there. It's good to mention
996that in the process of the current cleanup hardly any \LUA\ code gets touched, so
997that's another effort.
998
999\stopsubsubject
1000
1001\startsubsubject[title={strc: structure}]
1002
1003Big things happened here but mostly at the \TEX\ end as the support code in \LUA\
1004was already in place. In this category we collect all code that gets or can get
1005numbered, moves around and provides visual structure. So, here we find itemize,
1006descriptions, notes, sectioning, marks, block moves, etc. This means that the
1007code here interacts with nearly all other mechanisms.
1008
Itemization now uses the new inheritance code instead of its own specific
mechanism but that is not a fundamental change. More important is that code has
been moved around, stripped, and slightly extended. For instance, we had
introduced proper \type {\startitem} and \type {\stopitem} commands which
somewhat conflict with \type {\item}, where a next instance ends a previous
one. The code is still not nice, partly due to the number of options. The code is
a bit more efficient now but functionally the same.
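
The two ways of coding items can be compared in this sketch; the item texts are
filler.

\starttyping
% explicit begin and end of each item
\startitemize[n,packed]
    \startitem The first item. \stopitem
    \startitem The second item. \stopitem
\stopitemize

% traditional: the next \item ends the previous one
\startitemize[n,packed]
    \item The first item.
    \item The second item.
\stopitemize
\stoptyping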
1016
1017The sectioning code is under reconstruction as is the code that builds lists. The
1018intention is to have a better pluggable model and so far it looks promising. As
1019similar models will be used elsewhere we need to converge to an acceptable
1020compromise. One thing is clear: users no longer need to deal with arguments but
1021variables and no longer with macros but with setups. Of course providing backward
1022compatibility is a bit of a pain here.
1023
1024The code that deals with descriptions, enumerations and notes was already done in
1025a \MKIV\ way, which means that they run on top of lists as storage and use the
1026generic numbering mechanism. However, they had their own inheritance support code
1027and moving to the generic code was a good reason to look at them again. So, now
1028we have a new hierarchy: constructs, descriptions, enumerations and notations
1029where notations are hooked into the (foot)note mechanisms.
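
As a sketch of what sits on top of this hierarchy, a description and an
enumeration can be defined as follows. The names \type {concept} and \type
{remark} are made up and the settings are only a plausible selection, not a
recommended setup.

\starttyping
% concept and remark are hypothetical instances
\definedescription [concept] [alternative=left,width=3cm,headstyle=bold]
\defineenumeration [remark]  [alternative=top,text=Remark,headstyle=bold]

\starttext
    \startconcept {Node}
        Typeset material in \TeX\ ends up as nodes.
    \stopconcept

    \startremark
        Enumerations get numbered automatically.
    \stopremark
\stoptext
\stoptyping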
1030
These mechanisms share the rendering code but operate independently (which was
the main challenge). I did explore the possibility of combining the code with
lists as there are some similarities but the usual rendering is too different, as
is the interface (think of enumerations with optional local titles, multiple
notes that get broken over pages, etc.). However, as they are also stored in
lists, users can treat them as such and reuse the information when needed (which
for instance is just an alternative way to deal with end notes).
1038
1039At some point math formula numbering (which runs on top of enumerations) might
1040get its own construct base. Math will be revised when we consider the time to be
1041ripe for it anyway.
1042
1043The reference mechanism is largely untouched as it was already doing well, but
1044better support has been added for automatic cross|-|document referencing. For
1045instance it is now easier to process components that make up a product and still
1046get the right numbering and cross referencing in such an instance.
1047
1048Float numbering, placement and delaying can all differ per output routine (single
1049column, multi|-|column, columnset, etc.). Some of the management has moved to
1050\LUA\ but most is just a job for \TEX. The better some support mechanisms become,
1051the less code we need here.
1052
1053Registers will get the same treatment as lists: even more user control than is
1054already possible. Being a simple module this is a relatively easy task, something
1055for a hot summer day. General numbering is already fine as are block moves so
1056they come last. The \XML\ export and \PDF\ tagging is also controlled from this
1057category.
1058
1059\stopsubsubject
1060
1061\startsubsubject[title={supp: support code}]
1062
1063Support modules are similar to system ones (discussed later) but on a slightly
1064more abstract level. There are not that many left now so these might as well
1065become system modules at some time. The most important one is the one dealing
1066with boxes. The biggest change there is that we use more private registers. I'm
1067still not sure what to do with the visual debugger code. The math|-|related code
1068might move to the math category.
1069
1070\stopsubsubject
1071
1072\startsubsubject[title={symb: symbols}]
1073
The symbol mechanism organizes special characters in groups. With
\UNICODE|-|related fonts becoming more complete we hardly need this mechanism.
However, it is still the abstraction used in converters (for instance footnote
symbols and interactive elements). The code has been cleaned up a bit but
generally stays as is.
1079
1080\stopsubsubject
1081
1082\startsubsubject[title={syst: tex system level code}]
1083
1084Here you find all kinds of low|-|level helpers. Most date from early times but
1085have been improved stepwise. We tend to remove obscure helpers (unless someone
1086complains loudly) and add new ones every now and then. Even if we would strip
1087down \CONTEXT\ to a minimum size, these modules would still be there. Of course
1088the bootstrap code is also in this category: think of allocators, predefined
1089constants and such.
1090
1091\stopsubsubject
1092
1093\startsubsubject[title={tabl: tables}]
1094
1095The oldest table mechanism was a quite seriously patched version of \TABLE\ and
1096finally the decision has been made to strip, replace and clean up that bit. So,
1097we have less code, but more features, such as colored columns and more.
1098
1099The (in|-|stream) tabulate code is mostly unchanged but has been optimized
1100(again) as it is often used. The multipass approach stayed but is somewhat more
1101efficient now.
1102
1103The natural table code was originally meant for \XML\ processing but is quite
1104popular among users. The functionality and code is frozen but benefits from
1105optimizations in other areas. The reason for the freeze is that it is pretty
1106complex multipass code and we don't want to break anything.
1107
1108As an experiment, a variant of natural tables was made. Natural tables have a
1109powerful inheritance model where rows and cells (first, last, \unknown) can be
1110set up as a group but that is rather costly in terms of runtime. The new table
1111variant treats each column, row and cell as an instance of \type {\framed} where
1112cells can be grouped arbitrarily. And, because that is somewhat extreme, these
1113tables are called x|-|tables. As much of the logic has been implemented in \LUA\
1114and as these tables use buffers (for storing the main body) one could imagine
1115that there is some penalty involved in going between \TEX\ and \LUA\ several
1116times, as we have a two, three or four pass mechanism. However, this mechanism is
1117surprisingly fast compared to natural tables. The reason for writing it was not
1118only speed, but also the fact that in a project we had tables of 50 pages with
1119lots of spans and such that simply didn't fit into \TEX's memory any more, took
1120ages to process, and could also confuse the float splitter.
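
A minimal x|-|table could look like the sketch below. The cell content is
filler; the point is that the table, rows and cells each accept \type
{\framed}|-|like settings.

\starttyping
\startxtable[align=middle,offset=1ex]
    \startxrow[foregroundstyle=bold]
        \startxcell first  \stopxcell
        \startxcell second \stopxcell
    \stopxrow
    \startxrow
        \startxcell one \stopxcell
        \startxcell two \stopxcell
    \stopxrow
    \startxrow
        % nx=2 makes this cell span two columns
        \startxcell[nx=2] spanning both columns \stopxcell
    \stopxrow
\stopxtable
\stoptyping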
1121
1122Line tables \unknown\ well, I will look into them when needed. They are nice in a
1123special way, as they can split vertically and horizontally, but they are seldom
1124used. (This table mechanism was written for a project where large quantities of
1125statistical data had to be presented.)
1126
1127\stopsubsubject
1128
1129\startsubsubject[title={task: lua tasks}]
1130
1131Currently this is mostly a place where we collect all kinds of tasks that are
1132delegated to \LUA, often hooked into callbacks. No user sees this code.
1133
1134\stopsubsubject
1135
1136\startsubsubject[title={toks: token lists}]
1137
1138This category has some helpers that are handy for tracing or manuals but no sane
1139user will ever use them, I expect. However, at some point I will clean up this
1140old \MKIV\ mess. This code might end up in a module outside the core.
1141
1142\stopsubsubject
1143
1144\startsubsubject[title={trac: tracing}]
1145
1146A lot of tracing is possible in the \LUA\ code, which can be controlled from the
1147\TEX\ end using generic enable and disable commands. At the macro level we do
1148have some tracing but this will be replaced by a similar mechanism. This means
1149that many \type {\tracewhatevertrue} directives will go away and be replaced.
1150This is of course introducing some incompatibility but normally users don't use
1151this in styles.
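
The generic interface looks like this sketch. The tracker name \type
{fonts.missing} is just one example; which trackers are available depends on the
version at hand.

\starttyping
% report characters that are missing in a font (example tracker)
\enabletrackers[fonts.missing]

\starttext
    Some text.
\stoptext

% trackers can be switched off again in the same way
% \disabletrackers[fonts.missing]
\stoptyping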
1152
1153\stopsubsubject
1154
1155\startsubsubject[title={type: typescripts}]
1156
We already mentioned that typescripts relate to fonts. Traditionally this is a
layer on top of font definitions and we keep it this way. In this category there
are also the definitions of typefaces: combinations of fonts. As we split the
larger files into smaller ones, there are many more files now. This has the added
benefit that we use less memory as typescripts are loaded only once and stored
permanently.
1163
1164\stopsubsubject
1165
1166\startsubsubject[title={typo: typesetting and typography}]
1167
1168This category is rather large in \MKIV\ as we move all code into here that
1169somehow deals with special typesetting. Here we find all kinds of interesting new
1170code that uses \LUA\ solutions (slower but more robust). Much has been discussed
1171in articles as they are nice examples and often these are rather stable.
1172
The most important new kid on the block is margin data, which has been moved into
this category. The new mechanism is somewhat more powerful but the code is also
quite complex and still experimental. The functionality is roughly the same as in
\MKII\ and older \MKIV, but there is now more advanced inheritance, a clear
separation between placement and rendering, slightly more robust stacking, and
local anchoring (new). It was a nice challenge but took a bit more time than other
reimplementations due to all kinds of possible interference. Also, it's not
always easy to simulate \TEX\ grouping in a script language. Even if much more
code is involved, it looks like the new implementation is somewhat faster. I
expect to clean up this code a couple of times.
1183
On the agenda is not only further cleanup of all modules in this category, but
also more advanced control over paragraph building. A parbuilder written in \LUA\
has been on my machine for years already; we use it for experiments and in the
process a more \LUATEX|-|ish (and efficient) way of dealing with protrusion has
been explored. But for this to become effective, some of the \LUATEX\ backend
code has to be reorganized and Hartmut wants to do that first. In fact, we can then
backport the new approach to the built|-|in builder, which is not only faster but
also more efficient in terms of memory usage.
1192
1193\stopsubsubject
1194
1195\startsubsubject[title={unic: \UNICODE\ vectors and helpers}]
1196
1197As \UNICODE\ support is now native all the \MKII\ code (mostly vectors and
1198converters) is gone. Only a few helpers remain and even these might go away.
1199Consider this category obsolete and replaced by the \type {char} category.
1200
1201\stopsubsubject
1202
1203\startsubsubject[title={util: utility functions}]
1204
1205These are \LUA\ files that are rather stable. Think of parsers, format
1206generation, debugging, dimension helpers, etc. Like the data category, this one
1207is loaded quite early.
1208
1209\stopsubsubject
1210
1211\startsubsubject[title={Other \TEX\ files}]
1212
1213Currently there are the above categories which can be recognized by filename and
1214prefix in macro names. But there are more files involved. For instance, user
1215extensions can go into these categories as well but they need names starting with
1216something like \type {xxxx-imp-} with \type {xxxx} being the category.
1217
1218Then there are modules that can be recognized by their prefix: \type {m-} (basic
1219module), \type {t-} (third party module), \type {x-} (\XML|-|specific module),
1220\type {u-} (user module), \type {p-} (private module). Some modules that Wolfgang
1221and Aditya are working on might end up in the core distribution. In a similar
1222fashion some seldom used core code might get moved to (auto|-|loaded) modules.
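
How the prefix shows up at the user end can be sketched as follows. The module
name is made up, and the two|-|argument form is given here as I remember it, so
treat it as an assumption.

\starttyping
% mymodule is a hypothetical module name
\usemodule[mymodule]    % search using the known prefixes
\usemodule[t][mymodule] % assumption: only look for t-mymodule
\stoptyping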
1223
1224There are currently many modules that provide tracing for mechanisms (like font
1225and math) and these need to be normalized into a consistent interface. Often such
1226modules show up when we work on an aspect of \CONTEXT\ or \LUATEX\ and at that
1227moment integration is not high on the agenda.
1228
1229\stopsubsubject
1230
1231\startsubsubject[title={\METAPOST\ files}]
1232
1233A rather fundamental change in \METAPOST\ is that it no longer has a format (mem
1234file). Maybe at some point it will read \type {.gz} files, but all code is loaded
1235at runtime.
1236
For this reason I decided to split the files for \MKII\ and \MKIV\ as having
version specific code in a common set no longer makes much sense. This means that
for a while already we have had \type {.mpii} and \type {.mpiv} files with the
latter category being more efficient because we delegate some backend|-|related
issues to \CONTEXT\ directly. I might split up the files for \MKIV\ a bit more so
that selective loading is easier. This gives a slight performance boost when
working over a network connection.
1244
1245\stopsubsubject
1246
1247\startsubsubject[title={\LUA\ files}]
1248
1249There are some generic helper modules, with names starting with \type {l-}. Then
1250there are the \type {mtx-*} scripts for all kinds of management tasks with the
1251most important one being \type {mtx-context} for managing a \TEX\ run.
1252
1253\stopsubsubject
1254
\startsubsubject[title={Generic files}]

1256This leaves the bunch of generic files that provides \OPENTYPE\ support to
1257packages other than \CONTEXT. Much time went into moving \CONTEXT|-|specific code
1258out of the way and providing a better abstract interface. This means that new
1259\CONTEXT\ code (we provide more font magic) will be less likely to interfere and
1260integration is easier. Of course there is a penalty for \CONTEXT\ but it is
1261bearable. And yes, providing generic code takes quite a lot of time so I
1262sometimes wonder why I did it in the first place, but currently the maintenance
1263burden is rather low. Khaled Hosny is responsible for bridging this code to
1264\LATEX.
1265
1266\stopsubsubject
1267
1268\stopsection
1269
1270\startsection[title={What next}]
1271
Here ends this summary of the current state of \CONTEXT. I expect to spend the
rest of the year on further cleaning up. I'm close to halfway now. What I really
like is that many users upgrade as soon as there is a new beta. As typos creep in
during a rewrite, this means that I often get a fast response.
1276
1277Of course it helps a lot that Wolfgang Schuster, Aditya Mahajan, and Luigi Scarso
1278know the code so well that patches show up on the list shortly after a problem
1279gets reported. Also, for instance Thomas Schmitz uses the latest betas in
1280academic book production, presentations, lecture notes and more, and so provides
1281invaluable fast feedback. And of course Mojca Miklavec keeps all of it (and us)
1282in sync. Such a drastic cleanup could not be done without their help. So let's
1283end this status report with \unknown\ a big thank you to all those (unnamed)
1284patient users and contributors.
1285
1286\stopsection
1287
1288\stopchapter
1289
1290\stopcomponent
1291