% language=us

\startcomponent hybrid-languages

\environment hybrid-environment

\startchapter[title={The language mix}]

During the third \CONTEXT\ conference that ran in parallel to Euro\TEX\ 2009 in
The Hague we had several sessions where \MKIV\ was discussed and a few upcoming
features were demonstrated. The next sections summarize some of that. It's hard
to predict the future, especially because new possibilities show up once \LUATEX\
is opened up more, so remarks about the future are not definitive.

\startsection[title={\TEX}]

From now on, if I refer to \TEX\ from the perspective of \LUATEX, I mean \quotation
{Good Old \TEX}, the language as well as the functionality. Although \LUATEX\
provides a couple of extensions, it remains largely compatible with its
ancestor, certainly from the perspective of the end user.

As most \CONTEXT\ users code their documents in the \TEX\ language, this will
remain the focus of \MKIV. After all, there is no real reason to abandon it.
However, although \CONTEXT\ already encourages users to use structure where
possible and not to use low level \TEX\ commands in the document source, we will
add a few more structural variants. For instance, we already introduced \type
{\startchapter} and \type {\startitem} in addition to \type {\chapter} and \type
{\item}.
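
As a minimal sketch (the title and item texts are made up for illustration), the
same content can be coded with the traditional commands or with the
start|/|stop variants:

\starttyping
% traditional commands: the end of a chapter or item is implicit
\chapter{Some title}

\startitemize
  \item first item
  \item second item
\stopitemize

% structural variants: begin and end are both explicit
\startchapter[title={Some title}]

  \startitemize
    \startitem first item \stopitem
    \startitem second item \stopitem
  \stopitemize

\stopchapter
\stoptyping

The explicit end tags leave no doubt about where an element stops, which can
help when the source is generated or processed by other tools.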

We go even further by using key|/|value pairs for defining section titles,
bookmarks, running headers, references and list entries at the start
of a chapter. And, as we carry around much more information in the (for \TEX\ so
typical) auxiliary data files, we provide extensive control over rendering the
numbers of these elements when they are recalled (like in tables of contents).
So, if you really want to use different texts for all references to a chapter
header, it can be done:

\starttyping
\startchapter
  [label=emcsquare,
   title={About $e=mc^2$},
   bookmark={einstein},
   list={About $e=mc^2$ (Einstein)},
   reference={$e=mc^2$}]

  ... content ...

\stopchapter
\stoptyping

Under the hood, the \MKIV\ code base is becoming quite a mix and once we have a
clearer picture of where we're heading, it might become even more of a hybrid.
For some time already most of the font handling has been done by \LUA, and a bit
more logic and management might move to \LUA\ as well. However, as we want to be
downward compatible we cannot go as far as we want (yet). This might change as
soon as more of the primitives have associated \LUA\ functions. Even then it will
be a trade|-|off: calling \LUA\ takes some time and it might not pay off at all.

Some of the more tricky components, like vertical spacing, grid snapping,
balancing columns, etc.\ are already in the process of being \LUA fied and their
hybrid form might turn into complete \LUA\ driven solutions eventually. Again,
the compatibility issue forces us to follow a stepwise approach, at the cost
of (quite some) extra development time. But whatever happens, the \TEX\ input
language as well as machinery will be there.

\stopsection

\startsection[title={\METAPOST}]

I have never regretted integrating \METAPOST\ support in \CONTEXT\ and a dream
came true when \MPLIB\ became part of \LUATEX. Apart from a few minor changes in
the way text integrates into \METAPOST\ graphics the user interface in \MKIV\ is
the same as in \MKII. Insofar as \LUA\ is involved, this is hidden from the user.
We use \LUA\ for managing runs and conversion of the result to \PDF. Currently
generating \METAPOST\ code by \LUA\ is limited to assisting in the typesetting of
chemical structure formulas, a feature that is now part of the core.

When defining graphics we use the \METAPOST\ language and not some \TEX|-|like
variant of it. Information can be passed to \METAPOST\ using special macros (like
\type {\MPcolor}), but most relevant status information is passed automatically
anyway.

You should not be surprised if at some point we can request information from
\TEX\ directly, because after all this information is accessible. Think of
something like \type {w := texdimen(0) ;} being expanded at the \METAPOST\ end
instead of \type {w := \the\dimen0 ;} being passed to \METAPOST\ from the \TEX\
end.
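
A small sketch of the current way of passing information from the \TEX\ end. The
color \type {mycolor} and the dimension \type {\MyWidth} are made up for this
example and only defined here for the sake of illustration:

\starttyping
\definecolor[mycolor][r=.6,g=.1,b=.1]

\newdimen\MyWidth \MyWidth=2cm

% \the\MyWidth and \MPcolor are expanded at the TeX end, so MetaPost only
% gets to see a plain dimension and a plain color specification
\startMPcode
  numeric w ; w := \the\MyWidth ;
  fill fullcircle scaled w withcolor \MPcolor{mycolor} ;
\stopMPcode
\stoptyping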

\stopsection

\startsection[title={\LUA}]

What will the user see of \LUA ? First of all, he or she can use this scripting
language to generate content. But when making a format or when looking at the
statistics printed at the end of a run, it will be clear that \LUA\ is used all
over the place.

So how about \LUA\ as a replacement for the \TEX\ input language? Actually, it is
already possible to make such \quotation {\CONTEXT\ \LUA\ Documents} using
\MKIV's built|-|in functions. Each \CONTEXT\ command is also available as a \LUA\
function.

\startbuffer
\startluacode
  context.bTABLE {
    framecolor = "blue",
    align = "middle",
    style = "type",
    offset = ".5ex",
  }
    for i=1,10 do
      context.bTR()
      for j=1,20 do
        local r = math.random(99)
        if r < 50 then
          context.bTD {
            background = "color",
            backgroundcolor = "blue"
          }
          context(context.white("%#2i",r))
        else
          context.bTD()
          context("%#2i",r)
        end
        context.eTD()
      end
      context.eTR()
    end
  context.eTABLE()
\stopluacode
\stopbuffer

\typebuffer
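
For comparison, a hand|-|coded sketch of two cells of such a table at the \TEX\
end (the numbers are arbitrary). It is only meant to show how a \LUA\ table
argument corresponds to a bracketed key|/|value list and a function call to a
command:

\starttyping
\bTABLE[framecolor=blue,align=middle,style=type,offset=.5ex]
  \bTR
    \bTD[background=color,backgroundcolor=blue] \white{12} \eTD
    \bTD 87 \eTD
  \eTR
\eTABLE
\stoptyping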

Of course it helps if you know \CONTEXT\ a bit. For instance, we could just as
well write:

\starttyping
if r < 50 then
  context.bTD {
    background = "color",
    backgroundcolor = "blue",
    foregroundcolor = "white",
  }
else
  context.bTD()
end
context("%#2i",r)
context.eTD()
\stoptyping

And, knowing \LUA\ helps as well, since the following is more efficient:

\startbuffer
\startluacode
  local colored = {
    background = "color",
    backgroundcolor = "blue",
    foregroundcolor = "white",
  }
  local basespec = {
    framecolor = "blue",
    align = "middle",
    style = "type",
    offset = ".5ex",
  }
  local bTR, eTR = context.bTR, context.eTR
  local bTD, eTD = context.bTD, context.eTD
  context.bTABLE(basespec)
    for i=1,10 do
      bTR()
      for j=1,20 do
        local r = math.random(99)
        bTD((r < 50 and colored) or nil)
        context("%#2i",r)
        eTD()
      end
      eTR()
    end
  context.eTABLE()
\stopluacode
\stopbuffer

\typebuffer

Since in practice the speedup is negligible and the memory footprint is about the
same, such optimizations seldom make sense.

At some point this interface will be extended, for instance when we can use
\TEX's main (scanning, parsing and processing) loop as a so|-|called coroutine and
when we have opened up more of \TEX's internals. Of course, instead of putting
this in your \TEX\ source, you can just as well keep the code at the \LUA\ end.

\placefigure
  {The result of the shown \LUA\ code.}
  {\getbuffer}

The script that manages a \CONTEXT\ run (also called \type {context}) will
process files with the \type {cld} suffix automatically. You can also force
processing as \LUA\ with the flag \type {--forcecld}. \footnote {Similar methods
exist for processing \XML\ files.} The \type {mtxrun} script also recognizes
\type {cld} files and delegates the call to the \type {context} script.

\starttyping
context yourfile.cld
\stoptyping
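
A minimal sketch of what such a \type {cld} file could contain (the title and
text are only an example). The file holds plain \LUA\ code, so no \TEX\ wrapper
is needed:

\starttyping
-- yourfile.cld
context.starttext()
  context.chapter("Some title")
  context("This paragraph has been generated %s times.",1)
context.stoptext()
\stoptyping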

But will this replace \TEX\ as an input language? This is quite unlikely because
coding documents in \TEX\ is so convenient and there is not much to gain here. Of
course in a pure \LUA\ based workflow (for instance publishing information from
databases) it would be nice to code in \LUA, but even then it's mostly syntactic
sugar, as \TEX\ has to do the job anyway. However, eventually we will have a
quite mature \LUA\ counterpart.

\stopsection

\startsection[title={\XML}]

This is not so much a programming language as a method of tagging your
document content (or data). As structure is rather dominant in \XML, it is quite
handy for situations where we need different output formats and multiple tools
need to process the same data. It's also a standard, although this does not mean
that all documents you see are properly structured. This in turn means that we
need some manipulative power in \CONTEXT, and that happens to be easier to do in
\MKIV\ than in \MKII.

In \CONTEXT\ we have been supporting \XML\ for a long time, and in \MKIV\ we made
the switch from stream based to tree based processing. The current implementation
is mostly driven by what has been possible so far but as \LUATEX\ becomes more
mature, bits and pieces will be reimplemented (or at least cleaned up and brought
up to date with developments in \LUATEX).

One could argue that it makes more sense to use \XSLT\ for converting \XML\ into
something that \TEX\ can process, but in most of the cases that I have to deal
with, much effort goes into mapping structure onto a given layout specification.
Adding a bit of \XML\ to \TEX\ mapping to that directly is quite convenient. The
total amount of code is probably smaller and it saves a processing step.

We're mostly dealing with education|-|related documents and these tend to have a
more complex structure than the final typeset result shows. Also, readability of
code is not served by such a split as most mappings look messy anyway (or
evolve that way) due to the way the content is organized or elements get abused.

There is a dedicated manual for dealing with \XML\ in \MKIV, so we only show a
simple example here. The documents to be processed are loaded in memory and
serialized using setups that are associated with elements. We keep track of
documents and nodes in a way that permits multipass data handling (rather usual
in \TEX). Say that we have a document that contains questions. The following
definitions will flush the (root element) \type {questions}:

\starttyping
\startxmlsetups xml:mysetups
    \xmlsetsetup{#1}{questions}{xml:questions}
\stopxmlsetups

\xmlregistersetup{xml:mysetups}

\startxmlsetups xml:questions
    \xmlflush{#1}
\stopxmlsetups

\xmlprocessfile{main}{somefile.xml}{}
\stoptyping

Here the \type {#1} represents the current \XML\ element. Of course we need more
associations in order to get something meaningful. If we just serialize then we
have mappings like:

\starttyping
\xmlsetsetup{#1}{question|answer}{xml:*}
\stoptyping

So, questions and answers are mapped onto their own setup which flushes them,
probably with some numbering done on the spot.
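
To make this concrete, here is a self|-|contained sketch. The sample document,
the buffer name \type {demo} and the rendering chosen in the setups are made up
for this example; only the element names and the mapping mechanism follow the
text above.

\starttyping
\startbuffer[demo]
<questions>
  <question>
    <wording>How much is 1 + 1?</wording>
    <answer>2</answer>
  </question>
</questions>
\stopbuffer

% map each element onto the setup that carries its name
\startxmlsetups xml:mysetups
  \xmlsetsetup{#1}{questions|question|wording|answer}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetups}

\startxmlsetups xml:questions
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:question
  \blank
  \xmlflush{#1}
  \blank
\stopxmlsetups

\startxmlsetups xml:wording
  \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:answer
  {\bf\xmlflush{#1}}\par
\stopxmlsetups

\starttext
  \xmlprocessbuffer{main}{demo}{}
\stoptext
\stoptyping

Processing a buffer instead of a file keeps the sketch in one source; for a real
document \type {\xmlprocessfile} does the same job.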

In this mechanism \LUA\ is sort of invisible but quite busy as it is responsible
for loading, filtering, accessing and serializing the tree. In this case \TEX\
and \LUA\ hand over control in rapid succession.

You can hook in your own functions, like:

\starttyping
\xmlfilter{#1}{(wording|feedback|choice)/function(cleanup)}
\stoptyping

In this case the function \type {cleanup} is applied to elements with names that
match one of the three given. \footnote {This example is inspired by one of our
projects where the cleanup involves sanitizing (highly invalid) \HTML\ data that
is embedded as a \type {CDATA} stream, a trick to prevent the \XML\ file from
being invalid.}

Of course, once you start mixing in \LUA\ in this way, you need to know how we
deal with \XML\ at the \LUA\ end. The following function shows how we calculate
scores:

\starttyping
\startluacode
function xml.functions.totalscore(root)
  local n = 0
  for e in xml.collected(root,"/outcome") do
    if xml.filter(e,"action[text()='add']") then
      local m = xml.filter(e,"xml:///score/text()")
      n = n + (tonumber(m or 0) or 0)
    end
  end
  tex.write(n)
end
\stopluacode
\stoptyping

You can either use such a function in a filter or just use it as
a \TEX\ macro:

\starttyping
\startxmlsetups xml:question
  \blank
  \xmlfirst{#1}{wording}
  \startitemize
    \xmlfilter{#1}{/answer/choice/command(xml:answer:choice)}
  \stopitemize
  \endgraf
  score: \xmlfunction{#1}{totalscore}
  \blank
\stopxmlsetups

\startxmlsetups xml:answer:choice
    \startitem
        \xmlflush{#1}
    \stopitem
\stopxmlsetups
\stoptyping

The filter variant is like this:

\starttyping
\xmlfilter{#1}{./function('totalscore')}
\stoptyping

So you can take your choice and make your source look more \XML|-|ish,
\LUA|-|like or \TEX|-|wise. A careful reader might have noticed the peculiar
\type {xml://} in the function code. When used inside \MKIV, the serializer
defaults to \TEX\ so results are piped back into \TEX. This prefix forces the
regular serializer which keeps the result at the \LUA\ end.

Currently some of the \XML\ related modules, like \MATHML\ and handling of
tables, are really a mix of \TEX\ code and \LUA\ calls, but it makes sense to
move them completely to \LUA. One reason is that their input (formulas and table
content) is restricted to non|-|\TEX\ anyway. On the other hand, in order to be
able to share the implementation with \TEX\ input, it also makes sense to stick
to some hybrid approach. In any case, more of the calculations and logic will
move to \LUA, while \TEX\ will deal with the content.

A somewhat strange animal here is \XSLFO. We do support it, but the \MKII\
implementation was always somewhat limited and the code was quite complex. So,
this needs a proper rewrite in \MKIV, which will happen indeed. It's mostly a
nice exercise of hybrid technology but until now I never really needed it. Other
bits and pieces of the current \XML\ goodies might also get an upgrade.

There is already a bunch of functions and macros to filter and manipulate \XML\
content and currently the code involved is being cleaned up. What direction we go
also depends on users' demands. So, with respect to \XML\ you can expect more
support, a better integration and an upgrade of some supported \XML\ related
standards.

\stopsection

\startsection [title={Tools}]

Some of the tools that ship with \CONTEXT\ are also examples of hybrid usage.

Take this:

\starttyping
mtxrun --script server --auto
\stoptyping

On my machine this reports:

\starttyping
MTXrun | running at port: 31415
MTXrun | document root: c:/data/develop/context/lua
MTXrun | main index file: unknown
MTXrun | scripts subpath: c:/data/develop/context/lua
MTXrun | context services: http://localhost:31415/mtx-server-ctx-startup.lua
\stoptyping

The \type {mtxrun} script is a \LUA\ script that acts as a controller for other
scripts, in this case \type {mtx-server.lua} that is part of the regular
distribution. As we use \LUATEX\ as a \LUA\ interpreter and since \LUATEX\ has a
socket library built in, it can act as a web server, limited but quite adequate
for our purpose. \footnote {This application is not intentional but just a side
effect.}

The web page that pops up when you enter the given address currently lets you
choose between the \CONTEXT\ help system and a font testing tool. In \in {figure}
[fig:fonttest] you see an example of what the font testing tool does.

\placefigure
  [here]
  [fig:fonttest]
  {An example of using the font tester.}
  {\externalfigure[mtx-server-ctx-fonttest.png][width=\textwidth]}

Here we have \LUATEX\ running a simple web server but it's not aware of having
\TEX\ on board. When you click on one of the buttons at the bottom of the screen,
the server will load and execute a script related to the request and in this case
that script will create a \TEX\ file and call \LUATEX\ with \CONTEXT\ to process
that file. The result is piped back to the browser.

You can use this tool to investigate fonts (their bad and good habits) as well as
to test the currently available \OPENTYPE\ functionality in \MKIV\ (bugs as well
as goodies).

So again we have a hybrid usage although in this case the user is not confronted
with \LUA\ and|/|or \TEX\ at all. The same is true for the other goodie, shown in
\in {figure} [fig:help]. Actually, such a goodie has always been part of the
\CONTEXT\ distribution but it has been rewritten in \LUA.

\placefigure
  [here]
  [fig:help]
  {An example of a help screen for a command.}
  {\externalfigure[mtx-server-ctx-help.png][width=\textwidth]}

The \CONTEXT\ user interface is defined in an \XML\ file, and this file is used
for several purposes: it initializes the user interfaces at format generation
time, it is used to typeset the formal command references (for all relevant
interface languages), and it serves the wiki and the mentioned help goodie.

Using the mix of languages permits us to provide convenient processing of
documents that otherwise would demand more from the user than it does now. For
instance, imagine that we want to process a series of documents in the
so|-|called \EPUB\ format. Such a document is a zipped file that has a
description and resources. As the content of this archive is prescribed it's
quite easy to process it:

\starttyping
context --ctx=x-epub.ctx yourfile.epub
\stoptyping

This is equivalent to:

\starttyping
texlua mtxrun.lua --script context --ctx=x-epub.ctx yourfile.epub
\stoptyping

So, here we have \LUATEX\ running a script that itself (locates and) runs a
script \type {context}. That script loads a \CONTEXT\ job description file (with
suffix \type {ctx}). This file tells what styles to load and might have
additional directives but none of that has to bother the end user. In the
automatically loaded style we take care of reading the \XML\ files from the
zipped file and eventually map the embedded \HTML|-|like files onto style
elements and produce a \PDF\ file. So, we have \LUA\ managing a run and \MKIV,
with the help of \LUA, reading from zip files and converting \XML\ into something
that \TEX\ is happy with. As there is no standard with respect to the content
itself, i.e.\ the rendering is driven by whatever kind of structure is used and
whatever the \CSS\ file is able to map it onto, in practice we need an additional
style for this class of documents. But anyway it's a good example of integration.

\stopsection

\startsection [title={The future}]

Apart from these language related issues, what else is on the agenda? To mention
a few integration related thoughts:

\startitemize[packed]

\startitem
    At some point I want to explore the possibility to limit processing to just
    one run, for instance by doing trial runs without outputting anything but
    still collecting multipass information. This might save some runtime in
    demanding workflows especially when we keep extensive font loading and image
    handling in mind.
\stopitem

\startitem
    Related to this is the ability to run \MKIV\ as a service but that demands
    that we can reset the state of \LUATEX\ and actually it might not be worth
    the trouble at all given faster processors and disks. Also, it might not save
    much runtime on larger jobs.
\stopitem

\startitem
    More interesting can be to continue experimenting with isolating parts of
    \CONTEXT\ in such a way that one can construct a specialized subset of
    functionality. Of course the main body of code will always be loaded as one
    needs basic typesetting anyway.
\stopitem

\stopitemize

Of course we keep improving existing mechanisms and solutions using a mix
of \TEX\ and \LUA, using each language (and system) for what it can do best.

\stopsection

\stopchapter

\stopcomponent
