% language=us

\startcomponent onandon-execute

\environment onandon-environment

\startchapter[title={Executing \TEX}]

Much of the \LUA\ code in \CONTEXT\ originates from experiments. What survives in
the source code is probably either used, waiting to be used, or kept for
educational purposes. The functionality that we describe here has already been
present for a while in \CONTEXT, but has been improved a little starting with
\LUATEX\ 1.08 due to an extra helper. The code shown here is generic and is not
used in \CONTEXT\ as such.

Say that we have this code:

\startbuffer
for i=1,10000 do
    tex.sprint("1")
    tex.sprint("2")
    for i=1,3 do
        tex.sprint("3")
        tex.sprint("4")
        tex.sprint("5")
    end
    tex.sprint("\\space")
end
\stopbuffer

\typebuffer

% \ctxluabuffer

When we call \type {\directlua} with this snippet we get some 30 pages of \type
{12345345345}. The printed text is saved until the end of the \LUA\ call so
basically we pipe some 170\,000 characters to \TEX\ that get interpreted as one
paragraph.
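
The character count can be checked with a small plain \LUA\ sketch that runs
outside \TEX. It assumes a hypothetical \type {collect} function that stands in
for \type {tex.sprint} and merely stores what would otherwise be piped:

\starttyping
-- Stand-in for tex.sprint (hypothetical): store instead of piping to TeX.
local pieces = { }

local function collect(s)
    pieces[#pieces+1] = s
end

for i=1,10000 do
    collect("1")
    collect("2")
    for j=1,3 do
        collect("3")
        collect("4")
        collect("5")
    end
    collect("\\space")
end

-- Everything is still pending when the loop is done.
local pending = table.concat(pieces)
print(#pending) -- 170000
\stoptyping

Each outer iteration contributes 11 digits plus the six characters of \type
{\space}, so we end up with 17 times 10\,000 characters.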

Now imagine this:

\startbuffer
\setbox0\hbox{xxxxxxxxxxx} \number\wd0
\stopbuffer

\typebuffer

which gives \getbuffer (the width of the \type {box0} register). If we check the
box in \LUA, with:

\startbuffer
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint(tex.box[0].width)
\stopbuffer

\typebuffer

the result is {\tttf \ctxluabuffer}, i.e.\ the same number repeated, which is not
what you would expect at first sight. However, if you consider that we just pipe
to a \TEX\ buffer that gets parsed \italic {after} the \LUA\ call, it becomes
clear that each reported width is the width that we started with. Our code will
work all right if we use:

\startbuffer
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint("\\directlua{tex.sprint(tex.box[0].width)}")
\stopbuffer

\typebuffer

and now we get: {\tttf\ctxluabuffer}, but this usage is a bit awkward.
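
Why the first attempt reports the same width twice can be mimicked in plain
\LUA. The sketch below is only a toy model and not the \LUATEX\ interface: the
names \type {queue}, \type {sprint} and \type {width} are made up, and a
function in the queue stands in for a \TEX\ command that changes the box.

\starttyping
-- Toy model, not the LuaTeX API: printed items are queued and only
-- interpreted after the Lua chunk has finished.
local queue = { }
local width = 100 -- pretend width of box 0 before the call

local function sprint(item)
    queue[#queue+1] = item
end

-- The "Lua call": both reads of width happen before TeX sees anything.
sprint(tostring(width))
sprint(function() width = 5 end) -- stands in for \setbox0\hbox{!}
sprint(tostring(width))

-- The "TeX end": only now is the queue interpreted.
local seen = { }
for _, item in ipairs(queue) do
    if type(item) == "function" then
        item()
    else
        seen[#seen+1] = item
    end
end

local result = table.concat(seen," ")
print(result) -- 100 100
\stoptyping

The box does change in the end, but only after both values have been printed.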

It's not that complex to write some support code that makes this convenient, and
that can work out quite well, but there is a drawback. If we add references to
the status of the input pointer:

\startbuffer
print(status.input_ptr)
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint("\\directlua{print(status.input_ptr)\
    tex.sprint(tex.box[0].width)}")
\stopbuffer

\typebuffer

we then get \type {6} and \type {7} reported. You can imagine that when a lot of
nested \type {\directlua} calls happen, this can lead to an overflow of the input
level or (depending on what we do) the input stack size. Ideally we want to do a
\LUA\ call, temporarily go to \TEX, return to \LUA, etc.\ without needing to
worry about nesting and possible crashes due to \LUA\ itself running into
problems. One charming solution is to use so|-|called coroutines, independent
\LUA\ threads that one can switch between: you jump out from the current
routine to another and from there back to the current one. However, when we use
\type {\directlua} for that, we still have this nesting issue and what is worse,
we keep nesting function calls too. This can be compared to:

\starttyping
\def\whatever{\ifdone\whatever\fi}
\stoptyping

where at some point \type {\ifdone} will be false so that we quit, but as long as
the condition is met we keep nesting and eventually we will end up with some
nesting related overflow. The following:

\starttyping
\def\whatever{\ifdone\expandafter\whatever\fi}
\stoptyping

is less likely to overflow because there we have tail recursion which basically
boils down to not nesting but continuing. Do we have something similar in
\LUATEX\ for \LUA? Yes, we do. We can register a function, for instance:

\starttyping
lua.get_functions_table()[1] = function() print("Hi there!") end
\stoptyping

and call that one with:

\starttyping
\luafunction 1
\stoptyping

This is a bit faster than calling a function such as:

\starttyping
\directlua{HiThere()}
\stoptyping

which can also be achieved by

\starttyping
\directlua{print("Hi there!")}
\stoptyping

and is sometimes more convenient. Don't overestimate the gain in speed because
\type {\directlua} is quite efficient too (and on an average run a user doesn't
call it that often, millions of times that is). Anyway, a function call is what
we can use for our purpose as it doesn't involve interpretation and effectively
behaves like a tail call. The following snippet shows what we have in mind:

\startbuffer[demo]
tex.routine(function()
    tex.sprint(tex.box[0].width)
    tex.sprint("\\enspace")
    tex.sprint("\\setbox0\\hbox{!}")
    tex.yield()
    tex.sprint(tex.box[0].width)
end)
\stopbuffer

\typebuffer[demo]

\startbuffer[code]
local stepper = nil
local stack   = { }
local fid     = 2 -- make sure to take a free slot
local goback  = "\\luafunction" .. fid .. "\\relax"

function tex.resume()
    if coroutine.status(stepper) == "dead" then
        stepper = table.remove(stack)
    end
    if stepper then
        coroutine.resume(stepper)
    end
end

lua.get_functions_table()[fid] = tex.resume

function tex.yield()
    tex.sprint(goback)
    coroutine.yield()
end

function tex.routine(f)
    table.insert(stack,stepper)
    stepper = coroutine.create(f)
    tex.sprint(goback)
end

-- Because we protect against abuse and overload of functions, in ConTeXt we
-- need to do the following:

if context then
    fid    = context.functions.register(tex.resume)
    goback = "\\luafunction" .. fid .. "\\relax"
end
\stopbuffer

We start a routine, jump out to \TEX\ in the middle, come back when we're done
and continue. This gives us: \ctxluabuffer [code,demo], which is what we expect.

% What does this accomplish (or is it left over)?
%
% \setbox0\hbox{xxxxxxxxxxx}
%
% \ctxluabuffer[demo]

This mechanism permits efficient (nested) loops like:

\startbuffer[demo]
tex.routine(function()
    for i=1,10000 do
        tex.sprint("1")
        tex.yield()
        tex.sprint("2")
        tex.routine(function()
            for i=1,3 do
                tex.sprint("3")
                tex.yield()
                tex.sprint("4")
                tex.yield()
                tex.sprint("5")
            end
        end)
        tex.sprint("\\space")
        tex.yield()
    end
end)
\stopbuffer

\typebuffer[demo]

We do create coroutines and go back and forth between \LUA\ and \TEX, but we
avoid memory being filled up with printed content. If we flush paragraphs
(instead of e.g.\ the space) then the main difference is that instead of a small
delay due to the loop unfolding in a large set of prints and accumulated content,
we now get a steady flushing and processing.
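
The difference between accumulating and steady flushing can be illustrated in
plain \LUA, without \TEX. In this sketch the consumer loop plays the role of
\TEX, it yields after every piece (more often than the example above), and the
names \type {pieces}, \type {total} and \type {largest} are made up for the
occasion:

\starttyping
-- A coroutine hands over one piece at a time, so the consumer never
-- holds more than the piece that it is currently processing.
local function pieces()
    return coroutine.wrap(function()
        for i=1,10000 do
            coroutine.yield("1")
            coroutine.yield("2")
            for j=1,3 do
                coroutine.yield("3")
                coroutine.yield("4")
                coroutine.yield("5")
            end
            coroutine.yield("\\space")
        end
    end)
end

local total, largest = 0, 0
for piece in pieces() do
    total = total + #piece
    if #piece > largest then
        largest = #piece
    end
end

print(total,largest) -- 170000 and 6
\stoptyping

The same 170\,000 characters pass by, but at no point is more than one small
piece waiting to be processed.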

However, even using this scheme we can still have an overflow of input buffers
because we still nest them: the limitation at the \TEX\ end has moved to a
limitation at the \LUA\ end. How come? Here is the code that we use to define the
function \type {tex.yield()}:

\typebuffer[code]

The \type {routine} creates a coroutine, and \type {yield} gives control to \TEX.
The \type {resume} is done at the \TEX\ end when we're finished there. In
practice this works fine and when you permit enough nesting and levels in \TEX\
then you will not easily overflow.
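
The interplay between these three functions can be followed in a plain \LUA\ toy
model. This is a sketch under our own assumptions and not the engine's
implementation: the loop in \type {run} plays the role of \TEX, a \type {GOBACK}
marker stands in for the \type {\luafunction} call, and every \LUA\ call pushes
what it printed as a new input level.

\starttyping
local GOBACK  = { }  -- marker that stands in for \luafunction
local levels  = { }  -- stack of input levels
local pending = nil  -- what the current Lua call has printed
local output  = { }
local width   = 100  -- pretend width of box 0
local deepest = 0
local stepper = nil
local stack   = { }

local function sprint(item)
    pending[#pending+1] = item
end

local function resume()
    if stepper and coroutine.status(stepper) == "dead" then
        stepper = table.remove(stack)
    end
    if stepper then
        coroutine.resume(stepper)
    end
end

local function yield()
    sprint(GOBACK)
    coroutine.yield()
end

local function routine(f)
    table.insert(stack,stepper)
    stepper = coroutine.create(f)
    sprint(GOBACK)
end

local function call(f) -- run Lua code, then hand its prints to "TeX"
    pending = { }
    f()
    levels[#levels+1] = { items = pending, pos = 0 }
    if #levels > deepest then
        deepest = #levels
    end
end

local function run()
    while #levels > 0 do
        local top = levels[#levels]
        top.pos = top.pos + 1
        local item = top.items[top.pos]
        if item == nil then
            levels[#levels] = nil -- this level is exhausted
        elseif item == GOBACK then
            call(resume)          -- back to Lua
        elseif type(item) == "function" then
            item()                -- a "TeX command"
        else
            output[#output+1] = tostring(item)
        end
    end
end

call(function()
    routine(function()
        sprint(width)
        sprint(function() width = 5 end) -- stands in for \setbox0\hbox{!}
        yield()
        sprint(width)
    end)
end)
run()

local result = table.concat(output," ")
print(result,deepest) -- 100 5 and 3
\stoptyping

In this model an exhausted level is only removed when control returns to it, so
every yield adds a level. That mirrors the nesting mentioned above.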

When I picked up this side project and wondered how to get around it, it suddenly
struck me that if we could just quit the current input level then nesting would
not be a problem. Adding a simple helper to the engine made that possible (of
course figuring this out took a while):

\startbuffer[code]
local stepper = nil
local stack   = { }
local fid     = 3 -- make sure to take a free slot
local goback  = "\\luafunction" .. fid .. "\\relax"

function tex.resume()
    if coroutine.status(stepper) == "dead" then
        stepper = table.remove(stack)
    end
    if stepper then
        coroutine.resume(stepper)
    end
end

lua.get_functions_table()[fid] = tex.resume

if texio.closeinput then
    function tex.yield()
        tex.sprint(goback)
        coroutine.yield()
        texio.closeinput()
    end
else
    function tex.yield()
        tex.sprint(goback)
        coroutine.yield()
    end
end

function tex.routine(f)
    table.insert(stack,stepper)
    stepper = coroutine.create(f)
    tex.sprint(goback)
end

-- Again we need to do it as follows in ConTeXt:

if context then
    fid     = context.functions.register(tex.resume)
    goback  = "\\luafunction" .. fid .. "\\relax"
end
\stopbuffer

\ctxluabuffer[code]

\typebuffer[code]

The trick is in \type {texio.closeinput}, a recent helper in the engine and one
that should be used with care. We assume that the user knows what she or he is
doing. On an older laptop with an i7-3840 processor running \WINDOWS\ 10 the
following snippet takes less than 0.35 seconds with \LUATEX\ and 0.26 seconds
with \LUAJITTEX.

\startbuffer[code]
tex.routine(function()
    for i=1,10000 do
        tex.sprint("\\setbox0\\hpack{x}")
        tex.yield()
        tex.sprint(tex.box[0].width)
        tex.routine(function()
            for i=1,3 do
                tex.sprint("\\setbox0\\hpack{xx}")
                tex.yield()
                tex.sprint(tex.box[0].width)
            end
        end)
    end
end)
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime

Say that we were to run the bad snippet:

\startbuffer[code]
for i=1,10000 do
    tex.sprint("\\setbox0\\hpack{x}")
    tex.sprint(tex.box[0].width)
    for i=1,3 do
        tex.sprint("\\setbox0\\hpack{xx}")
        tex.sprint(tex.box[0].width)
    end
end
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime

This executes in only 0.12 seconds in both engines. So what if we run this:

\startbuffer[code]
\dorecurse{10000}{%
    \setbox0\hpack{x}
    \number\wd0
    \dorecurse{3}{%
        \setbox0\hpack{xx}
        \number\wd0
    }%
}
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\getbuffer[code]}} \elapsedtime

Pure \TEX\ needs 0.30 seconds for both engines but there we lose 0.13 seconds on
the loop code. In the \LUA\ example where we yield, the loop code takes hardly
any time. As the yielding variant needs only 0.05 seconds more than the pure
\TEX\ run, it demonstrates that when we use the power of \LUA, the performance
hit of the switch is quite small: we yield 40\,000 times! In general, such
differences are far exceeded by the overhead: the time needed to typeset the
content (which \type {\hpack} doesn't do), breaking paragraphs into lines,
constructing pages and other overhead involved in the run. In \CONTEXT\ we use a
slightly different variant which has 0.30 seconds more overhead. That is probably
true for all \LUA\ usage in \CONTEXT, but again, it disappears in the rest of the
runtime.
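
To put these numbers in perspective, the following plain \LUA\ sketch counts the
yields and derives a rough cost per switch. The timings are the ones reported
above (0.35 seconds being an upper bound); the per|-|yield figures are only
back|-|of|-|the|-|envelope estimates, and the variable names are made up.

\starttyping
-- Timings as reported in the text, in seconds.
local yielding = 0.35 -- coroutine variant in LuaTeX (upper bound)
local batched  = 0.12 -- plain Lua loop without yields
local puretex  = 0.30 -- the \dorecurse variant

-- Count the yields instead of assuming their number.
local yields = 0
for i=1,10000 do
    yields = yields + 1     -- after the outer \setbox
    for j=1,3 do
        yields = yields + 1 -- after each inner \setbox
    end
end

local function microseconds(seconds)
    return string.format("%.2f",seconds / yields * 1e6)
end

print(yields)                           -- 40000
print(microseconds(yielding - puretex)) -- 1.25
print(microseconds(yielding - batched)) -- 5.75
\stoptyping

So, depending on what we compare with, a switch costs in the order of one to six
microseconds on this machine.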

Here is another example:

\startbuffer[code]
\def\TestWord#1%
  {\directlua{
     tex.routine(function()
       tex.sprint("\\setbox0\\hbox{\\tttf #1}")
       tex.yield()
       tex.sprint(math.round(100 * tex.box[0].width/tex.hsize))
       tex.sprint(" percent of the hsize: ")
       tex.sprint("\\box0")
     end)
  }}
\stopbuffer

\typebuffer[code] \getbuffer[code]

\startbuffer
The width of the next word is \TestWord {inline}!
\stopbuffer

\typebuffer \getbuffer

Now, in order to stay realistic, this macro can also be defined as:

\startbuffer[code]
\def\TestWord#1%
  {\setbox0\hbox{\tttf #1}%
   \directlua{
      tex.sprint(math.round(100 * tex.box[0].width/tex.hsize))
   } %
   percent of the hsize: \box0\relax}
\stopbuffer

\typebuffer[code]

We get the same result: \quotation {\getbuffer}.
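
Keep in mind that \type {math.round} is not part of standard \LUA; here we assume
that it comes with the \CONTEXT\ libraries. In a generic setup a stand|-|in could
look like the following sketch, where the name \type {round} and the sample
dimensions (in scaled points) are made up for the occasion:

\starttyping
-- Stand-in for math.round: rounds to the nearest integer, halves go up.
local function round(x)
    return math.floor(x + 0.5)
end

-- Sample values in scaled points (65536sp = 1pt), made up for this sketch.
local width = 2064384  -- 31.5pt, pretend width of box 0
local hsize = 26214400 -- 400pt, pretend hsize

local percent = round(100 * width / hsize)
print(percent) -- 8
\stoptyping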

We have been using a \LUA|-|\TEX\ mix for over a decade now in \CONTEXT\ and have
never really needed this mixed model. There are a few places where we could
(have) benefited from it and now we might use it in a few places, but so far we
have done fine without it. In fact, in most cases typesetting can be done fine at
the \TEX\ end. It's all a matter of imagination.

\stopchapter

\stopcomponent
421