% language=us \startcomponent onandon-execute \environment onandon-environment \startchapter[title={Executing \TEX}] Much of the \LUA\ code in \CONTEXT\ originates from experiments. What survives in the source code is probably either used, waiting to be used, or kept for educational purposes. The functionality that we describe here has already been present for a while in \CONTEXT, but has been improved a little starting with \LUATEX\ 1.08 due to an extra helper. The code shown here is generic and is not used in \CONTEXT\ as such. Say that we have this code: \startbuffer for i=1,10000 do tex.sprint("1") tex.sprint("2") for i=1,3 do tex.sprint("3") tex.sprint("4") tex.sprint("5") end tex.sprint("\\space") end \stopbuffer \typebuffer % \ctxluabuffer When we call \type {\directlua} with this snippet we get some 30 pages of \type {12345345345}. The printed text is saved until the end of the \LUA\ call so basically we pipe some 170\,000 characters to \TEX\ that get interpreted as one paragraph. Now imagine this: \startbuffer \setbox0\hbox{xxxxxxxxxxx} \number\wd0 \stopbuffer \typebuffer which gives \getbuffer (the width of the \type {box0} register). If we check the box in \LUA, with: \startbuffer tex.sprint(tex.box[0].width) tex.sprint("\\enspace") tex.sprint("\\setbox0\\hbox{!}") tex.sprint(tex.box[0].width) \stopbuffer \typebuffer the result is {\tttf \ctxluabuffer} i.e. the same number repeated, which is not what you would expect at first sight. However, if you consider that we just pipe to a \TEX\ buffer that gets parsed \italic {after} the \LUA\ call, it will be clear that the reported width is each time the width that we started with. Our code will work all right if we use: \startbuffer tex.sprint(tex.box[0].width) tex.sprint("\\enspace") tex.sprint("\\setbox0\\hbox{!}") tex.sprint("\\directlua{tex.sprint(tex.box[0].width)}") \stopbuffer \typebuffer and now we get: {\tttf\ctxluabuffer}, but this use is a bit awkward. It's not that complex to write some support code that is convenient and this can work out quite well but there is a drawback. If we add references to the status of the input pointer: \startbuffer print(status.input_ptr) tex.sprint(tex.box[0].width) tex.sprint("\\enspace") tex.sprint("\\setbox0\\hbox{!}") tex.sprint("\\directlua{print(status.input_ptr)\ tex.sprint(tex.box[0].width)}") \stopbuffer \typebuffer we then get \type {6} and \type {7} reported. You can imagine that when a lot of nested \type {\directlua} calls happen, this can lead to an overflow of the input level or (depending on what we do) the input stack size. Ideally we want to do a \LUA\ call, temporarily go to \TEX, return to \LUA, etc.\ without needing to worry about nesting and possible crashes due to \LUA\ itself running into problems. One charming solution is to use so|-|called coroutines: independent \LUA\ threads that one can switch between --- you jump out from the current routine to another and from there back to the current one. However, when we use \type {\directlua} for that, we still have this nesting issue and what is worse, we keep nesting function calls too. This can be compared to: \starttyping \def\whatever{\ifdone\whatever\fi} \stoptyping where at some point \type {\ifdone} would be false so we quit, but we keep nesting when the condition is met and eventually we will end up with some nesting related overflow. The following: \starttyping \def\whatever{\ifdone\expandafter\whatever\fi} \stoptyping is less likely to overflow because there we have tail recursion which basically boils down to not nesting but continuing. Do we have something similar in \LUATEX\ for \LUA ? Yes, we do. We can register a function, for instance: \starttyping lua.get_functions_table()[1] = function() print("Hi there!") end \stoptyping and call that one with: \starttyping \luafunction 1 \stoptyping This is a bit faster than calling a function such as: \starttyping \directlua{HiThere()} \stoptyping which can also be achieved by \starttyping \directlua{print("Hi there!")} \stoptyping and is sometimes more convenient. Don't overestimate the gain in speed because \type {directlua} is quite efficient too (and on an average run a user doesn't call it that often, millions of times that is). Anyway, a function call is what we can use for our purpose as it doesn't involve interpretation and effectively behaves like a tail call. The following snippet shows what we have in mind: \startbuffer[demo] tex.routine(function() tex.sprint(tex.box[0].width) tex.sprint("\\enspace") tex.sprint("\\setbox0\\hbox{!}") tex.yield() tex.sprint(tex.box[0].width) end) \stopbuffer \typebuffer[demo] \startbuffer[code] local stepper = nil local stack = { } local fid = 2 -- make sure to take a free slot local goback = "\\luafunction" .. fid .. "\\relax" function tex.resume() if coroutine.status(stepper) == "dead" then stepper = table.remove(stack) end if stepper then coroutine.resume(stepper) end end lua.get_functions_table()[fid] = tex.resume function tex.yield() tex.sprint(goback) coroutine.yield() texio.closeinput() end function tex.routine(f) table.insert(stack,stepper) stepper = coroutine.create(f) tex.sprint(goback) end -- Because we protect against abuse and overload of functions, in ConTeXt we -- need to do the following: if context then fid = context.functions.register(tex.resume) goback = "\\luafunction" .. fid .. "\\relax" end \stopbuffer We start a routine, jump out to \TEX\ in the middle, come back when we're done and continue. This gives us: \ctxluabuffer [code,demo], which is what we expect. % What does this accomplish (or is it left over)? % % \setbox0\hbox{xxxxxxxxxxx} % % \ctxluabuffer[demo] This mechanism permits efficient (nested) loops like: \startbuffer[demo] tex.routine(function() for i=1,10000 do tex.sprint("1") tex.yield() tex.sprint("2") tex.routine(function() for i=1,3 do tex.sprint("3") tex.yield() tex.sprint("4") tex.yield() tex.sprint("5") end end) tex.sprint("\\space") tex.yield() end end) \stopbuffer \typebuffer[demo] We do create coroutines, go back and forwards between \LUA\ and \TEX, but avoid memory being filled up with printed content. If we flush paragraphs (instead of e.g.\ the space) then the main difference is that instead of a small delay due to the loop unfolding in a large set of prints and accumulated content, we now get a steady flushing and processing. However, even using this scheme we can still have an overflow of input buffers because we still nest them: the limitation at the \TEX\ end has moved to a limitation at the \LUA\ end. How come? Here is the code that we use defining the function \type {tex.yield()}: \typebuffer[code] The \type {routine} creates a coroutine, and \type {yield} gives control to \TEX. The \type {resume} is done at the \TEX\ end when we're finished there. In practice this works fine and when you permit enough nesting and levels in \TEX\ then you will not easily overflow. When I picked up this side project and wondered how to get around it, it suddenly struck me that if we could just quit the current input level then nesting would not be a problem. Adding a simple helper to the engine made that possible (of course figuring this out took a while): \startbuffer[code] local stepper = nil local stack = { } local fid = 3 -- make sure to take a frees slot local goback = "\\luafunction" .. fid .. "\\relax" function tex.resume() if coroutine.status(stepper) == "dead" then stepper = table.remove(stack) end if stepper then coroutine.resume(stepper) end end lua.get_functions_table()[fid] = tex.resume if texio.closeinput then function tex.yield() tex.sprint(goback) coroutine.yield() texio.closeinput() end else function tex.yield() tex.sprint(goback) coroutine.yield() end end function tex.routine(f) table.insert(stack,stepper) stepper = coroutine.create(f) tex.sprint(goback) end -- Again we need to do it as follows in ConTeXt: if context then fid = context.functions.register(tex.resume) goback = "\\luafunction" .. fid .. "\\relax" end \stopbuffer \ctxluabuffer[code] \typebuffer[code] The trick is in \type {texio.closeinput}, a recent helper to the engine and one that should be used with care. We assume that the user knows what she or he is doing. On an older laptop with a i7-3840 processor running \WINDOWS\ 10 the following snippet takes less than 0.35 seconds with \LUATEX\ and 0.26 seconds with \LUAJITTEX. \startbuffer[code] tex.routine(function() for i=1,10000 do tex.sprint("\\setbox0\\hpack{x}") tex.yield() tex.sprint(tex.box[0].width) tex.routine(function() for i=1,3 do tex.sprint("\\setbox0\\hpack{xx}") tex.yield() tex.sprint(tex.box[0].width) end end) end end) \stopbuffer \typebuffer[code] % \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime Say that we were to run the bad snippet: \startbuffer[code] for i=1,10000 do tex.sprint("\\setbox0\\hpack{x}") tex.sprint(tex.box[0].width) for i=1,3 do tex.sprint("\\setbox0\\hpack{xx}") tex.sprint(tex.box[0].width) end end \stopbuffer \typebuffer[code] % \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime This executes in only 0.12 seconds in both engines. So what if we run this: \startbuffer[code] \dorecurse{10000}{% \setbox0\hpack{x} \number\wd0 \dorecurse{3}{% \setbox0\hpack{xx} \number\wd0 }% } \stopbuffer \typebuffer[code] % \testfeatureonce {1} {\setbox0\hpack{\getbuffer[code]}} \elapsedtime Pure \TEX\ needs 0.30 seconds for both engines but there we lose 0.13 seconds on the loop code. In the \LUA\ example where we yield, the loop code takes hardly any time. As we need only 0.05 seconds more it demonstrates that when we use the power of \LUA, the performance hit of the switch is quite small: we yield 40.000 times! In general, such differences are far exceeded by the overhead: the time needed to typeset the content (which \type {\hpack} doesn't do), breaking paragraphs into lines, constructing pages and other overhead involved in the run. In \CONTEXT\ we use a slightly different variant which has 0.30 seconds more overhead, but that is probably true for all \LUA\ usage in \CONTEXT, but again, it disappears in other runtime. Here is another example: \startbuffer[code] \def\TestWord#1% {\directlua{ tex.routine(function() tex.sprint("\\setbox0\\hbox{\\tttf #1}") tex.yield() tex.sprint(math.round(100 * tex.box[0].width/tex.hsize)) tex.sprint(" percent of the hsize: ") tex.sprint("\\box0") end) }} \stopbuffer \typebuffer[code] \getbuffer[code] \startbuffer The width of next word is \TestWord {inline}! \stopbuffer \typebuffer \getbuffer Now, in order to stay realistic, this macro can also be defined as: \startbuffer[code] \def\TestWord#1% {\setbox0\hbox{\tttf #1}% \directlua{ tex.sprint(math.round(100 * tex.box[0].width/tex.hsize)) } % percent of the hsize: \box0\relax} \stopbuffer \typebuffer[code] We get the same result: \quotation {\getbuffer}. We have been using a \LUA|-|\TEX\ mix for over a decade now in \CONTEXT\ and have never really needed this mixed model. There are a few places where we could (have) benefited from it and now we might use it in a few places, but so far we have done fine without it. In fact, in most cases typesetting can be done fine at the \TEX\ end. It's all a matter of imagination. \stopchapter \stopcomponent