% language=us
% source: about-luafunctions.tex, last modification: 2023-12-21 09:43

\startcomponent about-properties

\environment about-environment

\startchapter[title=Functions]

\startsection[title=Introduction]

As part of the crited project Luigi and I also tried to identify weak spots in
the engine and although we found some issues not all were dealt with because
complicating the machinery makes no sense. However, just like the new \type
{properties} mechanism provides a really simple way to associate extra \LUA\ data
with a node without bothering about freeing it when a node is flushed, the
\type {luafunctions} mechanism described here provides an additional and fast
way to cross the \TEX||\LUA\ boundary.

\stopsection

\startsection[title=Callbacks]

In \LUATEX\ we can create more functionality by using \LUA\ which means that we
end up (at least in \CONTEXT) with a constant switching between \TEX\ macro
expansion and \LUA\ code interpretation. The magic word in this process is \type
{callback} and there are two variants:

\startitemize

\startitem At well defined moments in processing its input and node lists, \TEX\
will check if a specific callback is defined and if so, it will run that code.
\stopitem

\startitem As part of the input you can have a \type {\directlua} command and
that one gets expanded and processed. It can print back content into the current
input buffer. \footnote {Currently this process is somewhat more complex than
needed, which is a side effect of supporting multiple \LUA\ states in the first
versions of \LUATEX. We will clean up this mechanism at some point.} \stopitem

\stopitemize

The first type is called a \quote {direct} callback because \TEX\ calls it
directly, and the second one is an \quote {indirect} one (even if the command is
\type {\directlua}). The latter has a deferred cousin \type {\latelua} that
results in a node being inserted that will become a \LUA\ call during shipout,
when the page is turned into a \PDF\ stream.

A callback of the first category is pretty fast because the code is already
translated into \LUA\ bytecode. Checking if a callback has been assigned at all is
fast too. The second variant is slower because each time the input has to be
interpreted and checked for validity. Then there is of course some overhead in
making the call itself.
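
The variants can be sketched as follows. This is only an illustration and
assumes a plain \LUATEX\ format, because \CONTEXT\ manages callbacks itself and
does not expect users to register them directly. The choice of the \type
{process_input_buffer} callback and the logged messages are assumptions made for
the example. The registered function returns nothing, so each input line is
passed on unchanged. The first call below installs a direct callback, the second
one is an indirect call that is executed immediately, and the third one is
deferred until shipout.

\starttyping
% assumption: plain LuaTeX, where callback.register is available to users
\directlua{
    callback.register("process_input_buffer", function(line)
        texio.write_nl("log", "input: " .. line)
    end)
}

\directlua{tex.print("[now]")}

\latelua{texio.write_nl("log", "[at shipout]")}
\stoptyping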

There is a subtle aspect there. If you have a document that needs say ten calls
like:

\starttyping
\directlua{tex.print("[x]")}
\stoptyping

and you have these calls inlined, you end up with ten conversions into
tokens (\TEX's internal view) and ten conversions back to a string that gets
fed into \LUA. On the other hand,

\starttyping
\def\MyCall{\directlua{tex.print("[x]")}}
\stoptyping

where we call \type {\MyCall} ten times is more efficient because we have already
tokenized the \type {\directlua}. If we have

\starttyping
foo foo foo \directlua{tex.print("[1]")} ...
bar bar bar \directlua{tex.print("[2]")} ...
\stoptyping

it makes sense to wrap this into a definition:

\starttyping
\def\MyCall#1{\directlua{tex.print("[#1]")}}
\stoptyping

and use:

\starttyping
foo foo foo \MyCall{1} bar bar bar \MyCall{2} ...
\stoptyping

Of course this is not unique to \type {\directlua} and to be honest, apart from
convenience (read: less input) the gain can often be neglected. Because a macro
package wraps functionality in (indeed) macros we already save ourselves the
tokenization step. We can save some time by wrapping more in a function at the
\LUA\ end:

\starttyping
\startluacode
function MyFloat(f)
    tex.print(string.format("%0.5f",f))
end
\stopluacode

\def\MyFloat#1%
  {\directlua{MyFloat(#1)}}
\stoptyping

This is somewhat more efficient than:

\starttyping
\def\MyFloat#1%
  {\directlua{tex.print(string.format("\letterpercent0.5f",#1))}}
\stoptyping

Of course this only holds when we call this macro many times.

\stopsection

\startsection[title=Shortcuts]

When we talk of \quote {often} or \quote {a lot} we mean many thousands of calls.
There are some places in \CONTEXT\ where this is indeed the case, for instance
when we process large registers in critical editions: a few hundred pages of
references generated in \LUA\ is no exception there. Think of the following:

\starttyping
\startluacode
function GetTitle(n)
    tex.print(Entries[n].title)
end
\stopluacode

\def\GetTitle#1%
  {\directlua{GetTitle(#1)}}
\stoptyping

If we call \type {\GetTitle} ourselves it's the same as the \type {\MyFloat}
example, but how about this:

\starttyping
\def\GetTitle#1%
  {{\bf \directlua{GetTitle(#1)}}}

\startluacode
function GetTitle(n)
    tex.print(Entries[n].title)
end

function GetEntry(n)
    if Entries[n] then
        tex.print("\\directlua{GetTitle(",n,")}")
        -- some more action
    end
end
\stopluacode
\stoptyping

Here we have two calls where one is delayed till a later time. This delay results
in a tokenization and translation to \LUA\ so it will cost time. A way out is this:

\starttyping
\def\GetTitle#1%
  {{\bf \luafunction#1}}

\startluacode
local functions = tex.get_functions_table()

function GetTitle(n)
    tex.print(Entries[n].title)
end

function GetEntry(n)
    if Entries[n] then
        local m = #functions+1
        functions[m] = function() GetTitle(n) end
        tex.print("\\GetTitle{",m,"}")
        -- some more action
    end
end
\stopluacode
\stoptyping

We define a function at the \LUA\ end and just print a macro call. That call itself
calls the defined function using \type {\luafunction}. For a large number
of calls this is more efficient but it will be clear that you need to make sure that
used functions are cleaned up. A simple way is to start again at slot one after (say)
100,000 functions; another method is to reset used functions and keep counting.

\starttyping
\startluacode
local functions = tex.get_functions_table()

function GetTitle(n)
    tex.print(Entries[n].title)
end

function GetEntry(n)
    if Entries[n] then
        local m = #functions+1
        functions[m] = function(slot) -- the slot number is always
            GetTitle(n)               -- passed as argument so that
            functions[slot] = nil     -- we can reset easily
        end
        tex.print("\\GetTitle{",m,"}")
        -- some more action
    end
end
\stopluacode
\stoptyping
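
The other strategy mentioned above, starting again at slot one, could look
roughly as follows. This is only a sketch: the limit \type {maxslot}, the counter
\type {slot} and the fallback lookup of the table accessor are assumptions (the
\LUATEX\ manual lists the accessor as \type {lua.get_functions_table}). It reuses
\type {GetTitle}, \type {\GetTitle} and \type {Entries} from the examples above.
Reusing slots this way is only safe when a function has been called before its
slot comes around again, and in \CONTEXT\ the table is managed by the macro
package, so you would not do this yourself.

\starttyping
\startluacode
-- assumption: try the accessor from the LuaTeX manual first, then the
-- one used in the examples above
local functions = (lua.get_functions_table or tex.get_functions_table)()

local maxslot = 100000 -- hypothetical limit
local slot    = 0      -- last slot handed out

function GetEntry(n)
    if Entries[n] then
        slot = slot % maxslot + 1 -- 1 .. maxslot, then back to 1
        functions[slot] = function() GetTitle(n) end
        tex.sprint("\\GetTitle{" .. slot .. "}")
        -- some more action
    end
end
\stopluacode
\stoptyping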

As you can expect, in \CONTEXT\ users are not expected to deal directly with
functions at all. For years already you can do this:

\starttyping
\def\GetTitle#1%
  {{\bf#1}}

\startluacode
function GetEntry(n)
    if Entries[n] then
        context(function() context.GetTitle(Entries[n].title) end)
        -- some more action
    end
end
\stopluacode
\stoptyping

Up to \LUATEX\ 0.78 we had a \CONTEXT\ specific implementation of functions and
from 0.79 onwards we use this new mechanism but users won't see that in practice.
In the \type {cld-mkiv.pdf} manual you can find more about accessing \CONTEXT\
from the \LUA\ end.

Keep in mind that \type {\luafunction} is not that clever: it doesn't pick up
arguments. That will be part of future more extensive token handling but of
course that will then also be a real slowdown because a mix of \TEX\
tokenization and serialization is suboptimal (we already did extensive tests
with that).

\stopsection

\startsection[title=Helpers]

The above mechanism demands some orchestration in the macro package. For instance
freeing slots should be consistent and therefore users should not mess directly
with the functions table. If you really want to use this feature you can best do this:

\starttyping
\startctxfunction MyFunctionA
    context(" A1 ")
\stopctxfunction

\startctxfunctiondefinition MyFunctionB
    context(" B2 ")
\stopctxfunctiondefinition

\starttext
    \dorecurse{10000}{\ctxfunction{MyFunctionA}}   \page
    \dorecurse{10000}{\MyFunctionB}                \page
    \dorecurse{10000}{\ctxlua{context(" C3 ")}}    \page
    \dorecurse{10000}{\ctxlua{tex.sprint(" D4 ")}} \page
\stoptext
\stoptyping

In case you're curious about performance, here are timings. Given that we have
10,000 calls in each test the gain is rather negligible, especially because the
whole run takes 2.328 seconds for 52 processed pages resulting in 22.4 pages per
second. The real gain is in more complex calls with more tokens involved and in
\CONTEXT\ we have some places where we run into the hundreds of thousands. A
similar situation occurs when your input comes from databases and is fetched
stepwise.

\starttabulate[|c|c|c|c|]
\NC \bf A \NC \bf B \NC \bf C \NC \bf D \NC \NR
\NC 0.053 \NC 0.044 \NC 0.081 \NC 0.081 \NC \NR
\stoptabulate

So, we can save close to 50\% of the runtime of these calls, but on a simple
document like this that amounts to only a few percent of the total, which is
not that much. Of course many such few percentages can add up, and it's one of
the reasons why \CONTEXT\ \MKIV\ is pretty fast in spite of all the switching
between \TEX\ and \LUA. One objective is that an average complex document should
be processed at a rate of at least 20 pages per second and in most cases we
succeed. This fast function access can of course trigger new features in
\CONTEXT, ones we didn't consider useful because of overhead.
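
Worked out with the timings from the table (presumably in seconds, like the
total run time), comparing the fastest variant B with the \type {\ctxlua}
variants C and D, first relative to the calls themselves and then relative to
the whole run of 2.328 seconds:

\startformula
\frac{0.081 - 0.044}{0.081} \approx 0.46
\qquad
\frac{0.081 - 0.044}{2.328} \approx 0.016
\stopformula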

Keep in mind that in most cases, especially when programming in \LUA\ directly,
the \type {context} command already does all kinds of housekeeping for you. For
instance it also keeps track of so called trial typesetting runs and can inject
nodes in the current stream as well. So, be warned: there is no real need to
complicate your code with this kind of hackery if some high level subsystem
provides the functionality already.

\stopsection

\stopchapter

\stopcomponent
293