luametatex-tex.tex /size: 82 Kb    last modification: 2023-12-21 09:43
1% language=us runpath=texruns:manuals/luametatex
2
3\environment luametatex-style
4
5\startcomponent luametatex-tex
6
7\startchapter[reference=tex,title={The \TEX\ related libraries}]
8
9\startsection[title={The \type {lua} library}][library=lua]
10
11\startsubsection[title={Version information}]
12
13\topicindex{libraries+\type{lua}}
14\topicindex{getversion}
15\topicindex{getstartupfile}
16
17\libindex{getversion}
18\libindex{getstartupfile}
19
The version of the \LUA\ interpreter in use (currently {\tttf \cldcontext
{lua.getversion()}}) can be queried with:
22
23\starttyping
24<string> v = lua.getversion()
25\stoptyping
26
The name of the startup file, if one is used, is returned by:
28
29\starttyping
30<string> s = lua.getstartupfile()
31\stoptyping
32
33For this document the reported value is:
34
35\blank {\ttx \cldcontext {lua.getstartupfile()}} \blank
36
37\stopsubsection
38
39\startsubsection[title={Table allocators}]
40
41\topicindex{tables}
42
43\libindex{newtable}
44\libindex{newindex}
45
Sometimes performance (and memory usage) can benefit a little from
preallocating a table with \type {newtable}:
48
49\starttyping
50<table> t = lua.newtable(100,5000)
51\stoptyping
52
This preallocates 100 hash entries and 5000 index entries. The \type
{newindex} function creates an indexed table with preset values:
55
56\starttyping
57<table> t = lua.newindex(2500,true)
58\stoptyping
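
A minimal sketch of how these allocators might be used, assuming the argument
order described above (hash entries first, index entries second) and assuming
that the second argument of \type {newindex} is the preset value. The sizes are
arbitrary:

\starttyping
local n = 5000

-- reserve room for n indexed entries and no hash entries
local squares = lua.newtable(0,n)
for i=1,n do
    squares[i] = i * i
end

-- an indexed table with n slots that are assumed to start out as true
local enabled = lua.newindex(n,true)
enabled[10] = false
\stoptyping

Preallocation only influences how memory is reserved. The resulting tables
behave like any other \LUA\ table.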
59
60\stopsubsection
61
62\startsubsection[title={Bytecode registers}]
63
64\topicindex{bytecodes}
65\topicindex{registers+bytecodes}
66
67\libindex{bytecode}
68\libindex{setbytecode}
69\libindex{getbytecode}
70\libindex{callbytecode}
71
72\LUA\ registers can be used to store \LUA\ code chunks. The accepted values for
73assignments are functions and \type {nil}. Likewise, the retrieved value is
74either a function or \type {nil}.
75
76\starttyping
77lua.bytecode[<number> n] = <function> f
78<function> f = lua.bytecode[<number> n] % -- f()
79\stoptyping
80
The contents of the \type {lua.bytecode} array are stored inside the format file
as actual \LUA\ bytecode, so it can also be used to preload \LUA\ code. The
function must not contain any upvalues. The associated function calls are:
84
85\startfunctioncall
86lua.setbytecode(<number> n, <function> f)
87<function> f = lua.getbytecode(<number> n)
88\stopfunctioncall
89
90Note: Since a \LUA\ file loaded using \type {loadfile(filename)} is essentially
91an anonymous function, a complete file can be stored in a bytecode register like
92this:
93
94\startfunctioncall
95lua.setbytecode(n,loadfile(filename))
96\stopfunctioncall
97
98Now all definitions (functions, variables) contained in the file can be
99created by executing this bytecode register:
100
101\startfunctioncall
102lua.callbytecode(n)
103\stopfunctioncall
104
105Note that the path of the file is stored in the \LUA\ bytecode to be used in
106stack backtraces and therefore dumped into the format file if the above code is
107used in \INITEX. If it contains private information, i.e. the user name, this
108information is then contained in the format file as well. This should be kept in
109mind when preloading files into a bytecode register in \INITEX.
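
As an illustration, the following sketch stores a small function in a register
and retrieves it again. The register number 5 is an arbitrary choice, and the
function deliberately refers to no variables from an enclosing scope:

\starttyping
-- store a function that has no upvalues
lua.setbytecode(5, function() return "hello from register 5" end)

-- retrieve it again; the result is a function or nil
local f = lua.getbytecode(5)
if f then
    print(f())
end

-- assigning nil clears the register
lua.bytecode[5] = nil
\stoptyping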
110
111\stopsubsection
112
113\startsubsection[title={Introspection}]
114
115\libindex{getstacktop}
116\libindex{getruntime}
117\libindex{getcurrenttime}
118\libindex{getpreciseticks}
119\libindex{getpreciseseconds}
120
The \type {getstacktop} function returns a number indicating how full the \LUA\
stack is. It is mainly useful as a check point when you suspect that some
mechanism is going haywire.
124
There are four time related helpers. The \type {getruntime} function returns the
time passed since startup and \type {getcurrenttime} returns the current time.
The \type {getpreciseticks} function returns a number that can be used later,
after a similar call, to compute a difference. The \type {getpreciseseconds}
function takes such a tick delta as argument and returns the corresponding
number of seconds. Ticks can differ per operating system, so always create a
reference first and then work with deltas relative to that reference.
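
A sketch of the reference plus delta approach, here used to time a simple loop.
The loop is only a placeholder for real work:

\starttyping
local start = lua.getpreciseticks()   -- the reference

local sum = 0
for i=1,1000000 do
    sum = sum + i
end

local delta   = lua.getpreciseticks() - start
local seconds = lua.getpreciseseconds(delta)

print(string.format("the loop took %.6f seconds",seconds))
\stoptyping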
132
133\stopsubsection
134
135\stopsection
136
137\startsection[title={The \type {status} library}][library=status]
138
139\topicindex{libraries+\type{status}}
140
141\libindex{list}
142\libindex{resetmessages}
143\libindex{setexitcode}
144
This library contains a number of run|-|time configuration items that you may
find useful in message reporting, as well as a function that returns all of the
names and values as a table.
148
149\startfunctioncall
150<table> info = status.list()
151\stopfunctioncall
152
The keys in the table are the known items and the values are their current
values. There are toplevel items and items that are tables with subentries. The
current list is:
155
156\startluacode
157    local list = status.list()
158
159    context.starttabulate { "|Tw(10em)|Tp|" }
160    context.DB()
161    context("toplevel statistics")
162    context.BC()
163    context.NC()
164    context.NR()
165    context.TB()
166    for k, v in table.sortedhash(list) do
167        if type(v) ~= "table" then
168            context.NC()
169            context(k)
170            context.NC()
171            context(tostring(v))
172            context.NC()
173            context.NR()
174        end
175    end
176    context.LL()
177    context.stoptabulate()
178
179    for k, v in table.sortedhash(list) do
180        if type(v) == "table" then
181            context.starttabulate { "|Tw(10em)|Tp|" }
182            context.DB()
183            context(k ..".*")
184            context.BC()
185            context.NC()
186            context.NR()
187            context.TB()
188            for k, v in table.sortedhash(v) do
189                context.NC()
190                context(k)
191                context.NC()
192                context(v == "" and "unset" or tostring(v))
193                context.NC()
194                context.NR()
195            end
196            context.LL()
197            context.stoptabulate()
198        end
199    end
200\stopluacode
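
The same traversal can be done with plain \LUA, for instance to write the values
to the console. This sketch only relies on \type {status.list} returning a
regular table with possibly nested tables, as described above. The order in
which \type {pairs} visits the entries is not defined:

\starttyping
local info = status.list()

for key, value in pairs(info) do
    if type(value) == "table" then
        -- a subtable with its own entries
        for subkey, subvalue in pairs(value) do
            print(key .. "." .. subkey, tostring(subvalue))
        end
    else
        print(key, tostring(value))
    end
end
\stoptyping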
201
202There are also getters for the subtables. The whole repertoire of functions in
203the \type {status} table is: {\tttf \cldcontext {table . concat ( table .
204sortedkeys (status), ", ")}}. The error and warning messages can be wiped with
205the \type {resetmessages} function. The states in subtables relate to memory
206management and are mostly there for development purposes.
207
The \type {getconstants} query gives back a table with all kinds of internal
quantities and again these are only relevant for diagnostic and development
purposes. Many are good old \TEX\ constants that are described in the original
documentation of the source but some are definitely \LUAMETATEX\ specific.
212
213\startluacode
214    context.starttabulate { "|Tw(15em)|Tp|" }
215    context.DB()
216    context("constants.*")
217    context.BC()
218    context.NC()
219    context.NR()
220    context.TB()
221    for k, v in table.sortedhash(status.getconstants()) do
222        if type(v) ~= "table" then
223            context.NC()
224            context(k)
225            context.NC()
226            context(tostring(v))
227            context.NC()
228            context.NR()
229        end
230    end
231    context.LL()
232    context.stoptabulate()
233\stopluacode
234
235Most variables speak for themselves, some are more obscure. For instance the
236\type {run_state} variable indicates what the engine is doing:
237
238\starttabulate[|l|l|p|]
239\DB n \NC meaning      \NC explanation \NC \NR
240\TB
\NC 0 \NC initializing \NC \type {--ini} mode \NC \NR
242\NC 1 \NC updating     \NC relates to \prm {overloadmode} \NC \NR
243\NC 2 \NC production   \NC a regular (format driven) run \NC \NR
244\LL
245\stoptabulate
246
247\stopsection
248
249\startsection[title={The \type {tex} library}][library=tex]
250
251\startsubsection[title={Introduction}]
252
253\topicindex{libraries+\type{tex}}
254
255The \type {tex} table contains a large list of virtual internal \TEX\
256parameters that are partially writable.
257
258The designation \quote {virtual} means that these items are not properly defined
259in \LUA, but are only front\-ends that are handled by a metatable that operates
260on the actual \TEX\ values. As a result, most of the \LUA\ table operators (like
261\type {pairs} and \type {#}) do not work on such items.
262
At the moment, it is possible to access almost every parameter that you can use
after \prm {the}, that is a single token, or that is sort of special in \TEX.
This excludes parameters that need extra arguments, like \type {\the\scriptfont}.
The subset comprising simple integer and dimension registers is writable as well
as readable (like \prm {tracingcommands} and \prm {parindent}).
268
269\stopsubsection
270
271\startsubsection[title={Internal parameter values, \type {set} and \type {get}}]
272
273\topicindex{parameters+internal}
274
275\libindex{set}
276\libindex{get}
277
278For all the parameters in this section, it is possible to access them directly
279using their names as index in the \type {tex} table, or by using one of the
280functions \type {tex.get} and \type {tex.set}.
281
282The exact parameters and return values differ depending on the actual parameter,
283and so does whether \type {tex.set} has any effect. For the parameters that {\em
284can} be set, it is possible to use \type {global} as the first argument to \type
285{tex.set}; this makes the assignment global instead of local.
286
287\startfunctioncall
288tex.set (["global",] <string> n, ...)
289... = tex.get (<string> n)
290\stopfunctioncall
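
For example, using the two parameters mentioned in the introduction, the
following sketch shows both styles of access. The assigned values are arbitrary:

\starttyping
-- read a dimension parameter, the result is in scaled points
local indent = tex.get("parindent")

-- a local assignment through the function interface
tex.set("tracingcommands",1)

-- a global assignment
tex.set("global","parindent",0)

-- the same parameters accessed as keys in the tex table
tex.tracingcommands = 0
print(tex.parindent, indent)
\stoptyping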
291
Glue is kind of special because there are five values involved. By default
\type {tex.get} returns a \nod {glue_spec} node. This node is a copy of the
internal value, so you are responsible for freeing it at the \LUA\ end. When you
pass \type {false} as last argument you get only the width of the glue, and when
you pass \type {true} you get all five values. When you set a glue quantity you
can either pass a \nod {glue_spec} or up to five numbers.
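
A sketch of these variants, using \prm {parskip} as an example of a glue
parameter. The numbers are in scaled points and arbitrary, and the order of the
five values is assumed to be width, stretch, shrink, stretch order and shrink
order:

\starttyping
local sp = 65536 -- one point in scaled points

-- only the width
local width = tex.get("parskip",false)

-- all five values
local w, stretch, shrink, stretchorder, shrinkorder = tex.get("parskip",true)

-- set with numbers: 6pt plus 2pt minus 1pt
tex.set("parskip",6*sp,2*sp,1*sp)
\stoptyping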
298
299Beware: as with regular \LUA\ tables you can add values to the \type {tex} table.
300So, the following is valid:
301
302\starttyping
303tex.foo = 123
304\stoptyping
305
When you access a \TEX\ parameter a lookup takes place. For read|-|only variables
that means that you will get something back, but when you set them you create a
new entry in the table, thereby making the original invisible.
309
310There are a few special cases that we make an exception for: \type {prevdepth},
311\type {prevgraf} and \type {spacefactor}. These normally are accessed via the
312\type {tex.nest} table:
313
314\starttyping
315tex.nest[tex.nest.ptr].prevdepth   = p
316tex.nest[tex.nest.ptr].spacefactor = s
317\stoptyping
318
319However, the following also works:
320
321\starttyping
322tex.prevdepth   = p
323tex.spacefactor = s
324\stoptyping
325
Keep in mind that when you mess with node lists directly at the \LUA\ end you
might need to update the top of the nesting stack's \type {prevdepth} explicitly
as there is no way \LUATEX\ can guess your intentions. By using the accessors in
the \type {tex} table, you get and set the values at the top of the nesting
stack.
331
332\subsubsection{Integer parameters}
333
334\topicindex{parameters+integer}
335
336The integer parameters accept and return \LUA\ integers. In some cases the values
337are checked, trigger other settings or result in some immediate change of
338behaviour: \ctxlua {document.filteredprimitives ("internal_int")}.
339
340Some integer parameters are read only, because they are actually referring not to
341some internal integer register but to an engine property: \typ {deadcycles},
342\typ {insertpenalties}, \typ {parshape}, \typ {interlinepenalties}, \typ
343{clubpenalties}, \typ {widowpenalties}, \typ {displaywidowpenalties}, \typ
344{prevgraf} and \typ {spacefactor}.
345
346\subsubsection{Dimension parameters}
347
348\topicindex{parameters+dimension}
349
350The dimension parameters accept \LUA\ numbers (signifying scaled points) or
351strings (with included dimension). The result is always a number in scaled
352points. These are read|-|write: \ctxlua {document.filteredprimitives
353("internal_dimen")}.
354
355These are read|-|only: \typ {pagedepth}, \typ {pagefilllstretch}, \typ
356{pagefillstretch}, \typ {pagefilstretch}, \typ {pagegoal}, \typ {pageshrink},
357\typ {pagestretch} and \typ {pagetotal}.
358
359\subsubsection{Direction parameters}
360
361\topicindex{parameters+direction}
362
363The direction states can be queried with: \typ {gettextdir}, \typ {getlinedir},
364\typ {getmathdir} and \typ {getpardir}. You can set them with \typ
365{settextdir}, \typ {setlinedir}, \typ {setmathdir} and \typ {setpardir},
366commands that accept a number. You can also set these parameters as table
367key|/|values: \typ {textdirection}, \typ {linedirection}, \typ {mathdirection}
368and \typ {pardirection}, so the next code sets the text direction to \typ
369{r2l}:
370
371\starttyping
372tex.textdirection = 1
373\stoptyping
374
375\subsubsection{Glue parameters}
376
377\topicindex{parameters+glue}
378
379The internal glue parameters accept and return a userdata object that represents
380a \nod {glue_spec} node: \ctxlua {document.filteredprimitives ("internal_glue")}.
381
382\subsubsection{Muglue parameters}
383
384\topicindex{parameters+muglue}
385
All muglue parameters are to be used read|-|only and return a \LUA\ string:
\ctxlua {document.filteredprimitives ("internal_mu_glue")}.
388
389\subsubsection{Tokenlist parameters}
390
391\topicindex{parameters+tokens}
392
393The tokenlist parameters accept and return \LUA\ strings. \LUA\ strings are
394converted to and from token lists using \prm {the} \prm {toks} style expansion:
395all category codes are either space (10) or other (12). It follows that assigning
396to some of these, like \quote {tex.output}, is actually useless, but it feels bad
397to make exceptions in view of a coming extension that will accept full|-|blown
398token strings. Here is the lot: \ctxlua {document.filteredprimitives
399("internal_toks")}.
400
401\stopsubsection
402
403\startsubsection[title={Convert commands}]
404
405\topicindex{convert commands}
406
407All \quote {convert} commands are read|-|only and return a \LUA\ string. The
408supported commands at this moment are: \ctxlua {document.filteredprimitives
409("convert")}. You will get an error message if an operation is not (yet)
permitted. Some take a string or number argument, just as at the \TEX\ end
some extra input is expected.
412
413\stopsubsection
414
415\startsubsection[title={Item commands}]
416
417\topicindex{last items}
418
419All so called \quote {item} commands are read|-|only and return a number. The
420complete list of these commands is: \ctxlua {document.filteredprimitives
("some_item")}. Not all are currently supported but eventually that might be the
422case. Like the lists in previous sections, there are differences between \LUATEX\
423and \LUAMETATEX, where some commands are organized differently in order to
424provide a consistent \LUA\ interface.
425
426\stopsubsection
427
428\startsubsection[title={Accessing registers: \type {set*}, \type {get*} and \type {is*}}]
429
430\topicindex{attributes}
431\topicindex{registers}
432
433\libindex{attribute}  \libindex{setattribute}  \libindex{getattribute}  \libindex{isattribute}
434\libindex{count}      \libindex{setcount}      \libindex{getcount}      \libindex{iscount}
435\libindex{dimen}      \libindex{setdimen}      \libindex{getdimen}      \libindex{isdimen}
436\libindex{skip}       \libindex{setskip}       \libindex{getskip}       \libindex{isskip}
437\libindex{muskip}     \libindex{setmuskip}     \libindex{getmuskip}     \libindex{ismuskip}
438\libindex{glue}       \libindex{setglue}       \libindex{getglue}       \libindex{isglue}
439\libindex{muglue}     \libindex{setmuglue}     \libindex{getmuglue}     \libindex{ismuglue}
440\libindex{toks}       \libindex{settoks}       \libindex{gettoks}       \libindex{istoks}
441\libindex{box}        \libindex{setbox}        \libindex{getbox}        \libindex{isbox}
442
443\libindex{scantoks}
444
445\libindex{getmark}
446
\TEX's attributes (\prm {attribute}), counters (\prm {count}), dimensions (\prm
{dimen}), skips (\prm {skip}, \prm {muskip}) and token (\prm {toks}) registers
can be accessed and written to using the following virtual sub|-|tables of the
\type {tex} table:
451
452\startthreecolumns
453\starttyping
454tex.attribute
455tex.count
456tex.dimen
457tex.skip
458tex.glue
459tex.muskip
460tex.muglue
461tex.toks
462\stoptyping
463\stopthreecolumns
464
465It is possible to use the names of relevant \prm {attributedef}, \prm {countdef},
466\prm {dimendef}, \prm {skipdef}, or \prm {toksdef} control sequences as indices
467to these tables:
468
469\starttyping
470tex.count.scratchcounter = 0
471enormous = tex.dimen['maxdimen']
472\stoptyping
473
474In this case, \LUATEX\ looks up the value for you on the fly. You have to use a
475valid \prm {countdef} (or \prm {attributedef}, or \prm {dimendef}, or \prm
476{skipdef}, or \prm {toksdef}), anything else will generate an error (the intent
477is to eventually also allow \type {<chardef tokens>} and even macros that expand
478into a number).
479
480\startitemize
481
482    \startitem
483        The count registers accept and return \LUA\ numbers.
484    \stopitem
485
486    \startitem
487        The dimension registers accept \LUA\ numbers (in scaled points) or
488        strings (with an included absolute dimension; \type {em} and \type {ex}
489        and \type {px} are forbidden). The result is always a number in scaled
490        points.
491    \stopitem
492
493    \startitem
494        The token registers accept and return \LUA\ strings. \LUA\ strings are
495        converted to and from token lists using \prm {the} \prm {toks} style
496        expansion: all category codes are either space (10) or other (12).
497    \stopitem
498
499    \startitem
500        The skip registers accept and return \nod {glue_spec} userdata node
501        objects (see the description of the node interface elsewhere in this
502        manual).
503    \stopitem
504
505    \startitem
        The glue registers are the same registers as the skip registers, but
        their values are passed as individual numbers instead of userdata.
508    \stopitem
509
510    \startitem
511        Like the counts, the attribute registers accept and return \LUA\ numbers.
512    \stopitem
513
514\stopitemize
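
The following sketch illustrates these value types. Register 0 is used as a
scratch register here, and \type {scratchcounter} is assumed to be defined with
\prm {countdef}, as in the example above:

\starttyping
tex.dimen[0] = "10pt"             -- a string with an absolute dimension
local width  = tex.dimen[0]       -- a number: 655360 (scaled points)

tex.count.scratchcounter = 3      -- a number in, a number out
local n = tex.count.scratchcounter

tex.toks[0] = "some text"         -- a string in, a string out
local s = tex.toks[0]
\stoptyping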
515
516As an alternative to array addressing, there are also accessor functions defined
517for all cases, for example, here is the set of possibilities for \prm {skip}
518registers:
519
520\startfunctioncall
tex.setskip (["global",] <number> n, <node> s)
tex.setskip (["global",] <string> cs, <node> s)
<node> s = tex.getskip (<number> n)
<node> s = tex.getskip (<string> cs)
525\stopfunctioncall
526
527We have similar setters for \type {count}, \type {dimen}, \type {muskip}, and
528\type {toks}. Counters and dimen are represented by numbers, skips and muskips by
529nodes, and toks by strings.
530
Again the glue variants are not using the \nod {glue_spec} userdata nodes. The
\type {setglue} function accepts up to five arguments: width, stretch, shrink,
stretch order and shrink order. Non|-|numeric values set the property to zero.
The \type {getglue} function reports all five properties, unless the second
argument is \type {false} in which case only the width is returned.
536
Here is an example that combines a getter, a checker and a setter:
538
539\startfunctioncall
540local d = tex.getdimen("foo")
541if tex.isdimen("oof") then
542    tex.setdimen("oof",d)
543end
544\stopfunctioncall
545
546There are six extra skip (glue) related helpers:
547
548\startfunctioncall
tex.setglue (["global",] <number> n,
    width, stretch, shrink, stretch_order, shrink_order)
tex.setglue (["global",] <string> s,
    width, stretch, shrink, stretch_order, shrink_order)
width, stretch, shrink, stretch_order, shrink_order =
    tex.getglue (<number> n)
width, stretch, shrink, stretch_order, shrink_order =
    tex.getglue (<string> s)
557\stopfunctioncall
558
559The other two are \type {tex.setmuglue} and \type {tex.getmuglue}.
560
561There are such helpers for \type {dimen}, \type {count}, \type {skip}, \type
562{muskip}, \type {box} and \type {attribute} registers but the glue ones
563are special because they have to deal with more properties.
564
As with the general \type {get} and \type {set} functions discussed before, for
the skip registers \type {getskip} returns a node and \type {getglue} returns
numbers, while \type {setskip} accepts a node and \type {setglue} expects up to
five numbers. Again, when you pass \type {false} as second argument to \type
{getglue} you only get the width returned. The same is true for the \type {mu}
variants \type {getmuskip}, \type {setmuskip}, \type {getmuglue} and \type
{setmuglue}.
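
A sketch of the number based glue interface, with skip register 0 used as a
scratch register and arbitrary values:

\starttyping
local sp = 65536 -- one point in scaled points

-- 10pt plus 2pt minus 1pt, the two orders are left out
tex.setglue(0,10*sp,2*sp,1*sp)

-- all five properties
local width, stretch, shrink, stretchorder, shrinkorder = tex.getglue(0)

-- only the width
local onlywidth = tex.getglue(0,false)
\stoptyping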
571
For token registers we have an alternative where a catcode table is specified:
573
574\startfunctioncall
tex.scantoks(0,3,"$e=mc^2$")
tex.scantoks("global",0,3,[[$\int\limits^1_2$]])
577\stopfunctioncall
578
579In the function-based interface, it is possible to define values globally by
580using the string \type {global} as the first function argument.
581
582There is a dedicated getter for marks: \type {getmark} that takes two arguments.
583The first argument is one of \type {top}, \type {bottom}, \type {first}, \type
584{splitbottom} or \type {splitfirst}, and the second argument is a marks class
585number. When no arguments are given the current maximum number of classes is
586returned.
587
588When \type {tex.gettoks} gets an extra argument \type {true} it will return a
589table with userdata tokens.
590
591\stopsubsection
592
593\startsubsection[title={Character code registers: \type {[get|set]*code[s]}}]
594
595\topicindex{characters+codes}
596
597\libindex{lccode}    \libindex{setlccode}    \libindex{getlccode}
598\libindex{uccode}    \libindex{setuccode}    \libindex{getuccode}
599\libindex{sfcode}    \libindex{setsfcode}    \libindex{getsfcode}
600\libindex{catcode}   \libindex{setcatcode}   \libindex{getcatcode}
601\libindex{mathcode}  \libindex{setmathcode}  \libindex{getmathcode}
602\libindex{delcode}   \libindex{setdelcode}   \libindex{getdelcode}
603
604\libindex{setdelcodes}   \libindex{getdelcodes}
605\libindex{setmathcodes}  \libindex{getmathcodes}
606
607\TEX's character code tables (\prm {lccode}, \prm {uccode}, \prm {sfcode}, \prm
608{catcode}, \prm {mathcode}, \prm {delcode}) can be accessed and written to using
six virtual subtables of the \type {tex} table:
610
611\startthreecolumns
612\starttyping
613tex.lccode
614tex.uccode
615tex.sfcode
616tex.catcode
617tex.mathcode
618tex.delcode
619\stoptyping
620\stopthreecolumns
621
622The function call interfaces are roughly as above, but there are a few twists.
623\type {sfcode}s are the simple ones:
624
625\startfunctioncall
626tex.setsfcode (["global",] <number> n, <number> s)
627<number> s = tex.getsfcode (<number> n)
628\stopfunctioncall
629
630The function call interface for \type {lccode} and \type {uccode} additionally
631allows you to set the associated sibling at the same time:
632
633\startfunctioncall
tex.setlccode (["global",] <number> n, <number> lc)
tex.setlccode (["global",] <number> n, <number> lc, <number> uc)
<number> lc = tex.getlccode (<number> n)
tex.setuccode (["global",] <number> n, <number> uc)
tex.setuccode (["global",] <number> n, <number> uc, <number> lc)
<number> uc = tex.getuccode (<number> n)
640\stopfunctioncall
641
642The function call interface for \type {catcode} also allows you to specify a
643category table to use on assignment or on query (default in both cases is the
644current one):
645
646\startfunctioncall
tex.setcatcode (["global",] <number> n, <number> c)
tex.setcatcode (["global",] <number> cattable, <number> n, <number> c)
<number> c = tex.getcatcode (<number> n)
<number> c = tex.getcatcode (<number> cattable, <number> n)
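
A sketch that combines these calls. The character codes are obtained with \type
{string.byte} and the assigned values are only meant as an illustration:

\starttyping
local A, a = string.byte("A"), string.byte("a")

-- set the lowercase code of "A" and its uppercase sibling in one go
tex.setlccode(A,a,A)

-- query the space factor code of the period
local sf = tex.getsfcode(string.byte("."))

-- make the tilde an active character (category code 13), globally
tex.setcatcode("global",string.byte("~"),13)

-- query the category code of the percent sign in the current table
local cc = tex.getcatcode(string.byte("%"))
\stoptyping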
651\stopfunctioncall
652
653The interfaces for \type {delcode} and \type {mathcode} use small array tables to
654set and retrieve values:
655
656\startfunctioncall
tex.setmathcode (["global",] <number> n, <table> mval)
<table> mval = tex.getmathcode (<number> n)
tex.setdelcode (["global",] <number> n, <table> dval)
<table> dval = tex.getdelcode (<number> n)
661\stopfunctioncall
662
663Where the table for \type {mathcode} is an array of 3 numbers, like this:
664
665\starttyping
666{
667    <number> class,
668    <number> family,
669    <number> character
670}
671\stoptyping
672
673And the table for \type {delcode} is an array with 4 numbers, like this:
674
675\starttyping
676{
677    <number> small_fam,
678    <number> small_char,
679    <number> large_fam,
680    <number> large_char
681}
682\stoptyping
683
684You can also avoid the table:
685
686\startfunctioncall
tex.setmathcode (["global",] <number> n, <number> class,
    <number> family, <number> character)
class, family, character =
    tex.getmathcodes (<number> n)
tex.setdelcode (["global",] <number> n, <number> smallfam,
    <number> smallchar, <number> largefam, <number> largechar)
smallfam, smallchar, largefam, largechar =
    tex.getdelcodes (<number> n)
695\stopfunctioncall
696
697Normally, the third and fourth values in a delimiter code assignment will be zero
698according to \prm {Udelcode} usage, but the returned table can have values there
699(if the delimiter code was set using \prm {delcode}, for example). Unset \type
700{delcode}'s can be recognized because \type {dval[1]} is $-1$.
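
For instance, this sketch assigns a math code to the plus sign, once with a
table and once with separate numbers, and then reads it back. Class 2 is the
binary class in traditional \TEX\ and family 0 is an arbitrary choice:

\starttyping
local plus = string.byte("+")

tex.setmathcode(plus, { 2, 0, plus })   -- class, family, character
tex.setmathcode(plus, 2, 0, plus)       -- the same without a table

local class, family, character = tex.getmathcodes(plus)

-- an unset delimiter code is reported with -1 in the first slot
local dval = tex.getdelcode(plus)
if dval[1] == -1 then
    print("no delimiter code set for +")
end
\stoptyping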
701
702\stopsubsection
703
704\startsubsection[title={Box registers: \type {[get|set]box}}]
705
706\topicindex{registers}
707\topicindex{boxes}
708
709\libindex{box}
710\libindex{setbox}  \libindex{getbox}
711
712It is possible to set and query actual boxes, coming for instance from \prm
713{hbox}, \prm {vbox} or \prm {vtop}, using the node interface as defined in the
714\type {node} library:
715
716\starttyping
717tex.box
718\stoptyping
719
720for array access, or
721
722\starttyping
723tex.setbox(["global",] <number> n, <node> s)
724tex.setbox(["global",] <string> cs, <node> s)
725<node> n = tex.getbox(<number> n)
726<node> n = tex.getbox(<string> cs)
727\stoptyping
728
729for function|-|based access. In the function-based interface, it is possible to
730define values globally by using the string \type {global} as the first function
731argument.
732
733Be warned that an assignment like
734
735\starttyping
736tex.box[0] = tex.box[2]
737\stoptyping
738
does not copy the node list, it just duplicates a node pointer. If \type {\box2}
is cleared by \TEX\ commands later on, the contents of \type {\box0} become
invalid as well. To prevent this from happening, always use \type
{node.copy_list} unless you are assigning to a temporary variable:
743
744\starttyping
745tex.box[0] = node.copy_list(tex.box[2])
746\stoptyping
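
Because the registers hold regular nodes, the fields of the node interface can
be used on them. A sketch, assuming that box register 2 has been set at the
\TEX\ end and that a void register is reported as \type {nil}:

\starttyping
local b = tex.getbox(2)

if b then
    -- the dimensions are in scaled points
    print(b.width, b.height, b.depth)
    -- keep an independent copy in box 0
    tex.setbox(0,node.copy_list(b))
end
\stoptyping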
747
748\stopsubsection
749
750\startsubsection[title={\type {triggerbuildpage}}]
751
752\topicindex{pages}
753
754\libindex{triggerbuildpage}
755
You should not expect too much from the \type {triggerbuildpage} helper because
often \TEX\ doesn't do much if it thinks nothing has to be done, but it might be
useful for some applications. It does what its name says: it calls the internal
function that builds a page, given that there is something to build.
760
761\stopsubsection
762
763\startsubsection[title={\type {splitbox}}]
764
765\topicindex{boxes+split}
766
767\libindex{splitbox}
768
769You can split a box:
770
771\starttyping
772local vlist = tex.splitbox(n,height,mode)
773\stoptyping
774
775The remainder is kept in the original box and a packaged vlist is returned. This
776operation is comparable to the \prm {vsplit} operation. The mode can be \type
777{additional} or \type {exactly} and concerns the split off box.
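
A sketch of a split, assuming that box register 0 holds a vertical box, that
register 2 is free, and that the mode is passed as a string. The height of
100pt is arbitrary:

\starttyping
local sp     = 65536      -- one point in scaled points
local height = 100 * sp

local top = tex.splitbox(0,height,"exactly")

if top then
    -- store the split off part; the remainder is still in box 0
    tex.setbox(2,top)
end
\stoptyping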
778
779\stopsubsection
780
781\startsubsection[title={Accessing math parameters: \type {[get|set]math}}]
782
783\topicindex{math+parameters}
784\topicindex{parameters+math}
785
786\libindex{setmath}
787\libindex{getmath}
788
789It is possible to set and query the internal math parameters using:
790
791\startfunctioncall
tex.setmath(["global",] <string> n, <string> t, <number> v)
<number> v = tex.getmath(<string> n, <string> t)
794\stopfunctioncall
795
796As before an optional first parameter \type {global} indicates a global
797assignment.
798
799The first string is the parameter name minus the leading \quote {Umath}, and the
800second string is the style name minus the trailing \quote {style}. Just to be
801complete, the values for the math parameter name are:
802
803\starttyping
804quad                axis                operatorsize
805overbarkern         overbarrule         overbarvgap
806underbarkern        underbarrule        underbarvgap
807radicalkern         radicalrule         radicalvgap
808radicaldegreebefore radicaldegreeafter  radicaldegreeraise
809stackvgap           stacknumup          stackdenomdown
810fractionrule        fractionnumvgap     fractionnumup
811fractiondenomvgap   fractiondenomdown   fractiondelsize
812limitabovevgap      limitabovebgap      limitabovekern
813limitbelowvgap      limitbelowbgap      limitbelowkern
814underdelimitervgap  underdelimiterbgap
815overdelimitervgap   overdelimiterbgap
816subshiftdrop        supshiftdrop        subshiftdown
817subsupshiftdown     subtopmax           supshiftup
818supbottommin        supsubbottommax     subsupvgap
819spaceafterscript    connectoroverlapmin
820ordordspacing       ordopspacing        ordbinspacing     ordrelspacing
821ordopenspacing      ordclosespacing     ordpunctspacing   ordinnerspacing
822opordspacing        opopspacing         opbinspacing      oprelspacing
823opopenspacing       opclosespacing      oppunctspacing    opinnerspacing
824binordspacing       binopspacing        binbinspacing     binrelspacing
825binopenspacing      binclosespacing     binpunctspacing   bininnerspacing
826relordspacing       relopspacing        relbinspacing     relrelspacing
827relopenspacing      relclosespacing     relpunctspacing   relinnerspacing
828openordspacing      openopspacing       openbinspacing    openrelspacing
829openopenspacing     openclosespacing    openpunctspacing  openinnerspacing
830closeordspacing     closeopspacing      closebinspacing   closerelspacing
831closeopenspacing    closeclosespacing   closepunctspacing closeinnerspacing
832punctordspacing     punctopspacing      punctbinspacing   punctrelspacing
833punctopenspacing    punctclosespacing   punctpunctspacing punctinnerspacing
834innerordspacing     inneropspacing      innerbinspacing   innerrelspacing
835inneropenspacing    innerclosespacing   innerpunctspacing innerinnerspacing
836\stoptyping
837
838The values for the style parameter are:
839
840\starttyping
841display       crampeddisplay
842text          crampedtext
843script        crampedscript
844scriptscript  crampedscriptscript
845\stoptyping
846
847The value is either a number (representing a dimension or number) or a glue spec
848node representing a muskip for \type {ordordspacing} and similar spacing
849parameters.
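
A sketch with two of the dimension valued parameters from the list above. The
value 26214 is roughly 0.4pt in scaled points and only serves as an example:

\starttyping
-- the axis height in text style, in scaled points
local axis = tex.getmath("axis","text")

-- a local assignment
tex.setmath("overbarrule","display",26214)

-- a global assignment
tex.setmath("global","overbarrule","display",26214)
\stoptyping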
850
851\stopsubsection
852
853\startsubsection[title={Special list heads: \type {[get|set]list}}]
854
855\topicindex{lists}
856
857\libindex{lists}
858\libindex{setlist}
859\libindex{getlist}
860
861The virtual table \type {tex.lists} contains the set of internal registers that
862keep track of building page lists.
863
864\starttabulate[|l|p|]
865\DB field                    \BC explanation \NC \NR
866\TB
867\NC \type{pageinserthead}    \NC circular list of pending insertions \NC \NR
868\NC \type{contributehead}    \NC the recent contributions \NC \NR
869\NC \type{pagehead}          \NC the current page content \NC \NR
870%NC \type{temphead}          \NC \NC \NR
871\NC \type{holdhead}          \NC used for held-over items for next page \NC \NR
872\NC \type{postadjusthead}    \NC head of the (pending) post adjustments \NC \NR
873\NC \type{preadjusthead}     \NC head of the (pending) pre adjustments \NC \NR
874\NC \type{postmigratehead}   \NC head of the (pending) post migrations \NC \NR
875\NC \type{premigratehead}    \NC head of the (pending) pre migrations \NC \NR
876%NC \type{alignhead}         \NC \NC \NR
877\NC \type{pagediscardshead}  \NC head of the discarded items of a page break \NC \NR
878\NC \type{splitdiscardshead} \NC head of the discarded items in a vsplit \NC \NR
879\LL
880\stoptabulate
881
The getter and setter functions are \type {getlist} and \type {setlist}. You have
to be careful with what you set, as \TEX\ can have expectations with regard to
how a list is constructed or in what state it is.
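
As a minimal sketch of the getter, the following counts the nodes that are
currently pending in the contribution list. It assumes that \type {getlist}
accepts the field names from the table above as strings and that \type
{node.traverse} is available; the counting itself is only there for
illustration.

\starttyping
-- fetch one of the special list heads by name
local head = tex.getlist("contributehead")

-- count the nodes in that list (nothing is changed)
local count = 0
if head then
    for n in node.traverse(head) do
        count = count + 1
    end
end
print("pending contributions: " .. count)
\stoptyping

Reading a list this way is harmless; it is the setter that needs the care
mentioned above.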
885
886\stopsubsection
887
888\startsubsection[title={Semantic nest levels: \type {getnest} and \type {ptr}}]
889
890\topicindex{nesting}
891
892\libindex{nest}
893\libindex{ptr}
894%libindex{setnest} % only a message
895\libindex{getnest}
896
897The virtual table \type {nest} contains the currently active semantic nesting
898state. It has two main parts: a zero-based array of userdata for the semantic
899nest itself, and the numerical value \type {ptr}, which gives the highest
900available index. Neither the array items in \type {nest[]} nor \type {ptr} can be
901assigned to (as this would confuse the typesetting engine beyond repair), but you
902can assign to the individual values inside the array items, e.g.\ \type
903{tex.nest[tex.nest.ptr].prevdepth}.
904
905\type {tex.nest[tex.nest.ptr]} is the current nest state, \type {nest[0]} the
906outermost (main vertical list) level. The getter function is \type {getnest}. You
907can pass a number (which gives you a list), nothing or \type {top}, which returns
908the topmost list, or the string \type {ptr} which gives you the index of the
909topmost list.
910
911The known fields are:
912
913\starttabulate[|l|l|l|p|]
914\DB key                \BC type    \BC modes \BC explanation \NC \NR
915\TB
916\NC \type{mode}        \NC number  \NC all   \NC the meaning of these numbers depends on the engine
917                                                 and sometimes even the version; you can use \typ
918                                                 {tex.getmodevalues()} to get the mapping: positive
919                                                 values signal vertical, horizontal and math mode,
920                                                 while negative values indicate inner and inline
921                                                 variants \NC \NR
\NC \type{modeline}    \NC number  \NC all   \NC source input line where this mode was entered,
                                                 negative inside the output routine \NC \NR
924\NC \type{head}        \NC node    \NC all   \NC the head of the current list \NC \NR
925\NC \type{tail}        \NC node    \NC all   \NC the tail of the current list \NC \NR
926\NC \type{prevgraf}    \NC number  \NC vmode \NC number of lines in the previous paragraph \NC \NR
927\NC \type{prevdepth}   \NC number  \NC vmode \NC depth of the previous paragraph \NC \NR
928\NC \type{spacefactor} \NC number  \NC hmode \NC the current space factor \NC \NR
929\NC \type{direction}   \NC node    \NC hmode \NC stack used for temporary storage by the line break algorithm \NC \NR
930\NC \type{noad}        \NC node    \NC mmode \NC used for temporary storage of a pending fraction numerator,
931                                                 for \prm {over} etc. \NC \NR
932\NC \type{delimiter}   \NC node    \NC mmode \NC used for temporary storage of the previous math delimiter,
933                                                 for \prm {middle} \NC \NR
934\NC \type{mathdir}     \NC boolean \NC mmode \NC true when during math processing the \prm {mathdirection} is not
935                                                 the same as the surrounding \prm {textdirection} \NC \NR
936\NC \type{mathstyle}   \NC number  \NC mmode \NC the current \prm {mathstyle} \NC \NR
937\LL
938\stoptabulate
939
When a second string argument is given to \type {getnest}, the value of the field
with that name is returned. Of course the level must be valid. When \type
{setnest} gets a third argument, that value is assigned to the field given as
second argument.
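
The different ways of calling the getter can be sketched as follows. The
variable names are arbitrary, and it is assumed here that the \type {mode} and
\type {modeline} fields are plain integers, as the table above suggests.

\starttyping
local ptr   = tex.getnest("ptr")  -- index of the topmost level
local top   = tex.getnest()       -- the topmost level itself
local outer = tex.getnest(0)      -- outermost (main vertical list) level

-- a second string argument selects one field of the given level
local mode = tex.getnest(ptr, "mode")

print("level " .. ptr .. ", mode " .. mode .. ", entered at line " .. top.modeline)
\stoptyping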
944
945\stopsubsection
946
947\startsubsection[reference=sec:luaprint,title={Print functions}]
948
949\topicindex{printing}
950
951The \type {tex} table also contains the three print functions that are the major
952interface from \LUA\ scripting to \TEX. The arguments to these three functions
953are all stored in an in|-|memory virtual file that is fed to the \TEX\ scanner as
954the result of the expansion of \prm {directlua}.
955
956The total amount of returnable text from a \prm {directlua} command is only
957limited by available system \RAM. However, each separate printed string has to
958fit completely in \TEX's input buffer. The result of using these functions from
959inside callbacks is undefined at the moment.
960
961\subsubsection{\type {print}}
962
963\libindex{print}
964
965\startfunctioncall
966tex.print(<string> s, ...)
967tex.print(<number> n, <string> s, ...)
968tex.print(<table> t)
969tex.print(<number> n, <table> t)
970\stopfunctioncall
971
972Each string argument is treated by \TEX\ as a separate input line. If there is a
973table argument instead of a list of strings, this has to be a consecutive array
974of strings to print (the first non-string value will stop the printing process).
975
976The optional parameter can be used to print the strings using the catcode regime
977defined by \prm {catcodetable}~\type {n}. If \type {n} is $-1$, the currently
978active catcode regime is used. If \type {n} is $-2$, the resulting catcodes are
979the result of \prm {the} \prm {toks}: all category codes are 12 (other) except for
980the space character, that has category code 10 (space). Otherwise, if \type {n}
981is not a valid catcode table, then it is ignored, and the currently active
982catcode regime is used instead.
983
984The very last string of the very last \type {tex.print} command in a \prm
985{directlua} will not have the \prm {endlinechar} appended, all others do.
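
A few typical calls, as a sketch; the printed strings are arbitrary:

\starttyping
-- two strings, so two input lines
tex.print("First line.", "Second line.")

-- the same, but passed as one consecutive array of strings
tex.print({ "First line.", "Second line." })

-- a leading number selects the catcode regime: -1 is the current one
tex.print(-1, "\\par")
\stoptyping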
986
987\subsubsection{\type {sprint}}
988
989\libindex{sprint}
990
991\startfunctioncall
992tex.sprint(<string> s, ...)
993tex.sprint(<number> n, <string> s, ...)
994tex.sprint(<table> t)
995tex.sprint(<number> n, <table> t)
996\stopfunctioncall
997
998Each string argument is treated by \TEX\ as a special kind of input line that
999makes it suitable for use as a partial line input mechanism:
1000
1001\startitemize[packed]
1002\startitem
1003    \TEX\ does not switch to the \quote {new line} state, so that leading spaces
1004    are not ignored.
1005\stopitem
1006\startitem
1007    No \prm {endlinechar} is inserted.
1008\stopitem
1009\startitem
1010    Trailing spaces are not removed. Note that this does not prevent \TEX\ itself
1011    from eating spaces as result of interpreting the line. For example, in
1012
1013    \starttyping
1014    before\directlua{tex.sprint("\\relax")tex.sprint(" in between")}after
1015    \stoptyping
1016
1017    the space before \type {in between} will be gobbled as a result of the \quote
1018    {normal} scanning of \prm {relax}.
1019\stopitem
1020\stopitemize
1021
1022If there is a table argument instead of a list of strings, this has to be a
1023consecutive array of strings to print (the first non-string value will stop the
1024printing process).
1025
1026The optional argument sets the catcode regime, as with \type {tex.print}. This
1027influences the string arguments (or numbers turned into strings).
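
The partial line behaviour can be sketched like this. The strings are made up
for the example; the point is that the pieces are not separated by end of line
characters and that spaces inside the strings are kept.

\starttyping
-- three calls that together feed one box to TeX
tex.sprint("\\hbox{")
tex.sprint("left ", " right")
tex.sprint("}")

-- with an explicit catcode regime: -2 makes everything 'other' except spaces
tex.sprint(-2, "100% literal $ and & characters")
\stoptyping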
1028
1029Although this needs to be used with care, you can also pass token or node
1030userdata objects. These get injected into the stream. Tokens had best be valid
1031tokens, while nodes need to be around when they get injected. Therefore it is
1032important to realize the following:
1033
1034\startitemize
1035\startitem
1036    When you inject a token, you need to pass a valid token userdata object. This
1037    object will be collected by \LUA\ when it no longer is referenced. When it gets
1038    printed to \TEX\ the token itself gets copied so there is no interference with the
1039    \LUA\ garbage collection. You manage the object yourself. Because tokens are
1040    actually just numbers, there is no real extra overhead at the \TEX\ end.
1041\stopitem
1042\startitem
1043    When you inject a node, you need to pass a valid node userdata object. The
1044    node related to the object will not be collected by \LUA\ when it no longer
1045    is referenced. It lives on at the \TEX\ end in its own memory space. When it
1046    gets printed to \TEX\ the node reference is used assuming that node stays
1047    around. There is no \LUA\ garbage collection involved. Again, you manage the
1048    object yourself. The node itself is freed when \TEX\ is done with it.
1049\stopitem
1050\stopitemize
1051
1052If you consider the last remark you might realize that we have a problem when a
1053printed mix of strings, tokens and nodes is reused. Inside \TEX\ the sequence
1054becomes a linked list of input buffers. So, \type {"123"} or \type {"\foo{123}"}
1055gets read and parsed on the fly, while \typ {<token userdata>} already is
1056tokenized and effectively is a token list now. A \typ {<node userdata>} is also
1057tokenized into a token list but it has a reference to a real node. Normally this
1058goes fine. But now assume that you store the whole lot in a macro: in that case
1059the tokenized node can be flushed many times. But, after the first such flush the
1060node is used and its memory freed. You can prevent this by using copies which is
1061controlled by setting \prm {luacopyinputnodes} to a non|-|zero value. This is one
1062of these fuzzy areas you have to live with if you really mess with these low
1063level issues.
1064
1065\subsubsection{\type {tprint}}
1066
1067\libindex{tprint}
1068
1069\startfunctioncall
1070tex.tprint({<number> n, <string> s, ...}, {...})
1071\stopfunctioncall
1072
1073This function is basically a shortcut for repeated calls to \type
1074{tex.sprint(<number> n, <string> s, ...)}, once for each of the supplied argument
1075tables.
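
As an illustration of that equivalence, with arbitrary strings, the following
two fragments should feed the same input to \TEX:

\starttyping
-- one call with three argument tables ...
tex.tprint({ -1, "\\bgroup\\bf " }, { -2, "50% & more" }, { -1, "\\egroup" })

-- ... which amounts to three calls to sprint
tex.sprint(-1, "\\bgroup\\bf ")
tex.sprint(-2, "50% & more")
tex.sprint(-1, "\\egroup")
\stoptyping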
1076
1077\subsubsection{\type {cprint}}
1078
1079\libindex{cprint}
1080
This function takes a number indicating the catcode to be used, plus either a
table of strings or an argument list of strings that will be pushed into the
input stream.
1084
1085\startfunctioncall
tex.cprint( 1," 1: $&{\\foo}") tex.print("\\par") -- a lot of \bgroup s
tex.cprint( 2," 2: $&{\\foo}") tex.print("\\par") -- matching \egroup s
tex.cprint( 9," 9: $&{\\foo}") tex.print("\\par") -- all get ignored
tex.cprint(10,"10: $&{\\foo}") tex.print("\\par") -- all become spaces
tex.cprint(11,"11: $&{\\foo}") tex.print("\\par") -- letters
tex.cprint(12,"12: $&{\\foo}") tex.print("\\par") -- other characters
tex.cprint(14,"14: $&{\\foo}") tex.print("\\par") -- comment triggers
1093\stopfunctioncall
1094
1095% \subsubsection{\type {write}, \type {twrite}, \type {nwrite}}
1096\subsubsection{\type {write}}
1097
1098\libindex{write}
1099% \libindex{twrite}
1100% \libindex{nwrite}
1101
1102\startfunctioncall
1103tex.write(<string> s, ...)
1104tex.write(<table> t)
1105\stopfunctioncall
1106
1107Each string argument is treated by \TEX\ as a special kind of input line that
1108makes it suitable for use as a quick way to dump information:
1109
1110\startitemize
1111\item All catcodes on that line are either \quote{space} (for '~') or \quote
1112      {character} (for all others).
1113\item There is no \prm {endlinechar} appended.
1114\stopitemize
1115
1116If there is a table argument instead of a list of strings, this has to be a
1117consecutive array of strings to print (the first non-string value will stop the
1118printing process).
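
A small sketch of why this is convenient for dumping information: characters
that normally have a special meaning need no escaping at the \TEX\ end. The
string is just an example.

\starttyping
-- the backslash, braces, percent and dollar signs all arrive as
-- plain characters, only the spaces keep their usual meaning
tex.write("\\foo{bar} costs 50% of $10")
\stoptyping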
1119
1120% The functions \type {twrite} and \type {nwrite} can be used to write a token or
1121% node back to \TEX\, possibly intermixed with regular strings that will be
1122% tokenized. You have to make sure that you pass the right data because sometimes
1123% \TEX\ has expectations that need to be met.
1124
1125\stopsubsection
1126
1127\startsubsection[title={Helper functions}]
1128
1129\subsubsection{\type {round}}
1130
1131\topicindex {helpers}
1132
1133\libindex{round}
1134
1135\startfunctioncall
1136<number> n = tex.round(<number> o)
1137\stopfunctioncall
1138
1139Rounds \LUA\ number \type {o}, and returns a number that is in the range of a
1140valid \TEX\ register value. If the number starts out of range, it generates a
1141\quote {number too big} error as well.
1142
1143\subsubsection{\type {scale}}
1144
1145\libindex{scale}
1146
1147\startfunctioncall
1148<number> n = tex.scale(<number> o, <number> delta)
<table> n = tex.scale(<table> o, <number> delta)
1150\stopfunctioncall
1151
Multiplies the \LUA\ numbers \type {o} and \type {delta}, and returns a rounded
number that is in the range of a valid \TEX\ register value. In the table
version, it creates a copy of the table with all numeric top||level values scaled
in that manner. If the multiplied number(s) are out of range, it generates
\quote {number too big} error(s) as well.
1157
1158Note: the precision of the output of this function will depend on your computer's
1159architecture and operating system, so use with care! An interface to \LUATEX's
1160internal, 100\% portable scale function will be added at a later date.
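
The two variants of \type {scale}, together with \type {round}, can be sketched
as follows. The values and field names are arbitrary; no results are shown
because, as noted, the exact outcome may depend on the platform.

\starttyping
local onepoint = tex.sp("1pt")

-- scalar version: multiply and round to an integer in register range
local width = tex.scale(onepoint, 1.5)

-- table version: a copy in which the numeric top-level values are scaled
local size   = { width = 10 * onepoint, height = 2 * onepoint }
local scaled = tex.scale(size, 0.5)

-- tex.round only rounds, with the same kind of range check
local rounded = tex.round(onepoint * 2.25)
\stoptyping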
1161
1162\subsubsection{\type {number} and \type {romannumeral}}
1163
1164\libindex{number}
1165\libindex{romannumeral}
1166
1167These are the companions to the primitives \prm {number} and \prm
1168{romannumeral}. They can be used like:
1169
1170\startfunctioncall
1171tex.print(tex.romannumeral(123))
1172\stopfunctioncall
1173
1174\subsubsection{\type {fontidentifier} and \type {fontname}}
1175
1176\libindex{fontidentifier}
1177\libindex{fontname}
1178
1179The first one returns the name only, the second one reports the size too.
1180
1181\startfunctioncall
1182tex.print(tex.fontname(1))
1183tex.print(tex.fontidentifier(1))
1184\stopfunctioncall
1185
1186\subsubsection{\type {sp}}
1187
1188\libindex{sp}
1189
1190\startfunctioncall
1191<number> n = tex.sp(<number> o)
1192<number> n = tex.sp(<string> s)
1193\stopfunctioncall
1194
1195Converts the number \type {o} or a string \type {s} that represents an explicit
1196dimension into an integer number of scaled points.
1197
For parsing the string, the same scanning and conversion rules are used that
\LUATEX\ would use if it was scanning a dimension specifier in its \TEX|-|like
input language (this includes generating errors for bad values), except for the
following:
1202
1203\startitemize[n]
1204\startitem
1205    only explicit values are allowed, control sequences are not handled
1206\stopitem
1207\startitem
1208    infinite dimension units (\type {fil...}) are forbidden
1209\stopitem
1210\startitem
1211    \type {mu} units do not generate an error (but may not be useful either)
1212\stopitem
1213\stopitemize
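
Some calls that follow from the rules above, as a sketch; the commented lines
show input that falls under the listed restrictions:

\starttyping
local a = tex.sp("1pt")    -- 65536, as one point is 65536 scaled points
local b = tex.sp("2.5cm")  -- an explicit value with a unit
local c = tex.sp(1234.5)   -- a number becomes an integer number of sp

-- not supported: control sequences and infinite units
-- tex.sp("\\hsize")
-- tex.sp("1fil")
\stoptyping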
1214
1215\subsubsection{\type {tex.getlinenumber} and \type {tex.setlinenumber}}
1216
1217\libindex{getlinenumber}
1218\libindex{setlinenumber}
1219
1220You can mess with the current line number:
1221
1222\startfunctioncall
1223local n = tex.getlinenumber()
1224tex.setlinenumber(n+10)
1225\stopfunctioncall
1226
1227which can be shortcut to:
1228
1229\startfunctioncall
1230tex.setlinenumber(10,true)
1231\stopfunctioncall
1232
1233This might be handy when you have a callback that reads numbers from a file and
1234combines them in one line (in which case an error message probably has to refer
1235to the original line). Interference with \TEX's internal handling of numbers is
1236of course possible.
1237
1238\subsubsection{\type {error}, \type {show_context} and \type {gethelptext}}
1239
1240\topicindex{errors}
1241
1242\libindex{error}
1243\libindex{show_context}
1244\libindex{gethelptext}
1245
1246\startfunctioncall
1247tex.error(<string> s)
1248tex.error(<string> s, <table> help)
1249<string> s = tex.gethelptext()
1250\stopfunctioncall
1251
1252This creates an error somewhat like the combination of \prm {errhelp} and \prm
1253{errmessage} would. During this error, deletions are disabled.
1254
1255The array part of the \type {help} table has to contain strings, one for each
1256line of error help.
1257
1258In case of an error the \type {show_context} function will show the current
1259context where we're at (in the expansion).
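
A minimal sketch of raising an error with help lines from \LUA. The function
\type {checkwidth}, the message and the help text are made up for this example.

\starttyping
-- hypothetical check that complains about negative widths
local function checkwidth(w)
    if w < 0 then
        tex.error("Negative width", {
            "A width should not be negative here.",
            "The value will be used as given.",
        })
    end
    return w
end

checkwidth(tex.sp("-2pt"))
\stoptyping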
1260
1261\subsubsection{\type {getfamilyoffont}}
1262
1263\libindex {getfamilyoffont}
1264
1265When you pass a proper family identifier the next helper will return the font
1266currently associated with it.
1267
1268\startfunctioncall
<integer> id = tex.getfamilyoffont(<integer> fam)
1270\stopfunctioncall
1271
1272\subsubsection{\type {[set|get]interaction}}
1273
1274\libindex{setinteraction}
1275\libindex{getinteraction}
1276
1277The engine can be in one of four modes:
1278
1279\starttabulate[|lT|l|pl|]
1280\DB value \NC mode      \BC meaning \NC \NR
1281\TB
1282\NC 0     \NC batch     \NC omits all stops and omits terminal output \NC \NR
1283\NC 1     \NC nonstop   \NC omits all stops \NC \NR
1284\NC 2     \NC scroll    \NC omits error stops \NC \NR
1285\NC 3     \NC errorstop \NC stops at every opportunity to interact \NC \NR
1286\LL
1287\stoptabulate
1288
1289The mode can be queried and set with:
1290
1291\startfunctioncall
1292<integer> i = tex.getinteraction()
1293tex.setinteraction(<integer> i)
1294\stopfunctioncall
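
A common pattern is to save the mode, change it temporarily, and restore it
afterwards. The following sketch uses the numbers from the table above:

\starttyping
local saved = tex.getinteraction()

tex.setinteraction(0)      -- batch: no stops, no terminal output
-- ... do something that should not interrupt the run ...
tex.setinteraction(saved)  -- restore whatever was active
\stoptyping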
1295
1296\subsubsection{\type {runtoks} and \type {quittoks}}
1297
Because \TEX\ is in a complex dance of expanding, dealing with fonts, typesetting
paragraphs, messing around with boxes, building pages, and so on, you cannot
easily run a nested \TEX\ run (read nested main loop). However, there is an
option to force a local run with \type {runtoks}. The content of the given token
list register gets expanded locally after which we return to where we triggered
this expansion, at the \LUA\ end. Instead of a register, a function can be passed
that does some work. You have to make sure that at the end \TEX\ is in a sane
state and this is not always trivial. A more complex mechanism would complicate
\TEX\ itself (and probably also harm performance) so this simple local expansion
loop has to do.
1308
1309\startfunctioncall
1310tex.runtoks(<token register>)
1311tex.runtoks(<lua function>)
1312tex.runtoks(<macro name>)
1313tex.runtoks(<register name>)
1314\stopfunctioncall
1315
1316When the \prm {tracingnesting} parameter is set to a value larger than~2 some
1317information is reported about the state of the local loop. The return value indicates
1318an error:
1319
1320\starttabulate[|lT|pl|]
1321\DB value \NC meaning \NC \NR
1322\TB
1323\NC 0     \NC no error \NC \NR
1324\NC 1     \NC bad register number \NC \NR
1325\NC 2     \NC unknown macro or register name \NC \NR
1326\NC 3     \NC macro is unsuitable for runtoks (has arguments) \NC \NR
1327\LL
1328\stoptabulate
1329
This function has three optional arguments in case a token register is passed:
1331
1332\startfunctioncall
1333tex.runtoks(<token register>,force,grouped,obeymode)
1334\stopfunctioncall
1335
Inside for instance an \type {\edef} the \type {runtoks} function behaves (or at
least tries to) as if it were an \type {\the}. This prevents unwanted side
effects: normally in such a definition tokens remain tokens and (for instance)
characters don't become nodes. With the second argument you can force the local
main loop, no matter what. The third argument adds a level of grouping. The last
argument tells the scanner to stay in the current mode.
1342
1343You can quit the local loop with \type {\endlocalcontrol} or from the \LUA\ end
1344with \type {tex.quittoks}. In that case you end one level up! Of course in the
1345end that can mean that you arrive at the main level in which case an extra end
1346will trigger a redundancy warning (not an abort!).
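
As a sketch of the function variant: the following runs a bit of \TEX\ in a
local loop and picks up the result at the \LUA\ end. It assumes that what is
printed inside the function is processed before \type {runtoks} returns, that
box register~0 can be used as scratch register, and that \type {tex.getbox} is
available.

\starttyping
-- set a box in a local run
tex.runtoks(function()
    tex.sprint("\\setbox0\\hbox{local run}")
end)

-- back at the Lua end the box can be inspected
local box = tex.getbox(0)
if box then
    print("natural width in scaled points: " .. box.width)
end
\stoptyping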
1347
1348\subsubsection{\type {forcehmode}}
1349
1350\libindex{forcehmode}
1351
An example of a (possibly error triggering) complication is that \TEX\ expects to
1353be in some state, say horizontal mode, and you have to make sure it is when you
1354start feeding back something from \LUA\ into \TEX. Normally a user will not run
1355into issues but when you start writing tokens or nodes or have a nested run there
1356can be situations that you need to run \type {forcehmode}. There is no recipe for
1357this and intercepting possible cases would weaken \LUATEX's flexibility.
1358
1359\subsubsection{\type {hashtokens}}
1360
1361\libindex{hashtokens}
1362
1363\topicindex{hash}
1364
1365\startfunctioncall
1366for i,v in pairs (tex.hashtokens()) do ... end
1367\stopfunctioncall
1368
1369Returns a list of names. This can be useful for debugging, but note that this
1370also reports control sequences that may be unreachable at this moment due to
1371local redefinitions: it is strictly a dump of the hash table. You can use \type
1372{token.create} to inspect properties, for instance when the \type {command} key
1373in a created table equals \type {123}, you have the \type {cmdname} value \type
1374{undefined_cs}.
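
A slightly more complete variant of the loop, as a sketch: it counts the names
in the hash that start with \type {start}. The prefix is only an example.

\starttyping
local count = 0
for _, name in pairs(tex.hashtokens()) do
    if type(name) == "string" and string.find(name, "^start") then
        count = count + 1
    end
end
print("names starting with 'start': " .. count)
\stoptyping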
1375
1376\subsubsection{\type {definefont}}
1377
1378\topicindex{fonts+defining}
1379
1380\libindex{definefont}
1381
1382\startfunctioncall
1383tex.definefont(<string> csname, <number> fontid)
1384tex.definefont(<boolean> global, <string> csname, <number> fontid)
1385\stopfunctioncall
1386
1387Associates \type {csname} with the internal font number \type {fontid}. The
1388definition is global if (and only if) \type {global} is specified and true (the
1389setting of \type {globaldefs} is not taken into account).
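
Both variants can be sketched as follows. The control sequence names are made
up, and \type {font.current} is assumed to return the identifier of the
currently active font.

\starttyping
local id = font.current()

tex.definefont("localdemofont", id)         -- local definition
tex.definefont(true, "globaldemofont", id)  -- global definition
\stoptyping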
1390
1391\stopsubsection
1392
1393\startsubsection[reference=luaprimitives,title={Functions for dealing with primitives}]
1394
1395\subsubsection{\type {enableprimitives}}
1396
1397\libindex{enableprimitives}
1398
1399\topicindex{initialization}
1400\topicindex{primitives}
1401
1402\startfunctioncall
1403tex.enableprimitives(<string> prefix, <table> primitive names)
1404\stopfunctioncall
1405
1406This function accepts a prefix string and an array of primitive names. For each
1407combination of \quote {prefix} and \quote {name}, the \type
1408{tex.enableprimitives} first verifies that \quote {name} is an actual primitive
1409(it must be returned by one of the \type {tex.extraprimitives} calls explained
1410below, or part of \TEX82, or \prm {directlua}). If it is not, \type
1411{tex.enableprimitives} does nothing and skips to the next pair.
1412
1413But if it is, then it will construct a csname variable by concatenating the
1414\quote {prefix} and \quote {name}, unless the \quote {prefix} is already the
1415actual prefix of \quote {name}. In the latter case, it will discard the \quote
1416{prefix}, and just use \quote {name}.
1417
1418Then it will check for the existence of the constructed csname. If the csname is
1419currently undefined (note: that is not the same as \prm {relax}), it will
1420globally define the csname to have the meaning: run code belonging to the
1421primitive \quote {name}. If for some reason the csname is already defined, it
1422does nothing and tries the next pair.
1423
1424An example:
1425
1426\starttyping
1427tex.enableprimitives('LuaTeX', {'formatname'})
1428\stoptyping
1429
will define \type {\LuaTeXformatname} with the same intrinsic meaning as the
documented primitive \prm {formatname}, provided that the control sequence \type
{\LuaTeXformatname} is currently undefined.
1433
1434When \LUATEX\ is run with \type {--ini} only the \TEX82 primitives and \prm
1435{directlua} are available, so no extra primitives {\bf at all}.
1436
1437If you want to have all the new functionality available using their default
1438names, as it is now, you will have to add
1439
1440\starttyping
1441\ifx\directlua\undefined \else
1442    \directlua {tex.enableprimitives('',tex.extraprimitives ())}
1443\fi
1444\stoptyping
1445
1446near the beginning of your format generation file. Or you can choose different
1447prefixes for different subsets, as you see fit.
1448
1449Calling some form of \type {tex.enableprimitives} is highly important though,
1450because if you do not, you will end up with a \TEX82-lookalike that can run \LUA\
1451code but not do much else. The defined csnames are (of course) saved in the
1452format and will be available at runtime.
1453
1454\subsubsection{\type {extraprimitives}}
1455
1456\libindex{extraprimitives}
1457
1458\startfunctioncall
1459<table> t = tex.extraprimitives(<string> s, ...)
1460\stopfunctioncall
1461
1462This function returns a list of the primitives that originate from the engine(s)
1463given by the requested string value(s). The possible values and their (current)
1464return values are given in the following table. In addition the somewhat special
1465primitives \quote{\tex{ }}, \quote{\tex {/}} and \quote{\type {-}} are defined.
1466
1467\starttabulate[|l|pl|]
1468\DB name   \BC values \NC \NR
1469\TB
1470\NC tex    \NC \ctxlua{document.showprimitives('tex')    } \NC \NR
1471\NC core   \NC \ctxlua{document.showprimitives('core')   } \NC \NR
1472\NC etex   \NC \ctxlua{document.showprimitives('etex')   } \NC \NR
1473\NC luatex \NC \ctxlua{document.showprimitives('luatex') } \NC \NR
1474\LL
1475\stoptabulate
1476
1477Note that \type {luatex} does not contain \type {directlua}, as that is
1478considered to be a core primitive, along with all the \TEX82 primitives, so it is
1479part of the list that is returned from \type {'core'}.
1480
Running \type {tex.extraprimitives} without arguments will give you the complete
list of primitives that are not available at \type {-ini} startup. It is exactly
equivalent to \type {tex.extraprimitives("etex","luatex")}.
1484
1485\subsubsection{\type {primitives}}
1486
1487\libindex{primitives}
1488
1489\startfunctioncall
1490<table> t = tex.primitives()
1491\stopfunctioncall
1492
1493This function returns a list of all primitives that \LUATEX\ knows about.
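
A sketch of how the returned list can be used, assuming it is a plain array of
names without the leading backslash:

\starttyping
local all = tex.primitives()

-- look for one name in the returned array
local found = false
for i = 1, #all do
    if all[i] == "directlua" then
        found = true
        break
    end
end
print(#all .. " primitives, directlua " .. (found and "present" or "missing"))
\stoptyping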
1494
1495\stopsubsection
1496
1497\startsubsection[title={Core functionality interfaces}]
1498
1499\subsubsection{\type {badness}}
1500
1501\libindex{badness}
1502
1503\startfunctioncall
1504<number> b = tex.badness(<number> t, <number> s)
1505\stopfunctioncall
1506
1507This helper function is useful during linebreak calculations. \type {t} and \type
1508{s} are scaled values; the function returns the badness for when total \type {t}
1509is supposed to be made from amounts that sum to \type {s}. The returned number is
a reasonable approximation of \mathematics {100(t/s)^3}.
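
As a sketch, the helper can be compared with the formula it approximates. The
two dimensions are arbitrary; no output is shown because the returned badness
is an approximation.

\starttyping
-- a total of 2pt that has to be made from amounts that sum to 10pt
local t = tex.sp("2pt")
local s = tex.sp("10pt")

local b = tex.badness(t, s)
local a = 100 * (t / s)^3  -- the approximated formula, for comparison

print("badness " .. b .. ", formula " .. a)
\stoptyping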
1511
1512\subsubsection{\type {tex.resetparagraph}}
1513
1514\topicindex {paragraphs+reset}
1515
1516\libindex{resetparagraph}
1517
1518This function resets the parameters that \TEX\ normally resets when a new paragraph
1519is seen.
1520
1521\subsubsection{\type {linebreak}}
1522
1523\topicindex {linebreaks}
1524
1525\libindex{linebreak}
1526
1527\startfunctioncall
1528local <node> nodelist, <table> info =
1529    tex.linebreak(<node> listhead, <table> parameters)
1530\stopfunctioncall
1531
1532The understood parameters are as follows:
1533
1534\starttabulate[|l|l|p|]
1535\DB name                        \BC type            \BC explanation \NC \NR
1536\TB
1537\NC \type{pardir}               \NC string          \NC \NC \NR
1538\NC \type{pretolerance}         \NC number          \NC \NC \NR
1539\NC \type{tracingparagraphs}    \NC number          \NC \NC \NR
1540\NC \type{tolerance}            \NC number          \NC \NC \NR
1541\NC \type{looseness}            \NC number          \NC \NC \NR
1542\NC \type{hyphenpenalty}        \NC number          \NC \NC \NR
1543\NC \type{exhyphenpenalty}      \NC number          \NC \NC \NR
1544\NC \type{pdfadjustspacing}     \NC number          \NC \NC \NR
1545\NC \type{adjdemerits}          \NC number          \NC \NC \NR
1546\NC \type{protrudechars}        \NC number          \NC \NC \NR
1547\NC \type{linepenalty}          \NC number          \NC \NC \NR
1548\NC \type{lastlinefit}          \NC number          \NC \NC \NR
1549\NC \type{doublehyphendemerits} \NC number          \NC \NC \NR
1550\NC \type{finalhyphendemerits}  \NC number          \NC \NC \NR
1551\NC \type{hangafter}            \NC number          \NC \NC \NR
1552\NC \type{interlinepenalty}     \NC number or table \NC if a table, then it is an array like \prm {interlinepenalties} \NC \NR
1553\NC \type{clubpenalty}          \NC number or table \NC if a table, then it is an array like \prm {clubpenalties} \NC \NR
1554\NC \type{widowpenalty}         \NC number or table \NC if a table, then it is an array like \prm {widowpenalties} \NC \NR
1555\NC \type{brokenpenalty}        \NC number          \NC \NC \NR
1556\NC \type{emergencystretch}     \NC number          \NC in scaled points \NC \NR
1557\NC \type{hangindent}           \NC number          \NC in scaled points \NC \NR
1558\NC \type{hsize}                \NC number          \NC in scaled points \NC \NR
1559\NC \type{leftskip}             \NC glue_spec node  \NC \NC \NR
1560\NC \type{rightskip}            \NC glue_spec node  \NC \NC \NR
1561\NC \type{parshape}             \NC table           \NC \NC \NR
1562\LL
1563\stoptabulate
1564
Note that there is no interface for \prm {displaywidowpenalties}; you have to
pass the right choice for \type {widowpenalty} yourself.
1567
1568It is your own job to make sure that \type {listhead} is a proper paragraph list:
1569this function does not add any nodes to it. To be exact, if you want to replace
1570the core line breaking, you may have to do the following (when you are not
1571actually working in the \cbk {pre_linebreak_filter} or \cbk {linebreak_filter}
1572callbacks, or when the original list starting at listhead was generated in
1573horizontal mode):
1574
1575\startitemize
1576\startitem
1577    add an \quote {indent box} and perhaps a \nod {par} node at the start
1578    (only if you need them)
1579\stopitem
1580\startitem
1581    replace any found final glue by an infinite penalty (or add such a penalty,
1582    if the last node is not a glue)
1583\stopitem
1584\startitem
1585    add a glue node for the \prm {parfillskip} after that penalty node
1586\stopitem
1587\startitem
1588    make sure all the \type {prev} pointers are OK
1589\stopitem
1590\stopitemize
1591
The result is a node list that still needs to be vpacked if you want to assign it
to a \prm {vbox}. The returned \type {info} table contains four values that are
all numbers:
1595
1596\starttabulate[|l|p|]
1597\DB name      \BC explanation \NC \NR
1598\TB
1599\NC prevdepth \NC depth of the last line in the broken paragraph \NC \NR
1600\NC prevgraf  \NC number of lines in the broken paragraph \NC \NR
1601\NC looseness \NC the actual looseness value in the broken paragraph \NC \NR
1602\NC demerits  \NC the total demerits of the chosen solution  \NC \NR
1603\LL
1604\stoptabulate
1605
1606Note there are a few things you cannot interface using this function: You cannot
1607influence font expansion other than via \type {pdfadjustspacing}, because the
1608settings for that take place elsewhere. The same is true for hbadness and hfuzz
1609etc. All these are in the \type {hpack} routine, and that fetches its own
1610variables via globals.
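
Putting the pieces together, a call could look like the following sketch. The
function name \type {breakparagraph} is made up, the parameter values are
arbitrary, and \type {head} is assumed to be a paragraph list that has already
been prepared as described above.

\starttyping
local function breakparagraph(head)
    local lines, info = tex.linebreak(head, {
        hsize     = tex.sp("10cm"),
        tolerance = 500,
    })
    print("lines: " .. info.prevgraf .. ", demerits: " .. info.demerits)
    -- the result is a plain node list, so pack it before assigning it to a box
    return node.vpack(lines)
end
\stoptyping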
1611
1612\subsubsection{\type {shipout}}
1613
1614\topicindex {shipout}
1615
1616\libindex{shipout}
1617
1618\startfunctioncall
1619tex.shipout(<number> n)
1620\stopfunctioncall
1621
1622Ships out box number \type {n} to the output file, and clears the box register.
1623
1624\subsubsection{\type {getpagestate}}
1625
1626\topicindex {pages}
1627
1628\libindex{getpagestate}
1629
This helper reports the current page state (\type {empty}, \type {box_there} or
\type {inserts_only}) as an integer value.
1632
1633\subsubsection{\type {getlocallevel}}
1634
1635\topicindex {nesting}
1636
1637\libindex{getlocallevel}
1638
This function returns an integer that reports the current level of the local
loop. It is only useful for debugging and the (relative state) numbers can change
with the implementation.
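
Both helpers are mostly of interest when tracing. A minimal sketch of a tracing
function could look like this. The name \type {reportstate} and the label passed
to it are made up for the example, and the reported numbers are not interpreted
here.

\starttyping
local function reportstate(where)
    texio.writenl(string.format("%s: page state %s, local level %s",
        where,
        tostring(tex.getpagestate()),
        tostring(tex.getlocallevel())))
end

reportstate("before float")
\stoptyping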
1641
1642\stopsubsection
1643
1644\startsubsection[reference=synctex,title={Functions related to synctex}]
1645
1646\topicindex {synctex}
1647
1648\libindex{setsynctexmode}      \libindex{getsynctexmode}
1649\libindex{setsynctexnofiles}
1650\libindex{setsynctextag}       \libindex{getsynctextag}   \libindex{forcesynctextag}
1651\libindex{setsynctexline}      \libindex{getsynctexline}  \libindex{forcesynctexline}
1652
1653The next helpers only make sense when you implement your own synctex logic. Keep in
1654mind that the library used in editors assumes a certain logic and is geared for
1655plain and \LATEX, so after a decade users expect a certain behaviour.
1656
1657\starttabulate[|l|p|]
1658\DB name                     \BC explanation \NC \NR
1659\TB
\NC \type{setsynctexmode}    \NC \type {0} is the default and uses the normal synctex
                                 logic, \type {1} uses the values set by the next
                                 helpers while \type {2} also sets these for glyph
                                 nodes; \type{3} sets glyphs and glue and \type {4}
                                 sets only glyphs \NC \NR
1665\NC \type{setsynctextag}     \NC set the current tag (file) value (obeys save stack) \NC \NR
1666\NC \type{setsynctexline}    \NC set the current line value (obeys save stack) \NC \NR
1667\NC \type{setsynctexnofiles} \NC disable synctex file logging \NC \NR
1668\NC \type{getsynctexmode}    \NC returns the current mode (for values see above) \NC \NR
1669\NC \type{getsynctextag}     \NC get the currently set value of tag (file) \NC \NR
1670\NC \type{getsynctexline}    \NC get the currently set value of line \NC \NR
1671\NC \type{forcesynctextag}   \NC overload the tag (file) value (\type {0} resets) \NC \NR
1672\NC \type{forcesynctexline}  \NC overload the line value  (\type {0} resets) \NC \NR
1673\LL
1674\stoptabulate
1675
The \type {setsynctexnofiles} helper is somewhat special. Due to the way files are
registered in \SYNCTEX\ we need to explicitly disable that feature when we provide our
own alternative and want to avoid the overhead. Passing a value of 1 disables registering.
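
As an illustration, a setup that takes over the tag and line management might
start out as shown below. This is a sketch and not a complete synctex
implementation; the tag and line values are made up.

\starttyping
tex.setsynctexnofiles(1)   -- we keep track of files ourselves
tex.setsynctexmode(1)      -- use the tag and line values set below

tex.setsynctextag(3)       -- 3 is a made up file tag
tex.setsynctexline(120)    -- 120 is a made up line number

texio.writenl(string.format("synctex: mode %s, tag %s, line %s",
    tostring(tex.getsynctexmode()),
    tostring(tex.getsynctextag()),
    tostring(tex.getsynctexline())))
\stoptyping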
1679
1680\stopsubsection
1681
1682\stopsection
1683
1684\startsection[title={The \type {texconfig} table},reference=texconfig][library=texconfig]
1685
1686\topicindex{libraries+\type{texconfig}}
1687
1688\topicindex {configuration}
1689
1690This is a table that is created empty. A startup \LUA\ script could fill this
1691table with a number of settings that are read out by the executable after loading
1692and executing the startup file. Watch out: some keys are different from \LUATEX,
1693which is a side effect of a more granular and dynamic memory management.
1694
1695\starttabulate[|l|l|l|l|]
1696\DB key                      \BC type         \BC default  \BC comment \NC \NR
1697\TB
1698\NC \type{buffersize}        \NC number/table \NC  1000000 \NC input buffer bytes \NC \NR
1699\NC \type{filesize}          \NC number/table \NC     1000 \NC max number of open files \NC \NR
1700\NC \type{fontsize}          \NC number/table \NC      250 \NC number of permitted fonts \NC \NR
1701\NC \type{hashsize}          \NC number/table \NC   150000 \NC number of hash entries \NC \NR
1702\NC \type{inputsize}         \NC number/table \NC    10000 \NC maximum input stack \NC \NR
1703\NC \type{languagesize}      \NC number/table \NC      250 \NC number of permitted languages \NC \NR
1704\NC \type{marksize}          \NC number/table \NC       50 \NC number of mark classes \NC \NR
1705\NC \type{nestsize}          \NC number/table \NC     1000 \NC max depth of nesting \NC \NR
1706\NC \type{nodesize}          \NC number/table \NC  1000000 \NC max node memory (various size) \NC \NR
1707\NC \type{parametersize}     \NC number/table \NC    20000 \NC max size of parameter stack \NC \NR
1708\NC \type{poolsize}          \NC number/table \NC 10000000 \NC max number of string bytes \NC \NR
\NC \type{savesize}          \NC number/table \NC   100000 \NC max size of save stack \NC \NR
1710\NC \type{stringsize}        \NC number/table \NC   150000 \NC max number of strings \NC \NR
1711\NC \type{tokensize}         \NC number/table \NC  1000000 \NC max token memory \NC \NR
1712\ML
1713\NC \type{expandsize}        \NC number/table \NC    10000 \NC max expansion nesting \NC \NR
1714\NC \type{propertiessize}    \NC number       \NC        0 \NC initial size of node properties table \NC \NR
1715\NC \type{functionsize}      \NC number       \NC        0 \NC initial size of \LUA\ functions table \NC \NR
\NC \type{errorlinesize}     \NC number       \NC       79 \NC how much of an error is shown \NC \NR
1717\NC \type{halferrorlinesize} \NC number       \NC       50 \NC idem \NC \NR
1718\ML
1719\NC \type{formatname}        \NC string       \NC          \NC \NC \NR
1720\NC \type{jobname}           \NC string       \NC          \NC \NC \NR
1721\ML
1722\NC \type{starttime}         \NC number       \NC          \NC for testing only \NC \NR
1723\NC \type{useutctime}        \NC number       \NC          \NC for testing only \NC \NR
1724\NC \type{permitloadlib}     \NC number       \NC          \NC for testing only \NC \NR
1725\LL
1726\stoptabulate
1727
If no format name or jobname is given on the command line, the related keys will
be tested first instead of simply quitting. The statistics library has methods for
tracking down how much memory is available and has been configured. The size parameters
take a number (for the maximum allocated size) or a table with three possible keys:
\type {size}, \type {plus} (for extra size) and \type {step} (for the increment when
more memory is needed). They all start out with a hard coded minimum and also have a
hard coded maximum; the configured size sits somewhere between these.
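
A startup file could for instance contain settings like the following. The
numbers are illustrative only and are not recommended values; whether they are
honoured depends on the hard coded minima and maxima.

\starttyping
-- a plain number sets the maximum allocated size
texconfig.buffersize    = 2000000

-- a table can set the size, the extra size and the increment
texconfig.hashsize      = { size = 250000, step = 25000 }
texconfig.nodesize      = { size = 2000000, plus = 100000, step = 250000 }

-- keys that only accept a number
texconfig.errorlinesize = 100
\stoptyping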
1735
1736\stopsection
1737
1738\startsection[title={The \type {texio} library}][library=texio]
1739
1740\topicindex{libraries+\type{texio}}
1741\topicindex{\IO}
1742
1743This library takes care of the low|-|level I/O interface: writing to the log file
1744and|/|or console.
1745
1746\startsubsection[title={\type {write} and \type {writeselector}}]
1747
1748\libindex{write}
1749\libindex{writeselector}
1750
1751\startfunctioncall
texio.write(<string> target, <string> s, ...)
texio.write(<string> s, ...)
texio.writeselector(<string> target, ...)
1755\stopfunctioncall
1756
Without the \type {target} argument, this function writes all given strings to the
same location(s) \TEX\ writes messages to at this moment. If \prm {batchmode} is in
effect, it writes only to the log, otherwise it writes to the log and the
terminal. The optional \type {target} can be one of \type {terminal},
\type {logfile} or \type {terminal_and_logfile}.
1762
1763Note: If several strings are given, and if the first of these strings is or might
1764be one of the targets above, the \type {target} must be specified explicitly to
1765prevent \LUA\ from interpreting the first string as the target.
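
The following lines sketch the difference; the message texts are of course
arbitrary.

\starttyping
-- no target: goes where TeX currently writes its messages
texio.write("hello", " ", "world")

-- explicit target: only the log file gets this
texio.write("logfile", "only in the log")

-- the first string of the message happens to be a target name, so we
-- specify the target explicitly to keep "logfile" part of the text
texio.write("terminal_and_logfile", "logfile", " is just text here")
\stoptyping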
1766
1767\stopsubsection
1768
1769\startsubsection[title={\type {writenl} and \type {writeselectornl}}]
1770
1771\libindex{writenl}
1772\libindex{writeselectornl}
1773
1774\startfunctioncall
1775texio.writenl(<string> target, <string> s, ...)
1776texio.writenl(<string> s, ...)
1777texio.writeselectornl(<string> target, ...)
1778\stopfunctioncall
1779
1780This function behaves like \type {texio.write}, but makes sure that the given
1781strings will appear at the beginning of a new line. You can pass a single empty
1782string if you only want to move to the next line.
1783
The selector variants always expect a selector as first argument, so there is no
confusion about whether \type {logfile} is a string or a selector.
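
A few calls that illustrate this (the texts are arbitrary):

\starttyping
-- a single empty string only moves to the next line
texio.writenl("")

-- with a target, the text starts at the beginning of a new line
texio.writenl("terminal", "this starts on a new line")

-- here the first argument is always taken as the selector
texio.writeselectornl("logfile", "this goes to the log only")
\stoptyping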
1786
1787\stopsubsection
1788
1789\startsubsection[title={\type {setescape}}]
1790
1791\libindex{setescape}
1792
1793You can disable \type {^^} escaping of control characters by passing a value of
1794zero.
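
In its simplest form this looks as follows; the call with zero is the only one
described here, so the sketch does not show how to enable escaping again.

\starttyping
-- disable ^^ escaping of control characters in what gets written
texio.setescape(0)
\stoptyping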
1795
1796\stopsubsection
1797
1798\startsubsection[title={\type {closeinput}}]
1799
1800\libindex{closeinput}
1801
This function should be used with care. It acts as \prm {endinput} but at the
\LUA\ end. You can use it to (sort of) force a jump back to \TEX. Normally a
\LUA\ call will just collect prints and at the end bump an input level and flush
these prints. This function can help you stay at the current level but you need
to know what you're doing (or more precisely: what \TEX\ is doing with input).
1807
1808\stopsubsection
1809
1810\stopsection
1811
1812\startsection[title={The \type {token} library}][library=token]
1813
1814\startsubsection[title={The scanner}]
1815
1816\topicindex{libraries+\type{token}}
1817\topicindex{tokens}
1818
1819\libindex{scankeyword}
1820\libindex{scankeywordcs}
1821\libindex{scanint}
1822\libindex{scanreal}
1823\libindex{scanfloat}
1824\libindex{scandimen}
1825\libindex{scanglue}
1826\libindex{scantoks}
1827\libindex{scancode}
1828\libindex{scanstring}
1829\libindex{scanargument}
1830\libindex{scanword}
1831\libindex{scancsname}
1832\libindex{scanlist}
1833
The token library provides means to intercept the input and deal with it at the
\LUA\ level. The library provides a basic scanner infrastructure that can be used
to write macros that accept a wide range of arguments. This interface is kept
general on purpose and, as performance is quite okay, one can build additional
parsers without too much overhead. It's up to macro package writers to see how
they can benefit from this as the main principle behind \LUATEX\ is to provide a
minimal set of tools and no solutions. The scanner functions are probably the
most intriguing.
1842
1843\starttabulate[|l|l|p|]
1844\DB function             \BC argument           \BC result \NC \NR
1845\TB
1846\NC \type{scankeyword}   \NC string             \NC returns true if the given keyword is gobbled; as with
1847                                                    the regular \TEX\ keyword scanner this is case insensitive
1848                                                    (and \ASCII\ based) \NC \NR
1849\NC \type{scankeywordcs} \NC string             \NC returns true if the given keyword is gobbled; this variant
1850                                                    is case sensitive and also suitable for \UTF8 \NC \NR
1851\NC \type{scanint}       \NC                    \NC returns an integer \NC \NR
1852\NC \type{scanreal}      \NC                    \NC returns a number from e.g.\ \type {1},  \type {1.1}, \type {.1} with optional collapsed signs \NC \NR
\NC \type{scanfloat}     \NC                    \NC returns a number from e.g.\ \type {1}, \type {1.1}, \type {.1}, \type {1.1E10}, \type {.1e-10} with optional collapsed signs \NC \NR
1854\NC \type{scandimen}     \NC infinity, mu-units \NC returns a number representing a dimension or two numbers being the filler and order \NC \NR
1855\NC \type{scanglue}      \NC mu-units           \NC returns a glue spec node \NC \NR
1856\NC \type{scantoks}      \NC definer, expand    \NC returns a table of tokens \NC \NR
1857\NC \type{scancode}      \NC bitset             \NC returns a character if its category is in the given bitset (representing catcodes) \NC \NR
1858\NC \type{scanstring}    \NC                    \NC returns a string given between \type {{}}, as \type {\macro} or as sequence of characters with catcode 11 or 12 \NC \NR
\NC \type{scanargument}  \NC                    \NC this one is similar to \type {scanstring} but also accepts a \type {\cs}
                                                    (which then gets expanded) \NC \NR
1861\NC \type{scanword}      \NC                    \NC returns a sequence of characters with catcode 11 or 12 as string \NC \NR
1862\NC \type{scancsname}    \NC                    \NC returns \type {foo} after scanning \type {\foo} \NC \NR
1863\NC \type{scanlist}      \NC                    \NC picks up a box specification and returns a \type {[h|v]list} node \NC \NR
1864\LL
1865\stoptabulate
1866
The integer, dimension and glue scanners take an extra optional argument that
signals that an optional equal sign is permitted.
1869
The scanners can be considered stable apart from the one scanning for a token.
The \type {scancode} function takes an optional number, the \type {scankeyword}
function a normal \LUA\ string. The \type {infinity} boolean signals that we also
permit \type {fill} as dimension and the \type {mu-units} boolean tells the
scanner that we expect math units. When scanning tokens we can indicate that we
are defining a macro, in which case the result will also provide information
about what arguments are expected; in the result this is separated from the
meaning by a separator token. The \type {expand} flag determines if the list will
be expanded.
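
As an illustration of how these scanners combine, here is a minimal sketch of a
keyword driven macro. The names \type {myrule} and \type {\myrule} are made up,
and it is assumed that \type {\myrule} is defined at the \TEX\ end as
\type {\def\myrule{\directlua{myrule()}}}.

\starttyping
function myrule()
    local width, height = 0, 0
    -- pick up keywords in any order until none of them matches
    while true do
        if token.scankeyword("width") then
            width = token.scandimen()
        elseif token.scankeyword("height") then
            height = token.scandimen()
        else
            break
        end
    end
    -- the dimensions are numbers (scaled points), so we add the unit
    tex.print(string.format("\\vrule width %ssp height %ssp\\relax",
        tostring(width), tostring(height)))
end
\stoptyping

A call like \type {\myrule width 10pt height 5pt} then lets \LUA\ pick up both
keywords and their dimensions from the input.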
1878
1879The \type {scanargument} function expands the given argument. When a braced
1880argument is scanned, expansion can be prohibited by passing \type {false}
1881(default is \type {true}). In case of a control sequence passing \type {false}
1882will result in a one|-|level expansion (the meaning of the macro).
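
The difference can be explored with a small test like the one below. It is a
sketch: the name \type {showarguments} is made up and it is assumed that it gets
called from a macro that is followed by two arguments, for instance
\type {\directlua{showarguments()}{\foo}{\foo}}.

\starttyping
function showarguments()
    -- default behaviour: the argument gets expanded
    local expanded = token.scanargument()
    -- passing false prohibits expansion of a braced argument
    local asis = token.scanargument(false)
    texio.writenl("expanded : " .. tostring(expanded))
    texio.writenl("as is    : " .. tostring(asis))
end
\stoptyping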
1883
1884The string scanner scans for something between curly braces and expands on the
1885way, or when it sees a control sequence it will return its meaning. Otherwise it
1886will scan characters with catcode \type {letter} or \type {other}. So, given the
1887following definition:
1888
1889\startbuffer
1890\def\oof{oof}
1891\def\foo{foo-\oof}
1892\stopbuffer
1893
1894\typebuffer \getbuffer
1895
1896we get:
1897
1898\starttabulate[|l|Tl|l|]
1899\DB name \BC result \NC \NR
1900\TB
1901\NC \type {\directlua{token.scanstring()}{foo}} \NC \directlua{context("{\\red\\type {"..token.scanstring().."}}")} {foo} \NC full expansion \NC \NR
1902\NC \type {\directlua{token.scanstring()}foo}   \NC \directlua{context("{\\red\\type {"..token.scanstring().."}}")} foo   \NC letters and others \NC \NR
1903\NC \type {\directlua{token.scanstring()}\foo}  \NC \directlua{context("{\\red\\type {"..token.scanstring().."}}")}\foo   \NC meaning \NC \NR
1904\LL
1905\stoptabulate
1906
1907The \type {\foo} case only gives the meaning, but one can pass an already
1908expanded definition (\prm {edef}'d). In the case of the braced variant one can of
1909course use the \prm {detokenize} and \prm {unexpanded} primitives since there we
1910do expand.
1911
1912The \type {scanword} scanner can be used to implement for instance a number
1913scanner. An optional boolean argument can signal that a trailing space or \type
1914{\relax} should be gobbled:
1915
1916\starttyping
1917function token.scannumber(base)
1918    return tonumber(token.scanword(),base)
1919end
1920\stoptyping
1921
1922This scanner accepts any valid \LUA\ number so it is a way to pick up floats
1923in the input.
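
A variant that also passes on the optional boolean might look like this. It is
a sketch with a made up name, \type {scannumber}; it assumes that omitting the
boolean and passing \type {nil} behave the same.

\starttyping
local function scannumber(base,gobble)
    -- gobble: also remove a trailing space or \relax (see above)
    -- base  : optional, as accepted by tonumber
    return tonumber(token.scanword(gobble),base)
end

-- with base 16 a word like FF is interpreted as a hexadecimal number,
-- without a base a word like 1.5e3 is interpreted as a float
\stoptyping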
1924
1925You can use the \LUA\ interface as follows:
1926
1927\starttyping
1928\directlua {
1929    function mymacro(n)
1930        ...
1931    end
1932}
1933
1934\def\mymacro#1{%
1935    \directlua {
1936        mymacro(\number\dimexpr#1)
1937    }%
1938}
1939
1940\mymacro{12pt}
1941\mymacro{\dimen0}
1942\stoptyping
1943
1944You can also do this:
1945
1946\starttyping
1947\directlua {
1948    function mymacro()
1949        local d = token.scandimen()
1950        ...
1951    end
1952}
1953
1954\def\mymacro{%
1955    \directlua {
1956        mymacro()
1957    }%
1958}
1959
1960\mymacro 12pt
1961\mymacro \dimen0
1962\stoptyping
1963
1964It is quite clear from looking at the code what the first method needs as
1965argument(s). For the second method you need to look at the \LUA\ code to see what
1966gets picked up. Instead of passing from \TEX\ to \LUA\ we let \LUA\ fetch from
1967the input stream.
1968
1969In the first case the input is tokenized and then turned into a string, then it
1970is passed to \LUA\ where it gets interpreted. In the second case only a function
1971call gets interpreted but then the input is picked up by explicitly calling the
1972scanner functions. These return proper \LUA\ variables so no further conversion
1973has to be done. This is more efficient but in practice (given what \TEX\ has to
1974do) this effect should not be overestimated. For numbers and dimensions it saves
1975a bit but for passing strings conversion to and from tokens has to be done anyway
1976(although we can probably speed up the process in later versions if needed).
1977
1978\stopsubsection
1979
1980\startsubsection[title={Picking up one token}]
1981
1982\libindex {scannext}
1983\libindex {scannextexpanded}
1984\libindex {skipnext}
1985\libindex {skipnextexpanded}
1986\libindex {peeknext}
1987\libindex {peeknextexpanded}
1988\libindex {scantoken}
1989\libindex {expand}
1990
The scanners look for a sequence. When you want to pick up one token from the
input you use \type {scannext}. This creates a token with the (low level)
properties as discussed next. This token is just the next one. If you want to
enforce expansion first you can use \type {scantoken} or the variants that have
\type {expanded} in their name. Internally tokens are characterized by a number
that packs a lot of information. In order to access the bits of information a
token is wrapped in a userdata object.
1998
1999The \type {expand} function will trigger expansion of the next token in the
2000input. This can be quite unpredictable but when you call it you probably know
2001enough about \TEX\ not to be too worried about that. It basically is a call to
2002the internal expand related function.
2003
2004\starttabulate[|lT|p|]
2005\DB name             \BC explanation \NC \NR
2006\TB
2007\NC scannext         \NC get the next token \NC \NR
2008\NC scannextexpanded \NC get the next expanded token \NC \NR
2009\NC skipnext         \NC skip the next token \NC \NR
2010\NC skipnextexpanded \NC skip the next expanded token \NC \NR
2011\NC peeknext         \NC get the next token and put it back in the input \NC \NR
2012\NC peeknextexpanded \NC get the next expanded token and put it back in the input \NC \NR
2013\LL
2014\stoptabulate
2015
The peek functions accept a boolean argument that triggers skipping spaces and
the like.
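
A typical use is to look ahead and only remove a token when it is the one that
you expect. The following sketch (with the made up name \type {skiprelax})
removes a \prm {relax} when it comes next and leaves the input untouched
otherwise. It uses the \type {csname} property of the token object.

\starttyping
local function skiprelax()
    -- look at the next token without removing it from the input
    local t = token.peeknext()
    if t and t.csname == "relax" then
        token.skipnext()
        return true
    end
    return false
end
\stoptyping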
2018
2019\stopsubsection
2020
2021\startsubsection[title={Creating tokens}]
2022
2023\libindex{create}
2024\libindex{new}
2025
2026\libindex{is_defined}
2027\libindex{is_token}
2028\libindex{biggest_char}
2029
2030\libindex{commands}
2031\libindex{command_id}
2032
2033\libindex{getcommand}
2034\libindex{getcmdname}
2035\libindex{getcsname}
2036\libindex{getid}
2037\libindex{getactive}
2038\libindex{getexpandable}
2039\libindex{getprotected}
2040\libindex{getmode}
2041\libindex{getindex}
2042\libindex{gettok}
2043\libindex{getfrozen}
2044\libindex{getuser}
2045
2046\libindex{scannext}
2047
2048The creator function can be used as follows:
2049
2050\starttyping
2051local t = token.create("relax")
2052\stoptyping
2053
2054This gives back a token object that has the properties of the \prm {relax}
2055primitive. The possible properties of tokens are:
2056
2057\starttabulate[|l|p|]
2058\DB name \BC explanation \NC \NR
2059\TB
2060\NC \type {command}    \NC a number representing the internal command number \NC \NR
2061\NC \type {cmdname}    \NC the type of the command (for instance the catcode in case of a
2062                           character or the classifier that determines the internal
2063                           treatment) \NC \NR
2064\NC \type {csname}     \NC the associated control sequence (if applicable) \NC \NR
2065\NC \type {id}         \NC the unique id of the token \NC \NR
2066\NC \type {tok}        \NC the full token number as stored in \TEX \NC \NR
2067\NC \type {active}     \NC a boolean indicating the active state of the token \NC \NR
2068\NC \type {expandable} \NC a boolean indicating if the token (macro) is expandable \NC \NR
2069\NC \type {protected}  \NC a boolean indicating if the token (macro) is protected \NC \NR
2070\NC \type {frozen}     \NC a boolean indicating if the token is a frozen command \NC \NR
2071\NC \type {user}       \NC a boolean indicating if the token is a user defined command \NC \NR
\NC \type {index}      \NC a number that indicates the subcommand; differs per command \NC \NR
2073\LL
2074\stoptabulate
2075
2076Alternatively you can use a getter \type {get<fieldname>} to access a property
2077of a token.
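
Inspecting a known token is a simple way to get familiar with these
properties. The sketch below uses both access methods; what gets reported
depends on the engine version, so no output is shown.

\starttyping
local t = token.create("relax")

-- access via fields
texio.writenl(string.format("command %s (%s), csname %s",
    tostring(t.command), tostring(t.cmdname), tostring(t.csname)))

-- access via getters
texio.writenl("expandable: " .. tostring(token.getexpandable(t)))
texio.writenl("protected : " .. tostring(token.getprotected(t)))
\stoptyping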
2078
2079The numbers that represent a catcode are the same as in \TEX\ itself, so using
2080this information assumes that you know a bit about \TEX's internals. The other
2081numbers and names are used consistently but are not frozen. So, when you use them
2082for comparing you can best query a known primitive or character first to see the
2083values.
2084
2085You can ask for a list of commands:
2086
2087\starttyping
2088local t = token.commands()
2089\stoptyping
2090
2091The id of a token class can be queried as follows:
2092
2093\starttyping
2094local id = token.command_id("math_shift")
2095\stoptyping
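
Because these numbers are not frozen, it makes sense to look them up at runtime
instead of hard coding them. The following sketch assumes that the table
returned by \type {token.commands} maps command numbers to names; that is an
assumption that you should verify for your version.

\starttyping
local names = token.commands()
local id    = token.command_id("math_shift")

texio.writenl(string.format("math_shift has id %s, name in table: %s",
    tostring(id), tostring(names[id])))
\stoptyping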
2096
2097If you really know what you're doing you can create character tokens by not
2098passing a string but a number:
2099
2100\starttyping
2101local letter_x = token.create(string.byte("x"))
2102local other_x  = token.create(string.byte("x"),12)
2103\stoptyping
2104
2105Passing weird numbers can give side effects so don't expect too much help with
2106that. As said, you need to know what you're doing. The best way to explore the
2107way these internals work is to just look at how primitives or macros or \prm
2108{chardef}'d commands are tokenized. Just create a known one and inspect its
2109fields. A variant that ignores the current catcode table is:
2110
2111\starttyping
2112local whatever = token.new(123,12)
2113\stoptyping
2114
2115You can test if a control sequence is defined with \type {is_defined}, which
2116accepts a string and returns a boolean:
2117
2118\starttyping
2119local okay = token.is_defined("foo")
2120\stoptyping
2121
2122The largest character possible is returned by \type {biggest_char}, just in case you
2123need to know that boundary condition.
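
Both helpers are typically used as guards. In the sketch below the function
names \type {ensuremacro} and \type {validchar} are made up, and the macro
setter is called under the name used in the index of this chapter.

\starttyping
-- only define a macro when it does not exist yet
local function ensuremacro(name,content)
    if not token.is_defined(name) then
        token.setmacro(name,content)
    end
end

-- check if a number can be used as character code
local function validchar(n)
    return math.type(n) == "integer" and n >= 0 and n <= token.biggest_char()
end
\stoptyping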
2124
2125\stopsubsection
2126
2127\startsubsection[title={Macros}]
2128
2129\topicindex {macros}
2130
2131\libindex{setmacro}
2132\libindex{getmacro}
2133\libindex{getmeaning}
2134\libindex{setchar}
2135\libindex{setlua}
2136\libindex{getfunctionstable}
2137\libindex{pushmacro}
2138\libindex{popmacro}
2139
The \type {setmacro} function can get up to four arguments:

\starttyping
setmacro("csname","content")
setmacro("csname","content","global")
setmacro("csname")
\stoptyping

You can pass a catcodetable identifier as first argument:

\starttyping
setmacro(catcodetable,"csname","content")
setmacro(catcodetable,"csname","content","global")
setmacro(catcodetable,"csname")
\stoptyping
2155
2156The results are like:
2157
2158\starttyping
2159 \def\csname{content}
2160\gdef\csname{content}
2161 \def\csname{}
2162\stoptyping
2163
2164The \type {getmacro} function can be used to get the content of a macro while
2165the \type {getmeaning} function gives the meaning including the argument
2166specification (as usual in \TEX\ separated by \type {->}).
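
A small round trip shows the difference between the two getters. This is a
sketch: the macro name \type {mytest} is made up and the function names are the
ones used in the index of this chapter.

\starttyping
token.setmacro("mytest","some content")

-- only the content of the macro
texio.writenl("macro   : " .. tostring(token.getmacro("mytest")))

-- the meaning, including the argument specification and ->
texio.writenl("meaning : " .. tostring(token.getmeaning("mytest")))
\stoptyping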
2167
The \type {setchar} function can be used to do a \prm {chardef} at the
\LUA\ end, where invalid assignments are silently ignored:

\starttyping
setchar("csname",number)
setchar("csname",number,"global")
\stoptyping

A special one is the following:

\starttyping
setlua("mycode",id)
setlua("mycode",id,"global","protected")
\stoptyping

This creates a token that refers to a \LUA\ function with an entry in the table
that you can access with \type {lua.getfunctionstable}. It is the companion
to \prm {luadef}. When the first (and only) argument of that function is \type {true}
the size of the table will be preset to the value of \type {texconfig.functionsize}.
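
A minimal sketch of how this could be used follows. It assumes that you manage
the functions table yourself; a macro package normally has its own registration
mechanism and may claim slots later on, so the way a free slot is found here is
for illustration only. The name \type {mycode} is made up.

\starttyping
local functions = lua.getfunctionstable()

-- look for the first unused slot (illustration only)
local id = 1
while functions[id] do
    id = id + 1
end

functions[id] = function()
    tex.print("hello from lua")
end

-- after this \mycode calls the function stored in slot id
token.setlua("mycode",id,"protected")
\stoptyping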
2187
The \type {pushmacro} and \type {popmacro} functions are very experimental and
can be used to get and set an existing macro. The push call returns a userdata
object and the pop takes such a userdata object. These objects have no accessors
and are to be seen as abstractions.
2192
2193\stopsubsection
2194
2195\startsubsection[title={Pushing back}]
2196
2197\libindex{scannext}
2198\libindex{putnext}
2199
2200There is a (for now) experimental putter:
2201
2202\starttyping
2203local t1 = token.scannext()
2204local t2 = token.scannext()
2205local t3 = token.scannext()
2206local t4 = token.scannext()
2207-- watch out, we flush in sequence
2208token.putnext { t1, t2 }
2209-- but this one gets pushed in front
2210token.putnext ( t3, t4 )
2211\stoptyping
2212
2213When we scan \type {wxyz!} we get \type {yzwx!} back. The argument is either a
2214table with tokens or a list of tokens. The \type {token.expand} function will
2215trigger expansion but what happens really depends on what you're doing where.
2216
2217This putter is actually a bit more flexible because the following input also
2218works out okay:
2219
2220\startbuffer
2221\def\foo#1{[#1]}
2222
2223\directlua {
2224    local list = { 101, 102, 103, token.create("foo"), "{abracadabra}" }
2225    token.putnext("(the)")
2226    token.putnext(list)
2227    token.putnext("(order)")
2228    token.putnext(unpack(list))
2229    token.putnext("(is reversed)")
2230}
2231\stopbuffer
2232
2233\typebuffer
2234
2235We get this: \blank {\tt \inlinebuffer} \blank So, strings get converted to
2236individual tokens according to the current catcode regime and numbers become
2237characters also according to this regime.
2238
2239\stopsubsection
2240
2241\startsubsection[title={Nota bene}]
2242
2243When scanning for the next token you need to keep in mind that we're not scanning
2244like \TEX\ does: expanding, changing modes and doing things as it goes. When we
2245scan with \LUA\ we just pick up tokens. Say that we have:
2246
2247\pushmacro\oof \let\oof\undefined
2248
2249\starttyping
2250\oof
2251\stoptyping
2252
2253but \type {\oof} is undefined. Normally \TEX\ will then issue an error message.
2254However, when we have:
2255
2256\starttyping
2257\def\foo{\oof}
2258\stoptyping
2259
2260We get no error, unless we expand \type {\foo} while \type {\oof} is still
2261undefined. What happens is that as soon as \TEX\ sees an undefined macro it will
2262create a hash entry and when later it gets defined that entry will be reused. So,
2263\type {\oof} really exists but can be in an undefined state.
2264
2265\startbuffer[demo]
2266oof        : \directlua{tex.print(token.scancsname())}\oof
2267foo        : \directlua{tex.print(token.scancsname())}\foo
2268myfirstoof : \directlua{tex.print(token.scancsname())}\myfirstoof
2269\stopbuffer
2270
2271\startlines
2272\getbuffer[demo]
2273\stoplines
2274
2275This was entered as:
2276
2277\typebuffer[demo]
2278
2279The reason that you see \type {oof} reported and not \type {myfirstoof} is that
2280\type {\oof} was already used in a previous paragraph.
2281
2282If we now say:
2283
2284\startbuffer
2285\def\foo{}
2286\stopbuffer
2287
2288\typebuffer \getbuffer
2289
2290we get:
2291
2292\startlines
2293\getbuffer[demo]
2294\stoplines
2295
2296And if we say
2297
2298\startbuffer
2299\def\foo{\oof}
2300\stopbuffer
2301
2302\typebuffer \getbuffer
2303
2304we get:
2305
2306\startlines
2307\getbuffer[demo]
2308\stoplines
2309
2310When scanning from \LUA\ we are not in a mode that defines (undefined) macros at
2311all. There we just get the real primitive undefined macro token.
2312
2313\startbuffer
2314\directlua{local t = token.scannext() tex.print(t.id.." "..t.tok)}\myfirstoof
2315\directlua{local t = token.scannext() tex.print(t.id.." "..t.tok)}\mysecondoof
2316\directlua{local t = token.scannext() tex.print(t.id.." "..t.tok)}\mythirdoof
2317\stopbuffer
2318
2319\startlines
2320\getbuffer
2321\stoplines
2322
2323This was generated with:
2324
2325\typebuffer
2326
2327So, we do get a unique token because after all we need some kind of \LUA\ object
2328that can be used and garbage collected, but it is basically the same one,
2329representing an undefined control sequence.
2330
2331\popmacro\oof
2332
2333\stopsubsection
2334
2335\stopsection
2336
2337\stopchapter
2338
2339\stopcomponent
2340