% language=us runpath=texruns:manuals/cld
% cld-abitoflua.tex, size: 21 Kb, last modification: 2021-10-28 13:50
2
3\startcomponent cld-abitoflua
4
5\environment cld-environment
6
7\startchapter[title=A bit of Lua]
8
9\startsection[title=The language]
10
11\index[lua]{\LUA}
12
13Small is beautiful and this is definitely true for the programming language \LUA\
14(moon in Portuguese). We had good reasons for using this language in \LUATEX:
15simplicity, speed, syntax and size to mention a few. Of course personal taste
16also played a role and after using a couple of scripting languages extensively
17the switch to \LUA\ was rather pleasant.
18
19As the \LUA\ reference manual is an excellent book there is no reason to discuss
20the language in great detail: just buy \quote {Programming in \LUA} by the \LUA\
21team. Nevertheless I will give a short summary of the important concepts but
22consult the book if you want more details.
23
24\stopsection
25
26\startsection[title=Data types]
27
28\index{functions}
29\index{variables}
30\index{strings}
31\index{numbers}
32\index{booleans}
33\index{tables}
34
35The most basic data type is \type {nil}. When we define a variable, we don't need
36to give it a value:
37
38\starttyping
39local v
40\stoptyping
41
42Here the variable \type {v} can get any value but till that
43happens it equals \type {nil}. There are simple data types like
44\type {numbers}, \type {booleans} and \type {strings}. Here are
45some numbers:
46
47\starttyping
48local n = 1 + 2 * 3
49local x = 2.3
50\stoptyping
51
Numbers are always floats \footnote {This is true for all versions up to 5.2;
later versions can have a more hybrid model.} and you can use the normal
arithmetic operators on them as well as functions defined in the math library.
Inside \TEX\ we have only integers, although for instance dimensions can be
specified in points using floats but that's more syntactic sugar. One reason for
using integers in \TEX\ has been that this was the only way to guarantee
portability across platforms. However, we're 30 years along the road and in \LUA\
the floats are implemented identically across platforms, so we don't need to worry
about compatibility.
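
As a small illustration of the arithmetic operators and the math library, the
following sketch uses only standard functions. The variable names are arbitrary
and the values in the comments are what plain \LUA\ gives:

\starttyping
local n = 1 + 2 * 3          -- multiplication binds tighter: 7
local x = 2.3
local f = math.floor(x)      -- 2
local m = math.max(n, x)     -- 7

print(n, f, m)
\stoptyping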
61
Strings in \LUA\ can be given between quotes or can be so called long strings
delimited by double square brackets.
64
65\starttyping
66local s = "Whatever"
67local t = s .. ' you want'
68local u = t .. [[ to know]] .. [[--[ about Lua!]--]]
69\stoptyping
70
71The two periods indicate a concatenation. Strings are hashed, so when you say:
72
73\starttyping
74local s = "Whatever"
75local t = "Whatever"
76local u = t
77\stoptyping
78
only one instance of \type {Whatever} is present in memory and this fact makes
\LUA\ very efficient with respect to strings. Strings are constants and therefore
when you change variable \type {s}, variable \type {t} keeps its value. When you
compare strings, in fact you compare pointers, a method that is really fast. This
compensates for the time spent on hashing pretty well.
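
The following sketch shows what \quote {constant} means in practice: changing
\type {s} creates a new string and leaves \type {t} alone. The variable names
follow the example above.

\starttyping
local s = "Whatever"
local t = s                  -- t refers to the same (shared) string

s = s .. " you want"         -- creates a new string and rebinds s

print(t)                     -- Whatever
print(s)                     -- Whatever you want
print(t == "Whatever")       -- true
\stoptyping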
84
85Booleans are normally used to keep a state or the result from an expression.
86
87\starttyping
88local b = false
89local c = n > 10 and s == "whatever"
90\stoptyping
91
92The other value is \type {true}. There is something that you need
93to keep in mind when you do testing on variables that are yet
94unset.
95
96\starttyping
97local b = false
98local n
99\stoptyping
100
101The following applies when \type {b} and \type {n} are defined this way:
102
103\starttabulate[|Tl|Tl|]
104\NC b == false \NC true  \NC \NR
105\NC n == false \NC false \NC \NR
106\NC n == nil   \NC true  \NC \NR
107\NC b == nil   \NC false \NC \NR
108\NC b == n     \NC false \NC \NR
110\stoptabulate
111
112Often a test looks like:
113
114\starttyping
115if somevar then
116    ...
117else
118    ...
119end
120\stoptyping
121
122In this case we enter the else branch when \type {somevar} is either \type {nil}
123or \type {false}. It also means that by looking at the code we cannot beforehand
124conclude that \type {somevar} equals \type {true} or something else. If you want
125to really distinguish between the two cases you can be more explicit:
126
127\starttyping
128if somevar == nil then
129    ...
130elseif somevar == false then
131    ...
132else
133    ...
134end
135\stoptyping
136
137or
138
139\starttyping
140if somevar == true then
141    ...
142else
143    ...
144end
145\stoptyping
146
147but such an explicit test is seldom needed.
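
Keep in mind that only \type {nil} and \type {false} count as false in such a
test. Unlike in some other languages, the number zero and the empty string are
true. A small sketch, where the helper name \type {istrue} is made up for this
illustration:

\starttyping
local function istrue(v)
    if v then return true else return false end
end

print(istrue(nil))    -- false
print(istrue(false))  -- false
print(istrue(0))      -- true
print(istrue(""))     -- true
\stoptyping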
148
149There are a few more data types: tables and functions. Tables are very important
150and you can recognize them by the same curly braces that make \TEX\ famous:
151
152\starttyping
153local t = { 1, 2, 3 }
154local u = { a = 4, b = 9, c = 16 }
155local v = { [1] = "a", [3] = "2", [4] = false }
156local w = { 1, 2, 3, a = 4, b = 9, c = 16 }
157\stoptyping
158
159The \type {t} is an indexed table and \type {u} a hashed table. Because the
160second slot is empty, table \type {v} is partially indexed (slot 1) and partially
161hashed (the others). There is a gray area there, for instance, what happens when
162you nil a slot in an indexed table? In practice you will not run into problems as
163you will either use a hashed table, or an indexed table (with no holes), so table
164\type {w} is not uncommon.
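
As an illustration of how such tables are accessed, here is a short sketch that
uses the tables defined above. The length operator only counts the indexed part,
and a missing key simply gives \type {nil}:

\starttyping
local t = { 1, 2, 3 }
local u = { a = 4, b = 9, c = 16 }
local w = { 1, 2, 3, a = 4, b = 9, c = 16 }

print(#t)            -- 3
print(u.a, u["b"])   -- 4 and 9: both notations access a hashed entry
print(#w, w.c)       -- 3 and 16
print(u.d)           -- nil
\stoptyping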
165
166We mentioned that strings are in fact shared (hashed) but that an assignment of a
167string to a variable makes that variable behave like a constant. Contrary to
168that, when you assign a table, and then copy that variable, both variables can be
169used to change the table. Take this:
170
171\starttyping
172local t = { 1, 2, 3 }
173local u = t
174\stoptyping
175
176We can change the content of the table as follows:
177
178\starttyping
179t[1], t[3] = t[3], t[1]
180\stoptyping
181
Here we swap two cells. This is an example of a parallel assignment. However, the
following does the same:
184
185\starttyping
186t[1], t[3] = u[3], u[1]
187\stoptyping
188
189After this, both \type {t} and \type {u} still share the same table. This kind of
190behaviour is quite natural. Keep in mind that expressions are evaluated first, so
191
192\starttyping
193t[#t+1], t[#t+1] = 23, 45
194\stoptyping
195
makes no sense, as the values end up in the same slot. There is no gain in speed
so using parallel assignments is mostly a convenience feature.
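
Both points can be checked with a few lines. This is only a sketch with
arbitrary names: the first part shows that two variables can refer to one and
the same table, the second part appends two values with separate assignments.

\starttyping
local t = { 1, 2, 3 }
local u = t                  -- u refers to the same table, not to a copy

u[1] = "changed"
print(t[1])                  -- changed

local list = { }
list[#list+1] = 23           -- #list is 0 here, so this fills slot 1
list[#list+1] = 45           -- #list is 1 now, so this fills slot 2
print(#list, list[1], list[2])
\stoptyping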
198
There are a few specialized data types in \LUA, like coroutines (built in, with
their own type \type {thread}), \type {file} (when opened) and \type {lpeg}
(only when this library is linked in or loaded). Files and \type {lpeg} patterns
are so called \quote {userdata} objects and in \LUATEX\ we have more userdata
objects as we will see in later chapters. Of them nodes are the most noticeable:
they are the core data type of the \TEX\ machinery. Other libraries, like \type
{math} and \type {bit32} are just collections of functions operating on numbers.
206
207Functions look like this:
208
209\starttyping
210function sum(a,b)
211  print(a, b, a + b)
212end
213\stoptyping
214
215or this:
216
217\starttyping
218function sum(a,b)
219  return a + b
220end
221\stoptyping
222
There can be many arguments of all kinds of types and there can be multiple return
values. A function is a real type, so you can say:
225
226\starttyping
227local f = function(s) print("the value is: " .. s) end
228\stoptyping
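
Multiple return values can be illustrated as follows. The function name \type
{sumandproduct} is made up for this sketch. When fewer variables are given than
values are returned, the extra values are dropped.

\starttyping
local function sumandproduct(a,b)
    return a + b, a * b
end

local s, p = sumandproduct(3,4)
print(s, p)                         -- 7 and 12

local onlysum = sumandproduct(3,4)  -- the second value is dropped
print(onlysum)                      -- 7
\stoptyping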
229
230In all these examples we defined variables as \type {local}. This is a good
231practice and avoids clashes. Now watch the following:
232
233\starttyping
234local n = 1
235
236function sum(a,b)
237  n = n + 1
238  return a + b
239end
240
241function report()
242  print("number of summations: " .. n)
243end
244\stoptyping
245
246Here the variable \type {n} is visible after its definition and accessible for
247the two global functions. Actually the variable is visible to all the code
248following, unless of course we define a new variable with the same name. We can
249hide \type {n} as follows:
250
251\starttyping
252do
253  local n = 1
254
255  sum = function(a,b)
256    n = n + 1
257    return a + b
258  end
259
260  report = function()
261    print("number of summations: " .. n)
262  end
263end
264\stoptyping
265
266This example also shows another way of defining the function: by assignment.
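
The same mechanism lets a function hand out other functions that share a hidden
variable. The following is a minimal sketch and not code from the distribution;
the name \type {newcounter} is made up. Each call creates a fresh \type {n}
that only the two returned functions can reach.

\starttyping
local function newcounter()
    local n = 0                          -- hidden from the outside world
    local function step()   n = n + 1 end
    local function report() return n  end
    return step, report
end

local step, report = newcounter()

step()
step()

print(report())                          -- 2
\stoptyping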
267
The \typ {do ... end} creates a block with its own scope; the functions defined
in it that still refer to \type {n} are so called closures. There are many places
where such blocks are created, for instance in function bodies or branches like
\typ {if ... then ... else}. This means that in the following snippet, variable
\type {b} is not seen after the end:
272
273\starttyping
274if a > 10 then
275  local b = a + 10
276  print(b*b)
277end
278\stoptyping
279
When you process a blob of \LUA\ code in \TEX\ (using \type {\directlua} or \type
{\latelua}) it is compiled as a separate chunk, as if there were an implied \typ
{do ... end} around it. So, \type {local} defined variables are really local.
283
284\stopsection
285
286\startsection[title=\TEX's data types]
287
We mentioned \type {numbers}. At the \TEX\ end we have counters as well as
dimensions. Both are numbers but dimensions are specified differently:
290
291\starttyping
292local n = tex.count[0]
293local m = tex.dimen.lineheight
294local o = tex.sp("10.3pt") -- sp or 'scaled point' is the smallest unit
295\stoptyping
296
The unit of dimension is \quote {scaled point} and this is a pretty small unit:
10 points equal 655360 such units.
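
The arithmetic behind this is simple: one point is 65536 scaled points. The
following plain \LUA\ sketch only illustrates that conversion. The helper name
\type {pttosp} is made up, and in practice you would use \type {tex.sp}, which
also understands other units.

\starttyping
local function pttosp(pt)
    return math.floor(pt * 65536 + 0.5)   -- round to the nearest scaled point
end

print(pttosp(10))   -- 655360
print(pttosp(1))    -- 65536
\stoptyping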
299
300Another accessible data type is tokens. They are automatically converted to
301strings and vice versa.
302
303\starttyping
304tex.toks[0] = "message"
305print(tex.toks[0])
306\stoptyping
307
308Be aware of the fact that the tokens are letters so the following will come out
309as text and not issue a message:
310
311\starttyping
312tex.toks[0] = "\message{just text}"
313print(tex.toks[0])
314\stoptyping
315
316\stopsection
317
318\startsection[title=Control structures]
319
320\index{loops}
321
322Loops are not much different from other languages: we have \typ {for ... do},
323\typ {while ... do} and \typ {repeat ... until}. We start with the simplest case:
324
325\starttyping
326for index=1,10 do
327  print(index)
328end
329\stoptyping
330
331You can specify a step and go downward as well:
332
333\starttyping
334for index=22,2,-2 do
335  print(index)
336end
337\stoptyping
338
339Indexed tables can be traversed this way:
340
341\starttyping
342for index=1,#list do
343  print(index, list[index])
344end
345\stoptyping
346
347Hashed tables on the other hand are dealt with as follows:
348
349\starttyping
350for key, value in next, list do
351  print(key, value)
352end
353\stoptyping
354
355Here \type {next} is a built in function. There is more to say about this
356mechanism but the average user will use only this variant. Slightly less
357efficient is the following, more readable variant:
358
359\starttyping
360for key, value in pairs(list) do
361  print(key, value)
362end
363\stoptyping
364
365and for an indexed table:
366
367\starttyping
368for index, value in ipairs(list) do
369  print(index, value)
370end
371\stoptyping
372
The function call to \type {pairs(list)} returns \typ {next, list} so there is an
(often negligible) extra overhead of one function call.
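
The two iterators do not visit the same entries. The sketch below (with an
arbitrary table) counts what each of them sees: \type {ipairs} stops at the
first \type {nil}, while \type {pairs} visits every entry that is present, in no
particular order.

\starttyping
local list = { "a", "b", nil, "d", x = 1 }

local n = 0
for index, value in ipairs(list) do
    n = n + 1            -- visits only index 1 and 2
end

local m = 0
for key, value in pairs(list) do
    m = m + 1            -- visits 1, 2, 4 and "x"
end

print(n, m)              -- 2 and 4
\stoptyping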
375
376The other two loop variants, \type {while} and \type {repeat}, are similar.
377
378\starttyping
379i = 0
380while i < 10  do
381  i = i + 1
382  print(i)
383end
384\stoptyping
385
386This can also be written as:
387
388\starttyping
389i = 0
390repeat
391  i = i + 1
392  print(i)
until i == 10
394\stoptyping
395
396Or:
397
398\starttyping
399i = 0
400while true do
401  i = i + 1
402  print(i)
  if i == 10 then
404    break
405  end
406end
407\stoptyping

Of course you can use more complex expressions in such constructs.

\stopsection
411
412\startsection[title=Conditions]
413
414\index{expressions}
415
416Conditions have the following form:
417
418\starttyping
419if a == b or c > d or e then
420  ...
421elseif f == g then
422  ...
423else
424  ...
425end
426\stoptyping
427
Watch the double \type {==}. The complement of this is \type {~=}. Precedence is
similar to other languages. In practice, as strings are hashed, tests like
430
431\starttyping
432if key == "first" then
433  ...
434end
435\stoptyping
436
437and
438
439\starttyping
440if n == 1 then
441  ...
442end
443\stoptyping
444
445are equally efficient. There is really no need to use numbers to identify states
446instead of more verbose strings.
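
The operators \type {and} and \type {or} return one of their operands instead of
a boolean, which is often used for default values and for compact conditional
expressions. A short sketch with made up names:

\starttyping
local function greet(name)
    name = name or "world"            -- default when name is nil (or false)
    return "hello " .. name
end

print(greet())                        -- hello world
print(greet("TeX"))                   -- hello TeX

local n    = 5
local size = n > 3 and "large" or "small"

print(size)                           -- large
\stoptyping

The last construct only works as expected when the middle value is neither \type
{nil} nor \type {false}.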
447
448\stopsection
449
450\startsection[title=Namespaces]
451
452\index{namespaces}
453
454Functionality can be grouped in libraries. There are a few default libraries,
455like \type {string}, \type {table}, \type {lpeg}, \type {math}, \type {io} and
456\type {os} and \LUATEX\ adds some more, like \type {node}, \type {tex} and \type
457{texio}.
458
459A library is in fact nothing more than a bunch of functionality organized using a
table, where the table provides a namespace as well as a place to store public
461variables. Of course there can be local (hidden) variables used in defining
462functions.
463
464\starttyping
465do
466  mylib = { }
467
468  local n = 1
469
470  function mylib.sum(a,b)
471    n = n + 1
472    return a + b
473  end
474
475  function mylib.report()
476    print("number of summations: " .. n)
477  end
478end
479\stoptyping
480
481The defined function can be called like:
482
483\starttyping
484mylib.report()
485\stoptyping
486
You can also create a shortcut. This speeds up the process because fewer lookups
are needed. In the following code multiple calls take place:
489
490\starttyping
491local sum = mylib.sum
492
493for i=1,10 do
494  for j=1,10 do
495    print(i, j, sum(i,j))
496  end
497end
498
499mylib.report()
500\stoptyping
501
502As \LUA\ is pretty fast you should not overestimate the speedup, especially not
503when a function is called seldom. There is an important side effect here: in the
504case of:
505
506\starttyping
507  print(i, j, sum(i,j))
508\stoptyping
509
510the meaning of \type {sum} is frozen. But in the case of
511
512\starttyping
513  print(i, j, mylib.sum(i,j))
514\stoptyping
515
the current meaning is taken, that is: each time the interpreter will access
\type {mylib} and get the current meaning of \type {sum}. And there can be a good
reason for this, for instance when the meaning is adapted to different
situations.
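
The difference shows up as soon as the function is redefined after the shortcut
has been made. The following sketch uses a small stand-in for \type {mylib}, so
it is an illustration and not the library defined above:

\starttyping
local mylib = { }

function mylib.sum(a,b) return a + b end

local sum = mylib.sum                           -- the meaning is frozen here

function mylib.sum(a,b) return a + b + 100 end  -- a later redefinition

print(sum(1,2))        -- 3   : the old function
print(mylib.sum(1,2))  -- 103 : the current function
\stoptyping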
520
521In \CONTEXT\ we have quite some code organized this way. Although much is exposed
522(if only because it is used all over the place) you should be careful in using
523functions (and data) that are still experimental. There are a couple of general
524libraries and some extend the core \LUA\ libraries. You might want to take a look
525at the files in the distribution that start with \type {l-}, like \type
526{l-table.lua}. These files are preloaded.\footnote {In fact, if you write scripts
527that need their functionality, you can use \type {mtxrun} to process the script,
528as \type {mtxrun} has the core libraries preloaded as well.} For instance, if you
529want to inspect a table, you can say:
530
531\starttyping
532local t = { "aap", "noot", "mies" }
533table.print(t)
534\stoptyping
535
536You can get an overview of what is implemented by running the following command:
537
538\starttyping
539context s-tra-02 --mode=tablet
540\stoptyping
541
{\em todo: add nice synonym for this module and also add helpinfo at the top so
that we can do \type {context --styles}}
544
545\stopsection
546
547\startsection[title=Comment]
548
549\index{comment}
550
551You can add comments to your \LUA\ code. There are basically two methods: one
552liners and multi line comments.
553
554\starttyping
555local option = "test" -- use this option with care
556
local method = "unknown" --[[comments can be very long and when entered
                             this way they can span multiple lines]]
559\stoptyping
560
561The so called long comments look like long strings preceded by \type {--} and
562there can be more complex boundary sequences.
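
Such a more complex boundary has one or more equal signs between the brackets.
The closing sequence must have the same number of equal signs, so the text in
between may contain \type {]]}. The content of this sketch is of course
arbitrary:

\starttyping
local a = 1 --[==[ a long comment with these boundaries can
                   contain ]] without being terminated ]==]

local s = [=[a long string that contains ]] as text]=]

print(a, s)
\stoptyping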
563
564\stopsection
565
566\startsection[title=Pitfalls]
567
568Sometimes \type {nil} can bite you, especially in tables, as they have a dual nature:
569indexed as well as hashed.
570
571\startbuffer
572\startluacode
573local n1 = # { nil, 1, 2, nil }      -- 3
574local n2 = # { nil, nil, 1, 2, nil } -- 0
575
576context("n1 = %s and n2 = %s",n1,n2)
577\stopluacode
578\stopbuffer
579
580\typebuffer
581
582results in: \getbuffer
583
584So, you cannot really depend on the length operator here. On the other hand, with:
585
586\startbuffer
587\startluacode
588local function check(...)
589    return select("#",...)
590end
591
592local n1 = check ( nil, 1, 2, nil )      -- 4
593local n2 = check ( nil, nil, 1, 2, nil ) -- 5
594
595context("n1 = %s and n2 = %s",n1,n2)
596\stopluacode
597\stopbuffer
598
599\typebuffer
600
we get: \getbuffer, so the \type {select} is quite usable. However, that function also
has its peculiarities. The following example needs some close reading:
603
604\startbuffer
605\startluacode
606local function filter(n,...)
607    return select(n,...)
608end
609
610local v1 = { filter ( 1, 1, 2, 3 ) }
611local v2 = { filter ( 2, 1, 2, 3 ) }
612local v3 = { filter ( 3, 1, 2, 3 ) }
613
614context("v1 = %+t and v2 = %+t and v3 = %+t",v1,v2,v3)
615\stopluacode
616\stopbuffer
617
618\typebuffer
619
We collect the result in a table and show the concatenation:
621
622\getbuffer
623
624So, what you effectively get is the whole list starting with the given offset.
625
626\startbuffer
627\startluacode
628local function filter(n,...)
629    return (select(n,...))
630end
631
632local v1 = { filter ( 1, 1, 2, 3 ) }
633local v2 = { filter ( 2, 1, 2, 3 ) }
634local v3 = { filter ( 3, 1, 2, 3 ) }
635
636context("v1 = %+t and v2 = %+t and v3 = %+t",v1,v2,v3)
637\stopluacode
638\stopbuffer
639
640\typebuffer
641
642Now we get: \getbuffer. The extra \type {()} around the result makes sure that
643we only get one return value.
644
645Of course the same effect can be achieved as follows:
646
647\starttyping
648local function filter(n,...)
649    return select(n,...)
650end
651
652local v1 = filter ( 1, 1, 2, 3 )
653local v2 = filter ( 2, 1, 2, 3 )
654local v3 = filter ( 3, 1, 2, 3 )
655
656context("v1 = %s and v2 = %s and v3 = %s",v1,v2,v3)
657\stoptyping
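
When you need to keep arguments that can be \type {nil}, you can store the count
alongside the values. The helper \type {pack} below is a made up name for this
sketch; \LUA\ versions from 5.2 on provide \type {table.pack}, which does the
same.

\starttyping
local function pack(...)
    return { n = select("#",...), ... }   -- n is the real number of arguments
end

local t = pack(nil, 1, 2, nil)

print(t.n)                                -- 4

for i=1,t.n do
    print(i, t[i])                        -- also shows the nil entries
end
\stoptyping

Looping up to \type {t.n} instead of \type {#t} makes the result independent of
where the \type {nil} values are.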
658
659\stopsection
660
661\startsection[title={A few suggestions}]
662
You can wrap all kinds of functionality in functions but sometimes it makes no
sense to add the overhead of a call as the same can be done with hardly any code.
665
666If you want a slice of a table, you can copy the range needed to a new table. A
667simple version with no bounds checking is:
668
669\starttyping
670local new = { } for i=a,b do new[#new+1] = old[i] end
671\stoptyping
672
673Another, much faster, variant is the following.
674
675\starttyping
676local new = { unpack(old,a,b) }
677\stoptyping
678
679You can use this variant for slices that are not extremely large. The function
680\type {table.sub} is an equivalent:
681
682\starttyping
683local new = table.sub(old,a,b)
684\stoptyping
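
If you do want bounds checking, a small function is the natural place for it.
The following is only a sketch and not the implementation of \type {table.sub};
the name \type {slice} is made up. It clips the requested range to the size of
the table.

\starttyping
local function slice(old,a,b)
    local n = #old
    a = math.max(a or 1, 1)          -- clip the start
    b = math.min(b or n, n)          -- clip the end
    local new = { }
    for i=a,b do
        new[#new+1] = old[i]
    end
    return new
end

local s = slice({ 10, 20, 30, 40, 50 }, 2, 4)

print(#s, s[1], s[3])                -- 3, 20 and 40
print(#slice({ 10, 20, 30 }, 2, 9))  -- 2
\stoptyping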
685
686An indexed table is empty when its size equals zero:
687
688\starttyping
689if #indexed == 0 then ... else ... end
690\stoptyping
691
692Sometimes this is better:
693
694\starttyping
695if indexed and #indexed == 0 then ... else ... end
696\stoptyping
697
698So how do we test if a hashed table is empty? We can use the
699\type {next} function as in:
700
701\starttyping
if hashed and next(hashed) then ... else ... end
703\stoptyping
704
705Say that we have the following table:
706
707\starttyping
708local t = { a=1, b=2, c=3 }
709\stoptyping
710
711The call \type {next(t)} returns the first key and value:
712
713\starttyping
local k, v = next(t)   -- for instance "a", 1
715\stoptyping
716
The second argument to \type {next} can be a key in which case the
following key and value in the hash table are returned. The result
is not predictable as a hash is unordered. The generic for loop
uses this to loop over a hashed table:
721
722\starttyping
723for k, v in next, t do
724    ...
725end
726\stoptyping
727
Anyway, when \type {next(t)} returns \type {nil} you can be sure that the table is
empty. This is how you can test for exactly one entry:
730
731\starttyping
if t and next(t) ~= nil and next(t,next(t)) == nil then ... else ... end
733\stoptyping
734
735Here it starts making sense to wrap it into a function.
736
737\starttyping
function table.has_one_entry(t)
    return t and next(t) ~= nil and next(t,next(t)) == nil
end
741\stoptyping
742
On the other hand, this is not that useful, unless you can spend the runtime on
it:
745
746\starttyping
747function table.is_empty(t)
748    return not t or not next(t)
749end
750\stoptyping
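
A case where a function does pay off is counting the entries of a hashed table,
as the length operator cannot be used there. A minimal sketch, with the made up
name \type {count}:

\starttyping
local function count(t)
    local n = 0
    for _ in next, t do      -- every key is visited exactly once
        n = n + 1
    end
    return n
end

print(count { a=1, b=2, c=3 })   -- 3
print(count { })                 -- 0
\stoptyping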
751
752\stopsection
753
754\startsection[title=Interfacing]
755
756We have already seen that you can embed \LUA\ code using commands like:
757
758\starttyping
759\startluacode
760    print("this works")
761\stopluacode
762\stoptyping
763
764This command should not be confused with:
765
766\starttyping
767\startlua
768    print("this works")
769\stoplua
770\stoptyping
771
772The first variant has its own catcode regime which means that tokens between the start
773and stop command are treated as \LUA\ tokens, with the exception of \TEX\ commands. The
774second variant operates under the regular \TEX\ catcode regime.
775
776Their short variants are \type {\ctxluacode} and \type {\ctxlua} as in:
777
778\starttyping
779\ctxluacode{print("this works")}
780\ctxlua{print("this works")}
781\stoptyping
782
783In practice you will probably use \type {\startluacode} when using or defining % \stopluacode
784a blob of \LUA\ and \type {\ctxlua} for inline code. Keep in mind that the
785longer versions need more initialization and have more overhead.
786
787There are some more commands. For instance \type {\ctxcommand} can be used as
788an efficient way to access functions in the \type {commands} namespace. The
789following two calls are equivalent:
790
791\starttyping
792\ctxlua    {commands.thisorthat("...")}
793\ctxcommand         {thisorthat("...")}
794\stoptyping
795
796There are a few shortcuts to the \type {context} namespace. Their use can best be
797seen from their meaning:
798
799\starttyping
800\cldprocessfile#1{\directlua{context.runfile("#1")}}
801\cldloadfile   #1{\directlua{context.loadfile("#1")}}
802\cldcontext    #1{\directlua{context(#1)}}
803\cldcommand    #1{\directlua{context.#1}}
804\stoptyping
805
806The \type {\directlua{}} command can also be implemented using the token parser
807and \LUA\ itself. A variant is therefore \type {\luascript{}} which can be
considered an alias but with slightly different error reporting. A variant on this
809is the \type {\luathread {name} {code}} command. Here is an example of their
810usage:
811
812\startbuffer
813\luascript        {        context("foo 1:") context(i) } \par
814\luathread {test} { i = 10 context("bar 1:") context(i) } \par
815\luathread {test} {        context("bar 2:") context(i) } \par
816\luathread {test} {} % resets
817\luathread {test} {        context("bar 3:") context(i) } \par
818\luascript        {        context("foo 2:") context(i) } \par
819\stopbuffer
820
821\typebuffer
822
823These commands result in:
824
825\startpacked \getbuffer \stoppacked
826
827% \testfeatureonce{100000}{\directlua        {local a = 10 local a = 10 local a = 10}} % 0.53s
828% \testfeatureonce{100000}{\luascript        {local a = 10 local a = 10 local a = 10}} % 0.62s
829% \testfeatureonce{100000}{\luathread {test} {local a = 10 local a = 10 local a = 10}} % 0.79s
830
The variable \type {i} is local to the thread (which is not really a thread in
\LUA\ but more a named piece of code that provides an environment which is shared
over the calls with the same name). You will probably never need these.
834
835Each time a call out to \LUA\ happens the argument eventually gets parsed, converted
836into tokens, then back into a string, compiled to bytecode and executed. The next
837example code shows a mechanism that avoids this:
838
839\starttyping
840\startctxfunction MyFunctionA
841    context(" A1 ")
842\stopctxfunction
843
844\startctxfunctiondefinition MyFunctionB
845    context(" B2 ")
846\stopctxfunctiondefinition
847\stoptyping
848
849The first command associates a name with some \LUA\ code and that code can be
850executed using:
851
852\starttyping
853\ctxfunction{MyFunctionA}
854\stoptyping
855
856The second definition creates a command, so there we do:
857
858\starttyping
859\MyFunctionB
860\stoptyping
861
862There are some more helpers but for use in document sources they make less sense. You
863can always browse the source code for examples.
864
865\stopsection
866
867\stopchapter
868
869\stopcomponent
870