cld-luafunctions.tex /size: 75 Kb    last modification: 2021-10-28 13:50
1% language=us runpath=texruns:manuals/cld
2
3% table.unnest  : only used in special cases
4% table.derive  : set metatable if unset
5% table.compact : remove empty subtables
6
7\environment cld-environment
8
9\startcomponent cld-luafunctions
10
11\startchapter[title=Lua Functions]
12
13\startsection[title={Introduction}]
14
15When you run \CONTEXT\ you have some libraries preloaded. If you look into the
16\LUA\ files you will find more than is discussed here, but keep in mind that what
17is not documented might be gone or done differently one day. Some extensions live
18in the same namespace as those provided by stock \LUA\ and \LUATEX, others have
19their own. There are many more functions and the more obscure (or never used)
20ones will go away.
21
22The \LUA\ code in \CONTEXT\ is organized in quite a few modules. Those with names
23like \type {l-*.lua} are rather generic and are automatically available when you
24use \type {mtxrun} to run a \LUA\ file. These are discussed in this chapter. A
25few more modules have generic properties, like some in the categories \type
26{util-*.lua}, \type {trac-*.lua}, \type {luat-*.lua}, \type {data-*.lua} and
27\type {lxml-*.lua}. They contain more specialized functions and are discussed
28elsewhere.
29
30Before we move on to the real code, let's introduce a handy helper:
31
32\starttyping
33inspect(somevar)
34\stoptyping
35
36Whenever you feel the need to see what value a variable has you can insert this
37function to get some insight. It knows how to deal with several data types.
38
39\stopsection
40
41\startsection[title={Tables}]
42
43\startsummary[title={[lua] concat}]
44
45These functions come with \LUA\ itself and are discussed in detail in the \LUA\
46reference manual so we stick to some examples. The \type {concat} function
47stitches table entries in an indexed table into one string, with an optional
48separator in between. It can also handle a slice of the table:
49
50\starttyping
51local str = table.concat(t)
52local str = table.concat(t,separator)
53local str = table.concat(t,separator,first)
54local str = table.concat(t,separator,first,last)
55\stoptyping
56
57Only strings and numbers can be concatenated.
58
59\ShowLuaExampleThree {table} {concat} {{"a","b","c","d","e"}}
60\ShowLuaExampleThree {table} {concat} {{"a","b","c","d","e"},"+"}
61\ShowLuaExampleThree {table} {concat} {{"a","b","c","d","e"},"+",2,3}
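
As a small sketch in plain \LUA\ (the table \type {parts} is made up for this
illustration), the following shows the optional separator and slice arguments.
Numbers are converted on the fly:

\starttyping
local parts = { "a", "b", 3, "d", "e" }

local all   = table.concat(parts)            -- "ab3de"
local plus  = table.concat(parts, "+")       -- "a+b+3+d+e"
local tail  = table.concat(parts, "+", 2)    -- "b+3+d+e"
local slice = table.concat(parts, "+", 2, 3) -- "b+3"

print(all, plus, tail, slice)
\stoptyping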
62
63\stopsummary
64
65\startsummary[title={[lua] insert remove}]
66
67You can use \type {insert} and \type {remove} for adding or removing entries in
68an indexed table.
69
70\starttyping
71table.insert(t,position,value)
72value = table.remove(t,position)
73\stoptyping
74
75The position is optional and defaults to the last entry in the table. For
76instance a stack is built this way:
77
78\starttyping
79table.insert(stack,"top")
80local top = table.remove(stack)
81\stoptyping
82
83Beware, the \type {insert} function returns nothing. You can provide an
84additional position:
85
86\starttyping
87table.insert(list,2,"injected in slot 2")
88local thiswastwo = table.remove(list,2)
89\stoptyping
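
The argument order is easy to get wrong, so here is a minimal sketch in plain
\LUA\ with a made|-|up \type {list}. When a position is given it comes before
the value, and removing an entry shifts the remaining ones down:

\starttyping
local list = { "one", "three" }

table.insert(list, "four")          -- append: one three four
table.insert(list, 2, "two")        -- position first, then the value

local last  = table.remove(list)    -- "four", taken from the end
local first = table.remove(list, 1) -- "one", the rest shifts down

print(last)                         -- four
print(first)                        -- one
print(table.concat(list, " "))      -- two three
\stoptyping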
90
91\stopsummary
92
93\startsummary[title={[lua] unpack}]
94
95You can access entries in an indexed table as follows:
96
97\starttyping
98local a, b, c = t[1], t[2], t[3]
99\stoptyping
100
101but this does the same:
102
103\starttyping
104local a, b, c = table.unpack(t)
105\stoptyping
106
107This is less efficient but there are situations where \type {unpack}
108comes in handy.
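
One such situation is passing the entries of a table as separate arguments to a
function. The sketch below uses a made|-|up function \type {sum}. The fallback
to the global \type {unpack} is only there for \LUA\ 5.1:

\starttyping
local unpack = table.unpack or unpack -- Lua 5.1 only has the global

local function sum(a, b, c)
    return a + b + c
end

local args  = { 1, 2, 3 }
local total = sum(unpack(args))             -- same as sum(1, 2, 3)
local count = select("#", unpack(args, 2))  -- entries from index 2 on

print(total) -- 6
print(count) -- 2
\stoptyping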
109
110\stopsummary
111
112\startsummary[title={[lua] sort}]
113
114Sorting is done with \type {sort}, a function that does not return a value but
115operates on the given table.
116
117\starttyping
118table.sort(t)
119table.sort(t,comparefunction)
120\stoptyping
121
122The compare function has to return a consistent equivalent of \type {true} or
123\type {false}. For sorting more complex data structures there is a specialized
124sort module available.
125
126\ShowLuaExampleFour {table} {sort} {{"a","b","c"}} {}
127\ShowLuaExampleFour {table} {sort} {{"a","b","c"}} {,function(x,y) return x > y end}
128\ShowLuaExampleFour {table} {sort} {{"a","b","c"}} {,function(x,y) return x < y end}
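
A compare function becomes more interesting when the entries are tables
themselves. The following sketch sorts some made|-|up \type {entries} on two
fields. The tie breaker is what keeps the comparison consistent:

\starttyping
local entries = {
    { name = "tex", year = 1978 },
    { name = "pdf", year = 1993 },
    { name = "lua", year = 1993 },
}

table.sort(entries, function(a, b)
    if a.year ~= b.year then
        return a.year > b.year -- most recent first
    end
    return a.name < b.name     -- same year: alphabetical
end)

for i = 1, #entries do
    print(entries[i].year, entries[i].name) -- lua, then pdf, then tex
end
\stoptyping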
129
130\stopsummary
131
132\startsummary[title={sorted}]
133
134The built|-|in \type {sort} function does not return a value but sometimes it can be
135handy if the (sorted) table is returned. This is why we have:
136
137\starttyping
138local a = table.sorted(b)
139\stoptyping
140
141\stopsummary
142
143% table.strip
144
145\startsummary[title={keys sortedkeys sortedhashkeys sortedhash}]
146
147The \type {keys} function returns an indexed list of keys. The order is undefined
148as it depends on how the table was constructed. A sorted list is provided by
149\type {sortedkeys}. This function is rather liberal with respect to the keys. If
150the keys are strings you can use the faster alternative \type {sortedhashkeys}.
151
152\starttyping
153local s = table.keys (t)
154local s = table.sortedkeys (t)
155local s = table.sortedhashkeys (t)
156\stoptyping
157
158Because a sorted list is often processed there is also an iterator:
159
160\starttyping
161for key, value in table.sortedhash(t) do
162    print(key,value)
163end
164\stoptyping
165
166There is also a synonym \type {sortedpairs} which sometimes looks more natural
167when used alongside the \type {pairs} and \type {ipairs} iterators.
168
169\ShowLuaExampleTwo {table} {keys}           {{ [1] = 2, c = 3, [true] = 1 }}
170\ShowLuaExampleTwo {table} {sortedkeys}     {{ [1] = 2, c = 3, [true] = 1 }}
171\ShowLuaExampleTwo {table} {sortedhashkeys} {{ a = 2, c = 3, b = 1 }}
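
To give an idea of what happens behind the scenes, here is a sketch in plain
\LUA\ for the simple case where all keys are strings. The helper \type
{sortedstringkeys} is a hypothetical stand|-|in written for this illustration,
not the \CONTEXT\ implementation:

\starttyping
-- hypothetical helper: collect the keys, then sort them
local function sortedstringkeys(t)
    local keys = { }
    for key in pairs(t) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    return keys
end

local t = { b = 1, c = 3, a = 2 }

for _, key in ipairs(sortedstringkeys(t)) do
    print(key, t[key]) -- a 2, then b 1, then c 3
end
\stoptyping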
172
173\stopsummary
174
175\startsummary[title={serialize print tohandle tofile}]
176
177The \type {serialize} function converts a table into a verbose representation.
178The \type {print} function does the same but prints the result to the console
179which is handy for tracing. The \type {tofile} function writes the table to a
180file, using reasonable chunks so that less memory is used. The fourth variant
181\type {tohandle} takes a handle so that you can do whatever you like with the
182result.
183
184\starttyping
185table.serialize (root, name, reduce, noquotes, hexify)
186table.print (root, name, reduce, noquotes, hexify)
187table.tofile (filename, root, name, reduce, noquotes, hexify)
188table.tohandle (handle, root, name, reduce, noquotes, hexify)
189\stoptyping
190
191The serialization can be controlled in several ways. Often only the first two
192options make sense:
193
194\ShowLuaExampleOne {table} {serialize} {{ a = 2 }}
195\ShowLuaExampleOne {table} {serialize} {{ a = 2 }, "name"}
196\ShowLuaExampleOne {table} {serialize} {{ a = 2 }, true}
197\ShowLuaExampleOne {table} {serialize} {{ a = 2 }, false}
198\ShowLuaExampleOne {table} {serialize} {{ a = 2 }, "return"}
199\ShowLuaExampleOne {table} {serialize} {{ a = 2 }, 12}
200
201\ShowLuaExampleOne {table} {serialize} {{ a = 2, [3] = "b", [true] = "6" }, nil, true}
202\ShowLuaExampleOne {table} {serialize} {{ a = 2, [3] = "b", [true] = "6" }, nil, true, true}
203\ShowLuaExampleOne {table} {serialize} {{ a = 2, [3] = "b", [true] = "6" }, nil, true, true, true}
204
205In \CONTEXT\ there is also a \type {tocontext} function that typesets the table
206verbose. This is handy for manuals and tracing.
207
208\stopsummary
209
210\startsummary[title={identical are_equal}]
211
212These two functions compare two tables that have a similar structure. The \type
213{identical} variant operates on a hash while \type {are_equal} assumes an indexed
214table.
215
216\starttyping
217local b = table.identical (one, two)
218local b = table.are_equal (one, two)
219\stoptyping
220
221\ShowLuaExampleThree {table} {identical} {{ a = { x = 2 } }, { a = { x = 3 } }}
222\ShowLuaExampleThree {table} {identical} {{ a = { x = 2 } }, { a = { x = 2 } }}
223
224\ShowLuaExampleThree {table} {are_equal} {{ a = { x = 2 } }, { a = { x = 3 } }}
225\ShowLuaExampleThree {table} {are_equal} {{ a = { x = 2 } }, { a = { x = 2 } }}
226
227\ShowLuaExampleThree {table} {identical} {{ "one", "two" }, { "one", "two" }}
228\ShowLuaExampleThree {table} {identical} {{ "one", "two" }, { "two", "one" }}
229
230\ShowLuaExampleThree {table} {are_equal} {{ "one", "two" }, { "one", "two" }}
231\ShowLuaExampleThree {table} {are_equal} {{ "one", "two" }, { "two", "one" }}
232
233\stopsummary
234
235\startsummary[title={tohash fromhash swapped swaphash reversed reverse mirrored}]
236
237We use \type {tohash} quite a lot in \CONTEXT. It converts a list into a hash so
238that we can easily check if an entry (a string) is in a given set. The \type {fromhash}
239function does the opposite: it creates a list of keys from a hashed table where
240each value that is not \type {false} or \type {nil} is present.
241
242\starttyping
243local hashed  = table.tohash  (indexed)
244local indexed = table.fromhash(hashed)
245\stoptyping
246
247The function \type {swapped} turns keys into values and vice versa while \type
248{reversed} and \type {reverse} reverse the values in an indexed table. The last
249one reverses the table itself (in|-|place).
250
251\starttyping
252local swapped  = table.swapped  (indexedtable)
253local reversed = table.reversed (indexedtable)
254local reverse  = table.reverse  (indexedtable)
255local mirrored = table.mirrored (hashedtable)
256\stoptyping
257
258\ShowLuaExampleTwo {table} {tohash}   {{ "a", "b", "c" }}
259\ShowLuaExampleTwo {table} {fromhash} {{ a = true, b = false, c = true }}
260\ShowLuaExampleTwo {table} {swapped}  {{ "a", "b", "c" }}
261\ShowLuaExampleTwo {table} {reversed} {{ "a", "b", "c" }}
262\ShowLuaExampleTwo {table} {reverse}  {{ 1, 2, 3, 4 }}
263\ShowLuaExampleTwo {table} {mirrored} {{ a = "x", b = "y", c = "z" }}
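
The set idea behind \type {tohash} can be sketched in plain \LUA\ as follows.
The helper \type {toset} is a hypothetical stand|-|in, and we assume here that
each entry simply maps to \type {true}:

\starttyping
-- hypothetical helper: every list entry becomes a key
local function toset(list)
    local set = { }
    for i = 1, #list do
        set[list[i]] = true
    end
    return set
end

local valid = toset { "left", "right", "middle" }

print(valid["left"]) -- true
print(valid["top"])  -- nil
\stoptyping

A lookup in such a set takes one table access, while a list has to be scanned
entry by entry.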
264
265\stopsummary
266
267\startsummary[title={append prepend}]
268
269These two functions operate on a pair of indexed tables. The first table gets
270appended or prepended by the second. The first table is returned as well.
271
272\starttyping
273table.append (one, two)
274table.prepend(one, two)
275\stoptyping
276
277The functions are similar to loops using \type {insert}.
278
279\ShowLuaExampleTwo {table} {append}  {{ "a", "b", "c" }, { "d", "e" }}
280\ShowLuaExampleTwo {table} {prepend} {{ "a", "b", "c" }, { "d", "e" }}
281
282\stopsummary
283
284\startsummary[title={merge merged imerge imerged}]
285
286You can merge multiple hashes with \type {merge} and indexed tables with \type
287{imerge}. The first table is the target and is returned.
288
289\starttyping
290table.merge   (one, two, ...)
291table.imerge  (one, two, ...)
292\stoptyping
293
294The variants ending with a \type {d} merge the given list of tables and return
295the result leaving the first argument untouched.
296
297\starttyping
298local merged = table.merged  (one, two, ...)
299local merged = table.imerged (one, two, ...)
300\stoptyping
301
302\ShowLuaExampleTwo {table} {merge}  {{ a = 1, b = 2, c = 3 }, { d = 1 }, { a = 0 }}
303\ShowLuaExampleTwo {table} {imerge} {{ "a", "b", "c" }, { "d", "e" }, { "f", "g" }}
304
305% \ShowLuaExampleTwo {table} {merged}   {{ a = 1, b = 2, c = 3 }, { d = 1 }, { a = 0 }}
306% \ShowLuaExampleTwo {table} {imerged}  {{ "a", "b", "c" }, { "d", "e" }, { "f", "g" }}
307
308\stopsummary
309
310\startsummary[title={copy fastcopy}]
311
312When copying a table we need to make a real and deep copy. The \type {copy}
313function is an adapted version from the \LUA\ wiki. The \type {fastcopy} is faster
314because it does not check for circular references and does not share tables when
315possible. In practice using the fast variant is okay.
316
317\starttyping
318local copy = table.copy    (t)
319local copy = table.fastcopy(t)
320\stoptyping
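
The reason for making a real copy is that assigning a table only creates a
second reference to it. The sketch below is plain \LUA. The helper \type
{deepcopy} is hypothetical and, like the fast variant described above, it does
not check for circular references. It ignores metatables as well:

\starttyping
-- hypothetical helper: recursive copy without cycle detection
local function deepcopy(t)
    local copy = { }
    for key, value in pairs(t) do
        if type(value) == "table" then
            copy[key] = deepcopy(value)
        else
            copy[key] = value
        end
    end
    return copy
end

local original = { name = "demo", nested = { 1, 2, 3 } }
local alias    = original           -- same table, second name
local copy     = deepcopy(original) -- independent table

copy.nested[1] = "changed"

print(original.nested[1]) -- 1
print(alias == original)  -- true
print(copy == original)   -- false
\stoptyping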
321
322\stopsummary
323
324\startsummary[title={flattened}]
325
326A nested table can be unnested using \type {flattened}. Normally you will only
327use this function if the content is somewhat predictable. Often using one of the
328merge functions does a similar job.
329
330\starttyping
331local flattened = table.flattened(t)
332\stoptyping
333
334\ShowLuaExampleTwo {table} {flattened} {{ a = 1, b = 2, { c = 3 }, d = 4}}
335\ShowLuaExampleTwo {table} {flattened} {{ 1, 2, { 3, { 4 } }, 5}}
336\ShowLuaExampleTwo {table} {flattened} {{ 1, 2, { 3, { 4 } }, 5}, 1}
337\ShowLuaExampleTwo {table} {flattened} {{ a = 1, b = 2, { c = 3 }, d = 4}}
338\ShowLuaExampleTwo {table} {flattened} {{ 1, 2, { 3, { c = 4 } }, 5}}
339\ShowLuaExampleTwo {table} {flattened} {{ 1, 2, { 3, { c = 4 } }, 5}, 1}
340
341\stopsummary
342
343\startsummary[title={loweredkeys}]
344
345The name says it all: this function returns a new table with the keys being lower
346case. This is handy in cases where the keys have a chance to be inconsistent, as
347can be the case when users input keys and values in less controlled ways.
348
349\starttyping
350local normalized = table.loweredkeys { a = "a", A = "b", b = "c" }
351\stoptyping
352
353\ShowLuaExampleTwo {table} {loweredkeys} {{ a = 1, b = 2, C = 3}}
354
355\stopsummary
356
357\startsummary[title={contains}]
358
359This function works with indexed tables. Watch out, when you look for a match,
360the number \type {1} is not the same as string \type {"1"}. The function returns
361the index or \type {false}.
362
363\starttyping
364if table.contains(t, 5 ) then ... else ... end
365if table.contains(t,"5") then ... else ... end
366\stoptyping
367
368\ShowLuaExampleThree {table} {contains} {{ "a", 2, true, "1"}, 1}
369\ShowLuaExampleThree {table} {contains} {{ "a", 2, true, "1"}, "1"}
370
371\stopsummary
372
373\startsummary[title={unique}]
374
375When a table can contain duplicate entries you can get rid of them by using the
376\type {unique} helper:
377
378\starttyping
379local t = table.unique { 1, 2, 3, 4, 3, 2, 5, 6 }
380\stoptyping
381
382\ShowLuaExampleTwo {table} {unique} { { "a", "b", "c", "a", "d" } }
383
384\stopsummary
385
386\startsummary[title={count}]
387
388The name speaks for itself: this function counts the number of entries in the
389given table. For an indexed table \type {#t} is faster.
390
391\starttyping
392local n = table.count(t)
393\stoptyping
394
395\ShowLuaExampleThree {table} {count} {{ 1, 2, [4] = 4, a = "a" }}
396
397\stopsummary
398
399\startsummary[title={sequenced}]
400
401Normally, when you trace a table, printing the serialized version is quite
402convenient. However, when it concerns a simple table, a more compact variant is:
403
404\starttyping
405print(table.sequenced(t, separator))
406\stoptyping
407
408% beware: by default sequences has | as separator
409
410\ShowLuaExampleThree {table} {sequenced} {{ 1, 2, 3, 4}}
411\ShowLuaExampleThree {table} {sequenced} {{ 1, 2, [4] = 4, a = "a" }, ", "}
412
413\stopsummary
414
415\stopsection
416
417\startsection[title=Math]
418
419In addition to the built|-|in math functions we provide: \type {round}, \type {odd},
420\type {even}, \type {div}, \type {mod}, \type {sind}, \type {cosd} and
421\type {tand}.
422
423At the \TEX\ end we have a helper \type {luaexpr} that you can use to do
424calculations:
425
426\startbuffer
427  \luaexpr{1 + 2.3 * 4.5 + math.pi} = \cldcontext{1 + 2.3 * 4.5 + math.pi}
428\stopbuffer
429
430\typebuffer
431
432Both calls return the same result, but the first one is normally faster than the
433\type {context} command which has quite some overhead.
434
435\blank \getbuffer \blank
436
437The \type {\luaexpr} command can also better deal with for instance conditions,
438where it returns \type {true} or \type {false}, while \type {\cldcontext} would
439interpret the boolean value as a special signal.
440
441\stopsection
442
443\startsection[title=Booleans]
444
445\startsummary[title={tonumber}]
446
447This function returns the number one or zero. You will seldom need this function.
448
449\starttyping
450local state = boolean.tonumber(str)
451\stoptyping
452
453\ShowLuaExampleThree {boolean} {tonumber} {true}
454
455\stopsummary
456
457\startsummary[title={toboolean}]
458
459When dealing with configuration files or tables a bit of flexibility in setting a
460state makes sense, if only because in some cases it's better to say \type {yes}
461than \type {true}.
462
463\starttyping
464local b = toboolean(str)
465local b = toboolean(str,tolerant)
466\stoptyping
467
468When the second argument is true, the strings \type {true}, \type {yes}, \type
469{on}, \type {1}, \type {t} and the number \type {1} all turn into \type {true}.
470Otherwise only \type {true} is honoured. This function is also defined in the
471global namespace.
472
473\ShowLuaExampleThree {string} {toboolean} {"true"}
474\ShowLuaExampleThree {string} {toboolean} {"yes"}
475\ShowLuaExampleThree {string} {toboolean} {"yes",true}
476
477\stopsummary
478
479\startsummary[title={is_boolean}]
480
481This function is somewhat similar to the previous one. It interprets the strings
482\type {true}, \type {yes}, \type {on} and \type {t} as \type {true} and
483\type{false}, \type {no}, \type {off} and \type {f} as \type {false}. Otherwise
484\type {nil} is returned, unless a default value is given, in which case that is
485returned.
486
487\starttyping
488if is_boolean(str)         then ... end
489if is_boolean(str,default) then ... end
490\stoptyping
491
492\ShowLuaExampleThree {string} {is_boolean} {"true"}
493\ShowLuaExampleThree {string} {is_boolean} {"off"}
494\ShowLuaExampleThree {string} {is_boolean} {"crap",true}
495
496\stopsummary
497
498\stopsection
499
500\startsection[title=Strings]
501
502\LUA\ strings are simply sequences of bytes. Of course in some places special
503treatment takes place. For instance \type {\n} expands to one or more characters
504representing a newline, depending on the operating system, but normally, as long
505as you manipulate strings in the perspective of \LUATEX, you don't need to worry
506about such issues too much. As \LUATEX\ is a \UTF-8 engine, strings normally are
507in that encoding but again, it does not matter much as \LUA\ is quite agnostic
508about the content of strings: it does not care whether three bytes represent
509one \UNICODE\ character or not. This means that the functions discussed here,
510as well as libraries like \type {lpeg}, behave as you
511expect.
512
513Versions later than 0.75 are likely to have some basic \UNICODE\ support on board
514but we can easily adapt to that. At least till \LUATEX\ version 0.75 we provided
515the \type {slunicode} library but users cannot assume that it will be present
516forever. If you want to mess around with \UTF\ strings, use the \type {utf} library
517instead as that is the one we provide in \MKIV. It presents the stable interface to
518whatever \LUA\ itself provides and|/|or what \LUATEX\ offers and|/|or what
519is there because \MKIV\ implements it.
520
521\startsummary[title={[lua] byte char}]
522
523As long as we're dealing with \ASCII\ characters we can use these two functions to
524go from numbers to characters and vice versa.
525
526\ShowLuaExampleSeven {string} {byte} {"luatex"}
527\ShowLuaExampleSeven {string} {byte} {"luatex",1,3}
528\ShowLuaExampleSeven {string} {byte} {"luatex",-3,-1}
529
530\ShowLuaExampleSeven {string} {char} {65}
531\ShowLuaExampleSeven {string} {char} {65,66,67}
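
As a small sketch in plain \LUA, the following goes from characters to numbers
and back, using the fact that in \ASCII\ the lowercase and uppercase letters are
32 positions apart. The variable names are made up for this illustration:

\starttyping
local bytes = { string.byte("tex", 1, -1) } -- 116, 101, 120
local upper = { }

for i = 1, #bytes do
    upper[i] = string.char(bytes[i] - 32)   -- ASCII lower to upper
end

print(table.concat(bytes, " ")) -- 116 101 120
print(table.concat(upper))      -- TEX
\stoptyping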
532
533\stopsummary
534
535\startsummary[title={[lua] sub}]
536
537You cannot directly access a character in a string but you can take any slice you
538want using \type {sub}. You need to provide a start position and negative values
539will count backwards from the end.
540
541\starttyping
542local slice = string.sub(str,first,last)
543\stoptyping
544
545\ShowLuaExampleThree {string} {sub} {"abcdef",2}
546\ShowLuaExampleThree {string} {sub} {"abcdef",2,3}
547\ShowLuaExampleThree {string} {sub} {"abcdef",-3,-2}
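
Getting at a single character is therefore a matter of taking a slice of length
one. A minimal sketch in plain \LUA, with a made|-|up string:

\starttyping
local str = "abcdef"

local first  = string.sub(str, 1, 1) -- "a"
local last   = string.sub(str, -1)   -- "f"
local middle = str:sub(2, -2)        -- "bcde", method notation

print(first, last, middle)
\stoptyping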
548
549\stopsummary
550
551\startsummary[title={[lua] gsub}]
552
553There are two ways of analyzing the content of a string. The more modern and
554flexible approach is to use \type {lpeg}. The other one uses some functions in
555the \type {string} namespace that accept so called patterns for matching. While
556\type {lpeg} is more powerful than regular expressions, the pattern matching is
557less powerful but sometimes faster and also easier to specify. In many cases it
558can do the job quite well.
559
560\starttyping
561local new, count = string.gsub(old,pattern,replacement)
562\stoptyping
563
564The replacement can be a function. Often you don't want the number
565of matches, and the way to avoid this is either to store the result
566in a variable:
567
568\starttyping
569local new = string.gsub(old,"lua","LUA")
570print(new)
571\stoptyping
572
573or to use parentheses to signal the interpreter that only one value
574is to be returned:
575
576\starttyping
577print((string.gsub(old,"lua","LUA")))
578\stoptyping
579
580Patterns can be more complex so you'd better read the \LUA\ manual if you want to
581know more about them.
582
583\ShowLuaExampleThree {string} {gsub} {"abcdef","b","B"}
584\ShowLuaExampleThree {string} {gsub} {"abcdef","[bc]",string.upper}
585
586An optional fourth argument specifies how often the replacement has to happen:
587
588\ShowLuaExampleThree {string} {gsub} {"textextextex","tex","abc"}
589\ShowLuaExampleThree {string} {gsub} {"textextextex","tex","abc",1}
590\ShowLuaExampleThree {string} {gsub} {"textextextex","tex","abc",2}
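
The variants can be combined in one sketch in plain \LUA. The string \type
{old} is made up. The function that is passed as replacement receives the match
and returns what should replace it:

\starttyping
local old = "lua and luatex"

local new, count = string.gsub(old, "lua", "LUA")
print(new, count)  -- LUA and LUAtex, with a count of 2

local once = string.gsub(old, "lua", "LUA", 1)
print(once)        -- LUA and luatex

-- only words longer than three characters are uppercased
local long = string.gsub(old, "%a+", function(word)
    return #word > 3 and string.upper(word) or word
end)
print(long)        -- lua and LUATEX
\stoptyping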
591
592\stopsummary
593
594\startsummary[title={[lua] find}]
595
596The \type {find} function returns the first and last position of the match:
597
598\starttyping
599local first, last = string.find(str,pattern)
600\stoptyping
601
602If you're only interested if there is a match at all, it's enough to know that
603there is a first position. No match returns \type {nil}. So,
604
605\starttyping
606if string.find("luatex","tex") then ... end
607\stoptyping
608
609works out okay. You can pass an extra argument to \type {find} that indicates the
610start position. So you can use this function to loop over all matches: just start
611again at the end of the last match.
612
613A fourth optional argument is a boolean that signals not to interpret the pattern
614but use it as|-|is.
615
616\ShowLuaExampleThree {string} {find} {"abc.def","c\letterpercent.d",1,false}
617\ShowLuaExampleThree {string} {find} {"abc.def","c\letterpercent.d",1,true}
618\ShowLuaExampleThree {string} {find} {"abc\letterpercent.def","c\letterpercent.d",1,false}
619\ShowLuaExampleThree {string} {find} {"abc\letterpercent.def","c\letterpercent.d",1,true}
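
Looping over all matches, as described above, can be sketched in plain \LUA\ as
follows. The string and the table \type {positions} are made up for this
illustration, and the fourth argument asks for a plain search:

\starttyping
local str       = "tex, luatex and context"
local positions = { }
local start     = 1

while true do
    local first, last = string.find(str, "tex", start, true)
    if not first then
        break               -- no more matches
    end
    positions[#positions + 1] = first
    start = last + 1        -- continue after this match
end

print(table.concat(positions, " ")) -- 1 9 20
\stoptyping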
620
621\stopsummary
622
623\startsummary[title={[lua] match gmatch}]
624
625With \type {match} you can split off bits and pieces of a string. The parentheses
626indicate the captures.
627
628\starttyping
629local a, b, c, ... = string.match(str,pattern)
630\stoptyping
631
632The \type {gmatch} function is used to loop over a string, for instance the
633following code prints the elements in a comma separated list, ignoring spaces
634after commas.
635
636\starttyping
637for s in string.gmatch(str,"([^,%s]+)") do
638  print(s)
639end
640\stoptyping
641
642A more detailed description can be found in the \LUA\ reference manual, so we
643only mention the special directives. Characters are grouped in classes:
644
645\starttabulate[|lT|l|]
646\HL
647\NC \letterpercent a \NC letters                  \NC \NR
648\NC \letterpercent l \NC lowercase letters        \NC \NR
649\NC \letterpercent u \NC uppercase letters        \NC \NR
650\NC \letterpercent d \NC digits                   \NC \NR
651\NC \letterpercent w \NC letters and digits       \NC \NR
652\NC \letterpercent c \NC control characters       \NC \NR
653\NC \letterpercent p \NC punctuation              \NC \NR
654\NC \letterpercent x \NC hexadecimal characters   \NC \NR
655\NC \letterpercent s \NC space related characters \NC \NR
656\HL
657\stoptabulate
658
659You can create sets too:
660
661\starttabulate[|lT|l|]
662\HL
663\NC [\letterpercent l\letterpercent d]  \NC lowercase letters and digits \NC \NR
664\NC [^\letterpercent d\letterpercent p] \NC all characters except digits and punctuation \NC \NR
665\NC [p-z]                               \NC all characters in the range \type {p} up to \type {z} \NC \NR
666\NC [pqr]                               \NC all characters \type {p}, \type {q} and \type {r} \NC \NR
667\HL
668\stoptabulate
669
670There are some characters with special meanings:
671
672\starttabulate[|lT|l|]
673\HL
674\NC \letterhat       \NC the beginning of a string \NC \NR
675\NC \letterdollar    \NC end of a string \NC \NR
676\NC .                \NC any character \NC \NR
677\NC *                \NC zero or more of the preceding specifier, greedy \NC \NR
678\NC -                \NC zero or more of the preceding specifier, least possible \NC \NR
679\NC +                \NC one or more of the preceding specifier \NC \NR
680\NC ?                \NC zero or one of the preceding specifier \NC \NR
681\NC ( )              \NC encapsulate capture \NC \NR
682\NC \letterpercent b \NC capture all between the following two characters \NC \NR
683\HL
684\stoptabulate
685
686You can use whatever you like to be matched:
687
688\starttabulate[|lT|l|]
689\HL
690\NC pqr                           \NC the sequence \type {pqr} \NC \NR
691\NC my name is (\letterpercent w+) \NC the word following \type {my name is} \NC \NR
692\HL
693\stoptabulate
694
695If you want to specify such a token as it is, then you can precede it with a
696percent sign, so to get a percent, you need two in a row.
697
698\ShowLuaExampleThree {string} {match} {"before:after","^(.-):"}
699\ShowLuaExampleThree {string} {match} {"before:after","^([^:])"}
700\ShowLuaExampleThree {string} {match} {"before:after","bef(.*)ter"}
701\ShowLuaExampleThree {string} {match} {"abcdef","[b-e]+"}
702\ShowLuaExampleThree {string} {match} {"abcdef","[b-e]*"}
703\ShowLuaExampleThree {string} {match} {"abcdef","b-e+"}
704\ShowLuaExampleThree {string} {match} {"abcdef","b-e*"}
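
A sketch in plain \LUA\ that puts classes, sets, repetition and captures
together. The strings are made up. Keep in mind that the repetition has to sit
inside the capture:

\starttyping
local str  = "aap, noot,mies"
local list = { }

-- one or more characters that are neither a comma nor a space
for s in string.gmatch(str, "([^,%s]+)") do
    list[#list + 1] = s
end

print(#list, table.concat(list, "|")) -- 3 and aap|noot|mies

-- two captures: letters before the equal sign, the rest after it
local key, value = string.match("width = 12pt", "^(%a+)%s*=%s*(.+)$")

print(key, value)                     -- width and 12pt
\stoptyping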
705
706\stopsummary
707
708Such patterns should not be confused with regular expressions, although to some
709extent they can do the same. If you really want to do complex matches, you should
710look into \LPEG.
711
712\startsummary[title={[lua] lower upper}]
713
714These two functions speak for themselves.
715
716\ShowLuaExampleThree {string} {lower} {"LOW"}
717\ShowLuaExampleThree {string} {upper} {"upper"}
718
719\stopsummary
720
721\startsummary[title={[lua] format}]
722
723The \type {format} function takes a template as first argument and one or more
724additional arguments depending on the format. The template is similar to the one
725used in \CCODE\ but it has some extensions.
726
727\starttyping
728local s = string.format(template, ...)
729\stoptyping
730
731The following table gives an overview of the possible format directives. The
732\type {s} is the most probable candidate and can handle numbers as well as strings.
733Watch how the minus sign influences the alignment. \footnote {There can be
734differences between platforms although so far we haven't run into problems. Also,
735\LUA\ 5.2 does a bit more checking on correct arguments and \LUA\ 5.3 is more
736picky on integers.}
737
738\starttabulate[|lB|lT|lT|lT|]
739\HL
740\NC integer     \NC \letterpercent i    \NC 12345  \NC \cldcontext{string.format("\letterpercent i",   12345 )} \NC \NR
741\NC integer     \NC \letterpercent d    \NC 12345  \NC \cldcontext{string.format("\letterpercent d",   12345 )} \NC \NR
742\NC unsigned    \NC \letterpercent u    \NC 12345  \NC \cldcontext{string.format("\letterpercent u",   12345 )} \NC \NR
743\NC character   \NC \letterpercent c    \NC 89     \NC \cldcontext{string.format("\letterpercent c",   89    )} \NC \NR
744\NC hexadecimal \NC \letterpercent x    \NC 123    \NC \cldcontext{string.format("\letterpercent x",   123   )} \NC \NR
745\NC             \NC \letterpercent X    \NC 123    \NC \cldcontext{string.format("\letterpercent X",   123   )} \NC \NR
746\NC octal       \NC \letterpercent o    \NC 12345  \NC \cldcontext{string.format("\letterpercent o",   12345 )} \NC \NR
747\HL
748\NC string      \NC \letterpercent s    \NC abcd   \NC \cldcontext{string.format("\letterpercent s",   "abcd")} \NC \NR
749\NC             \NC \letterpercent -8s  \NC 123    \NC \cldcontext{string.format("\letterpercent -8s", 123   )} \NC \NR
750\NC             \NC \letterpercent 8s   \NC 123    \NC \cldcontext{string.format("\letterpercent 8s",  123   )} \NC \NR
751\HL
752\NC float       \NC \letterpercent 0.2f \NC 12.345 \NC \cldcontext{string.format("\letterpercent 0.2f",12.345)} \NC \NR
753\NC exponential \NC \letterpercent 0.2e \NC 12.345 \NC \cldcontext{string.format("\letterpercent 0.2e",12.345)} \NC \NR
754\NC             \NC \letterpercent 0.2E \NC 12.345 \NC \cldcontext{string.format("\letterpercent 0.2E",12.345)} \NC \NR
755\NC autofloat   \NC \letterpercent 0.2g \NC 12.345 \NC \cldcontext{string.format("\letterpercent 0.2g",12.345)} \NC \NR
756\NC             \NC \letterpercent 0.2G \NC 12.345 \NC \cldcontext{string.format("\letterpercent 0.2G",12.345)} \NC \NR
757\HL
758\stoptabulate
759
760\startasciimode
761\ShowLuaExampleThree {string} {format} {"U+\letterpercent 05X",2010}
762\stopasciimode
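
Width, alignment and zero padding can be combined, as in this sketch in plain
\LUA. The variable names are made up, and the expected results are shown between
quotes so that the spaces are visible:

\starttyping
local padded  = string.format("%5d|%-5d|%05d", 42, 42, 42)
local strings = string.format("%8s|%-8s|", "abc", "abc")
local code    = string.format("U+%05X", 2010)

print(padded)  -- "   42|42   |00042"
print(strings) -- "     abc|abc     |"
print(code)    -- "U+007DA"
\stoptyping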
763
764\stopsummary
765
766\startsummary[title={striplines}]
767
768The \type {striplines} function can strip leading and trailing empty lines,
769collapse or delete intermediate empty lines and strip leading and trailing
770spaces. We will demonstrate this with string \type {str}:
771
772\startluacode
773local str = table.concat( {
774"  ",
775"    aap",
776"  noot mies",
777"  ",
778"    ",
779" wim    zus   jet",
780"teun    vuur gijs",
781"       lam    kees bok weide",
782"    ",
783"does hok duif schapen  ",
784"  ",
785}, "\n")
786
787document.TempString = str
788
789function document.ShowStrippedString(str)
790    str = string.gsub(str," ","\\allowbreak<sp>\\allowbreak ")
791    str = string.gsub(str,"([\010])","\\allowbreak<lf>\\allowbreak ")
792    context.startalign { "flushleft,verytolerant" }
793        context("{\\tttf %s}",str)
794    context.stopalign()
795end
796
797function document.ShowStrippedBuffer(name,str)
798    context.tobuffer(name,str)
799    context.typebuffer( { name }, { numbering = "line" })
800    context.resetbuffer { name }
801end
802
803function document.ShowStrippedCommand(option)
804    context.type( { style = "ttbf" }, [[utilities.strings.striplines(str,"]] .. option .. [[")]])
805end
806
807context.blank { "big" }
808document.ShowStrippedString(str)
809document.ShowStrippedBuffer("dummy",str)
810
811\stopluacode
812
813The different options for stripping are demonstrated below. We use verbose
814descriptions instead of vague boolean flags.
815
816\startluacode
817local str = document.TempString ; document.TempString = nil
818
819for option in table.sortedhash(utilities.strings.striplinepatterns) do
820    local s = utilities.strings.striplines(str,option)
821    context.blank()
822    document.ShowStrippedCommand(option)
823    context.blank { "big,samepage" }
824    document.ShowStrippedString(s)
825    context.blank { "big,samepage" }
826    document.ShowStrippedBuffer(option,s)
827end
828\stopluacode
829
830You can of course mix usage with the normal \type {context} helper commands, for
831instance put them in buffers. Buffers normally will prune leading and trailing
832empty lines anyway.
833
834\starttyping
835context.tobuffer("dummy",utilities.strings.striplines(str))
836context.typebuffer( { "dummy" }, { numbering = "line" })
837\stoptyping
838
839\stopsummary
840
841\startsummary[title={formatters}]
842
843The \type {format} function discussed before is the built|-|in one. As an alternative
844\CONTEXT\ provides an additional formatter that has some extensions. Interestingly,
845that one is often more efficient, although there are cases where the
846speed is comparable. As we run out of keys, some extra ones are a bit
847counterintuitive, like \type {l} for booleans (logical).
848
849\start \setuptype[color=]
850
851\starttabulate[|lB|lT|lT|lT|]
852\HL
853\NC utf character       \NC \letterpercent c            \NC 322           \NC \cldcontext{"\letterpercent c",322} \NC \NR
854\HL
855\NC string              \NC \letterpercent s            \NC foo           \NC \cldcontext{"\letterpercent s","foo"} \NC \NR
856\NC force tostring      \NC \letterpercent S            \NC nil           \NC \cldcontext{"\letterpercent S",nil} \NC \NR
857\NC quoted string       \NC \letterpercent q            \NC foo           \NC \cldcontext{"\letterpercent q","foo"} \NC \NR
858\NC force quoted string \NC \letterpercent Q            \NC nil           \NC \cldcontext{"\letterpercent Q",nil} \NC \NR
859\NC                     \NC \letterpercent N            \NC 0123          \NC \cldcontext{"\letterpercent N","0123"} \NC \NR
860\NC automatic quoted    \NC \letterpercent a            \NC true          \NC \cldcontext{"\letterpercent a",true} \NC \NR\NC \NR
861\NC                     \NC \letterpercent A            \NC true          \NC \cldcontext{"\letterpercent A",true} \NC \NR\NC \NR
862\NC left aligned utf    \NC \letterpercent 30<          \NC xx½xx         \NC \cldcontext{"\letterpercent 30<","xx½xx"} \NC \NR\NC \NR
863\NC right aligned utf   \NC \letterpercent 30>          \NC xx½xx         \NC \cldcontext{"\letterpercent 30>","xx½xx"} \NC \NR\NC \NR
864\HL
865\NC integer             \NC \letterpercent i            \NC 1234          \NC \cldcontext{"\letterpercent i",1234} \NC \NR
866\NC integer             \NC \letterpercent d            \NC 1234          \NC \cldcontext{"\letterpercent d",1234} \NC \NR
867\NC signed number       \NC \letterpercent I            \NC 1234          \NC \cldcontext{"\letterpercent I",1234} \NC \NR
868\NC rounded number      \NC \letterpercent r            \NC 1234.56       \NC \cldcontext{"\letterpercent r",1234.56} \NC \NR
869\NC stripped number     \NC \letterpercent N            \NC 000123        \NC \cldcontext{"\letterpercent N","000123"} \NC \NR
870\NC comma/period float  \NC \letterpercent m            \NC 12.34         \NC \cldcontext{"\letterpercent m",12.34} \NC \NR
871\NC period/comma float  \NC \letterpercent M            \NC 12.34         \NC \cldcontext{"\letterpercent M",12.34} \NC \NR
872\HL
873\NC hexadecimal         \NC \letterpercent x            \NC 1234          \NC \cldcontext{"\letterpercent x",1234} \NC \NR
874\NC                     \NC \letterpercent X            \NC 1234          \NC \cldcontext{"\letterpercent X",1234} \NC \NR
875\NC octal               \NC \letterpercent o            \NC 1234          \NC \cldcontext{"\letterpercent o",1234} \NC \NR
876\HL
877\NC float               \NC \letterpercent 0.2f         \NC 12.345        \NC \cldcontext{"\letterpercent 0.2f",12.345} \NC \NR
878\NC formatted float     \NC \letterpercent 2.3k         \NC 12.3456       \NC \cldcontext{"\letterpercent 2.3f",12.3456} \NC \NR
879\NC checked float       \NC \letterpercent 0.2F         \NC 12.30         \NC \cldcontext{"\letterpercent 0.2F",12.3} \NC \NR
880\NC exponential         \NC \letterpercent .2e          \NC 12.345e120    \NC \cldcontext{"\letterpercent 0.2j",12.345e120} \NC \NR
881\NC                     \NC \letterpercent .2E          \NC 12.345e120    \NC \cldcontext{"\letterpercent 0.2J",12.345e120} \NC \NR
882\NC sparse exp          \NC \letterpercent 0.2j         \NC 12.345e120    \NC \cldcontext{"\letterpercent 0.2j",12.345e120} \NC \NR
883\NC                     \NC \letterpercent 0.2J         \NC 12.345e120    \NC \cldcontext{"\letterpercent 0.2J",12.345e120} \NC \NR
884\NC autofloat           \NC \letterpercent g            \NC 12.345        \NC \cldcontext{"\letterpercent 0.2J",12.345} \NC \NR
885\NC                     \NC \letterpercent G            \NC 12.345        \NC \cldcontext{"\letterpercent 0.2J",12.345} \NC \NR
886\HL
887\NC unicode value 0x    \NC \letterpercent h            \NC ł 1234        \NC \cldcontext{"\letterpercent v \letterpercent v", "ł",1234} \NC \NR
888\NC                     \NC \letterpercent H            \NC ł 1234        \NC \cldcontext{"\letterpercent V \letterpercent V", "ł",1234} \NC \NR
889\NC unicode value U+    \NC \letterpercent u            \NC ł 1234        \NC \cldcontext{"\letterpercent u \letterpercent u", "ł",1234} \NC \NR
890\NC                     \NC \letterpercent U            \NC ł 1234        \NC \cldcontext{"\letterpercent U \letterpercent U", "ł",1234} \NC \NR
891\HL
892\NC points              \NC \letterpercent p            \NC 1234567       \NC \cldcontext{"\letterpercent p",1234567} \NC \NR
893\NC basepoints          \NC \letterpercent b            \NC 1234567       \NC \cldcontext{"\letterpercent b",1234567} \NC \NR
894\HL
895\NC table concat        \NC \letterpercent t            \NC \arg{1,2,3}   \NC \cldcontext{"\letterpercent t",{1,2,3}}  \NC \NR
896\NC                     \NC \letterpercent *t           \NC \arg{1,2,3}   \NC \cldcontext{"\letterpercent *t",{1,2,3}} \NC \NR
897\NC                     \NC \letterpercent \arg{ AND }t \NC \arg{a=1,b=3} \NC \cldcontext{"\letterpercent +{ AND }T",{a=1,b=2}} \NC \NR
898\NC table serialize     \NC \letterpercent T            \NC \arg{1,2,3}   \NC \cldcontext{"\letterpercent *t",{1,2,3}} \NC \NR
899\NC                     \NC \letterpercent T            \NC \arg{a=1,b=3} \NC \let|\relax\cldcontext{"\letterpercent T",{a=1,b=2}} \NC \NR
900\NC                     \NC \letterpercent +T           \NC \arg{a=1,b=3} \NC \cldcontext{"\letterpercent [+T]",{a=1,b=2}} \NC \NR
901\HL
902\NC boolean (logic)     \NC \letterpercent l            \NC "a" == "b"    \NC \cldcontext{"\letterpercent l","a"=="b"} \NC \NR
903\NC                     \NC \letterpercent L            \NC "a" == "b"    \NC \cldcontext{"\letterpercent L","a"=="b"} \NC \NR
904\HL
905\NC whitespace          \NC \letterpercent w            \NC 3             \NC \obeyspaces\vl\cldcontext{"\letterpercent w",3}\vl \NC \NR
906\NC                     \NC \letterpercent 2w           \NC 3             \NC \obeyspaces\vl\cldcontext{"\letterpercent 2w",3}\vl \NC \NR
907\NC                     \NC \letterpercent 4W           \NC               \NC \obeyspaces\vl\cldcontext{"\letterpercent 4W"}\vl \NC \NR
908\HL
909\NC skip                \NC \letterpercent 2z            \NC 1,2,3,4      \NC \obeyspaces\vl\cldcontext{"\letterpercent s\letterpercent 2z\letterpercent s",1,2,3,4}\vl \NC \NR
910\HL
911\stoptabulate
912
913\stop
914
The generic formatters \type {a} and \type {A} convert the argument into a string
and deal with strings, numbers, booleans, tables and whatever. We mostly use
these in tracing. The lowercase variant uses single quotes, and the uppercase
variant uses double quotes.
919
A special one is the alignment formatter, which is a variant on the \type {s} one
that also takes an optional positive or negative number:
922
923\startbuffer
924\startluacode
925context.start()
926context.tttf()
927context.verbatim("[[% 30<]]","xxaxx") context.par()
928context.verbatim("[[% 30<]]","xx½xx") context.par()
929context.verbatim("[[% 30>]]","xxaxx") context.par()
930context.verbatim("[[% 30>]]","xx½xx") context.par()
931context.verbatim("[[%-30<]]","xxaxx") context.par()
932context.verbatim("[[%-30<]]","xx½xx") context.par()
933context.verbatim("[[%-30>]]","xxaxx") context.par()
934context.verbatim("[[%-30>]]","xx½xx") context.par()
935context.stop()
936\stopluacode
937\stopbuffer
938
939\typebuffer \getbuffer
940
941There are two more formatters plugged in: \type {!xml!} and \type {!tex!}. These
942are best demonstrated with an example:
943
944\starttyping
945local xf = formatter["xml escaped: %!xml!"]
local xt = formatter["tex escaped: %!tex!"]
947
948print(xf("x > 1 && x < 10"))
949print(xt("this will cost me $123.00 at least"))
950\stoptyping
951
Weird, this fails when cld-verbatim is there as part of the big thing:
catcodetable 4 suddenly lacks the comment character being an other.
954
955The \type {context} command uses the formatter so one can say:
956
957\startbuffer
958\startluacode
959context("first some xml: %!xml!, and now some %!tex!",
960    "x > 1 && x < 10", "this will cost me $123.00 at least")
961\stopluacode
962\stopbuffer
963
964\typebuffer
965
966This renders as follows:
967
968\blank \getbuffer \blank
969
You can extend the formatter but we advise you not to do that unless you're sure
of what you're doing. You never know what \CONTEXT\ itself might add for its own
benefit.
973
974However, you can define your own formatter and add to that without interference.
975In fact, the main formatter is just defined that way. This is how it works:
976
977\startbuffer[definition]
978local MyFormatter = utilities.strings.formatters.new()
979
980utilities.strings.formatters.add (
981    MyFormatter,
982    "upper",
983    "global.string.upper(%s)"
984)
985\stopbuffer
986
987\typebuffer[definition]
988
989Now you can use this one as:
990
991\startbuffer[usage]
992context.bold(MyFormatter["It's %s or %!upper!."]("this","that"))
993\stopbuffer
994
995\typebuffer[usage]
996
997\blank \ctxluabuffer[definition,usage] \blank
998
999Because we're running inside \CONTEXT, a better definition would be this:
1000
1001\startbuffer
1002local MyFormatter = utilities.strings.formatters.new()
1003
1004utilities.strings.formatters.add (
1005    MyFormatter,
1006    "uc",
1007    "myupper(%s)",
1008 -- "local myupper = global.characters.upper"
1009    { myupper = global.characters.upper }
1010)
1011
1012utilities.strings.formatters.add (
1013    MyFormatter,
1014    "lc",
1015    "mylower(%s)",
1016 -- "local mylower = global.characters.lower"
1017    { mylower = global.characters.lower }
1018)
1019
1020utilities.strings.formatters.add (
1021    MyFormatter,
1022    "sh",
1023    "myshaped(%s)",
1024 -- "local myshaped = global.characters.shaped"
1025    { myshaped = global.characters.shaped }
1026)
1027
1028context(MyFormatter["Uppercased: %!uc!"]("ÀÁÂÃÄÅàáâãäå"))
1029context.par()
1030context(MyFormatter["Lowercased: %!lc!"]("ÀÁÂÃÄÅàáâãäå"))
1031context.par()
1032context(MyFormatter["Reduced: %!sh!"]("ÀÁÂÃÄÅàáâãäå"))
1033\stopbuffer
1034
1035\typebuffer
1036
The last argument creates shortcuts. As expected we get:
1038
1039\blank \ctxluabuffer \blank
1040
1041Of course you can also apply the casing functions directly so in practice you
1042shouldn't use formatters without need. Among the advantages of using formatters
1043are:
1044
1045\startitemize[packed]
1046\startitem They provide a level of abstraction. \stopitem
\startitem They can replace multiple calls to \type {context}. \stopitem
1048\startitem Sometimes they make source code look better. \stopitem
1049\startitem Using them is often more efficient and faster. \stopitem
1050\stopitemize
1051
The last advantage might sound strange but considering the overhead involved in
the \type {context} (related) functions, doing more in one step has benefits.
Also, formatters are implemented quite efficiently, so their overhead can be
neglected.
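
As a rough illustration of the second and the last advantage, the following
sketch compares piecewise output with one formatted call. The variable names are
made up for this example and both variants are only assumed to typeset the same
text.

\starttyping
local name, pages = "manual", 120

-- piecewise: four calls into the context interface
context(name)
context(" has ")
context(pages)
context(" pages")

-- formatted: one call, the template is handled by the formatter
context("%s has %i pages",name,pages)
\stoptyping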
1056
1057In the examples you see that a formatter extension is itself a template.
1058
1059\startbuffer
1060local FakeXML = utilities.strings.formatters.new()
1061
1062utilities.strings.formatters.add(FakeXML,"b",[["<" ..%s..">" ]])
1063utilities.strings.formatters.add(FakeXML,"e",[["</"..%s..">" ]])
1064utilities.strings.formatters.add(FakeXML,"n",[["<" ..%s.."/>"]])
1065
1066context(FakeXML["It looks like %!b!xml%!e! doesn't it?"]("it","it"))
1067\stopbuffer
1068
1069\typebuffer
1070
1071This gives: \ctxluabuffer. Of course we could go over the top here:
1072
1073\startbuffer
1074local FakeXML = utilities.strings.formatters.new()
1075
1076local stack = { }
1077
1078function document.f_b(s)
1079    table.insert(stack,s)
1080    return "<" .. s .. ">"
1081end
1082
1083function document.f_e()
1084    return "</" .. table.remove(stack) .. ">"
1085end
1086
1087utilities.strings.formatters.add(FakeXML,"b",[[global.document.f_b(%s)]])
1088utilities.strings.formatters.add(FakeXML,"e",[[global.document.f_e()]])
1089
1090context(FakeXML["It looks like %1!b!xml%0!e! doesn't it?"]("it"))
1091\stopbuffer
1092
1093\typebuffer
1094
This gives: \ctxluabuffer. Such a template looks horrible, although it's not too
far from the regular format syntax: just compare \type {%1f} with \type {%1!e!}.
The zero trick permits us to inject information that we've put on the stack. As
this kind of duplicate usage might occur most, a better solution is available:
1099
1100\startbuffer
1101local FakeXML = utilities.strings.formatters.new()
1102
1103utilities.strings.formatters.add(FakeXML,"b",[["<"  .. %s .. ">"]])
1104utilities.strings.formatters.add(FakeXML,"e",[["</" .. %s .. ">"]])
1105
1106context(FakeXML["It looks like %!b!xml%-1!e! doesn't it?"]("it"))
1107\stopbuffer
1108
1109\typebuffer
1110
1111We get: \ctxluabuffer. Anyhow, in most cases you will never feel the need for
1112such hackery and the regular formatter works fine. Adding this extension
1113mechanism was rather trivial and it doesn't influence the performance.
1114
1115In \CONTEXT\ we have a few more extensions:
1116
1117\starttyping
1118utilities.strings.formatters.add (
1119    strings.formatters, "unichr",
1120    [["U+" .. format("%%05X",%s) .. " (" .. utfchar(%s) .. ")"]]
1121)
1122
1123utilities.strings.formatters.add (
1124    strings.formatters, "chruni",
1125    [[utfchar(%s) .. " (U+" .. format("%%05X",%s) .. ")"]]
1126)
1127\stoptyping
1128
These are used in messages:
1130
1131\startbuffer
1132context("Missing character %!chruni! in font.",234) context.par()
1133context("Missing character %!unichr! in font.",234)
1134\stopbuffer
1135
1136\typebuffer
1137
1138This shows up as:
1139
1140\blank \getbuffer \blank
1141
If you look closely at the definition, you will notice that we use \type {%s}
twice. This is a feature of the definer function: if only one argument is
picked up (which is the default) then the replacement format can use it
twice. Because we use a format in the constructor, we need to escape the
percent sign there.
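
A minimal sketch of that feature, using a private formatter so that the shared
one is left alone. The extension name \type {twice} is made up for this example.

\starttyping
local MyFormatter = utilities.strings.formatters.new()

-- the single argument is referenced twice in the replacement
utilities.strings.formatters.add (
    MyFormatter,
    "twice",
    [[%s .. " and again " .. %s]]
)

context(MyFormatter["We say %!twice!."]("this"))
\stoptyping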
1147
1148\stopsummary
1149
1150\startsummary[title={strip}]
1151
1152This function removes any leading and trailing whitespace characters.
1153
1154\starttyping
1155local s = string.strip(str)
1156\stoptyping
1157
1158\ShowLuaExampleThree {string} {strip} {" lua + tex = luatex "}
1159
1160\stopsummary
1161
1162\startsummary[title={split splitlines checkedsplit}]
1163
The line splitter is a special case of the generic splitter. The \type {split}
function accepts a string as well as an \type {lpeg} pattern. The \type
{checkedsplit} function removes empty substrings.
1167
1168\starttyping
1169local t = string.split        (str, pattern)
1170local t = string.split        (str, lpeg)
1171local t = string.checkedsplit (str, lpeg)
1172local t = string.splitlines   (str)
1173\stoptyping
1174
1175\start \let\ntex\relax % hack
1176
1177\ShowLuaExampleTwo {string} {split}        {"a, b,c, d", ","}
1178\ShowLuaExampleTwo {string} {split}        {"p.q,r", lpeg.S(",.")}
1179\ShowLuaExampleTwo {string} {checkedsplit} {";one;;two", ";"}
1180\ShowLuaExampleTwo {string} {splitlines}   {"lua\ntex nic"}
1181
1182\stop
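
In practice the result is mostly fed into a loop. A small sketch, assuming a
comma separated list with sloppy spacing, where \type {string.strip} cleans up
each part:

\starttyping
local list = "alpha, beta,gamma , delta"

for index, part in ipairs(string.split(list,",")) do
    print(index,string.strip(part))
end
\stoptyping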
1183
1184\stopsummary
1185
1186\startsummary[title={quoted unquoted}]
1187
1188You will hardly need these functions. The \type {quoted} function can normally be
1189avoided using the \type {format} pattern \type {%q}. The \type {unquoted}
1190function removes single or double quotes but only when the string starts and ends
1191with the same quote.
1192
1193\starttyping
1194local q = string.quoted  (str)
1195local u = string.unquoted(str)
1196\stoptyping
1197
1198\ShowLuaExampleThree {string} {quoted}   {[[test]]}
1199\ShowLuaExampleThree {string} {quoted}   {[[test"test]]}
1200\ShowLuaExampleThree {string} {unquoted} {[["test]]}
1201\ShowLuaExampleThree {string} {unquoted} {[["t\"est"]]}
1202\ShowLuaExampleThree {string} {unquoted} {[["t\"est"x]]}
1203\ShowLuaExampleThree {string} {unquoted} {"\'test\'"}
1204
1205\stopsummary
1206
1207\startsummary[title={count}]
1208
1209The function \type {count} returns the number of times that a given pattern
1210occurs. Beware: if you want to deal with \UTF\ strings, you need the variant that
1211sits in the \type {lpeg} namespace.
1212
1213\starttyping
1214local n = count(str,pattern)
1215\stoptyping
1216
1217\ShowLuaExampleThree {string} {count} {"test me", "e"}
1218
1219\stopsummary
1220
1221\startsummary[title={limit}]
1222
1223This function can be handy when you need to print messages that can be rather
1224long. By default, three periods are appended when the string is chopped.
1225
1226\starttyping
print(limit(str,max,sentinel))
1228\stoptyping
1229
1230\ShowLuaExampleThree {string} {limit} {"too long", 6}
1231\ShowLuaExampleThree {string} {limit} {"too long", 6, " (etc)"}
1232
1233\stopsummary
1234
1235\startsummary[title={is_empty}]
1236
A string is considered empty by this function when its length is zero or when it
only contains spaces.
1239
1240\starttyping
1241if is_empty(str) then ... end
1242\stoptyping
1243
1244\ShowLuaExampleThree {string} {is_empty} {""}
1245\ShowLuaExampleThree {string} {is_empty} {" "}
1246\ShowLuaExampleThree {string} {is_empty} {" ? "}
1247
1248\stopsummary
1249
1250\startsummary[title={escapedpattern topattern}]
1251
1252These two functions are rather specialized. They come in handy when you need to
1253escape a pattern, i.e.\ prefix characters with a special meaning by a \type {%}.
1254
1255\starttyping
1256local e = escapedpattern(str, simple)
1257local p = topattern     (str, lowercase, strict)
1258\stoptyping
1259
The simple variant does less escaping (only \type {-.?*}) and is for instance used
in wildcard patterns when globbing directories. The \type {topattern} function
always does the simple escape. A strict pattern gets anchored to the beginning
and end. If you want to see what these functions do, you can best look at their
implementation.
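
A small sketch of the typical usage: a literal string that contains magic
characters is escaped before it is used in a search. The sample strings are made
up and no particular escaped form is assumed here.

\starttyping
local needle  = "1+1=2"
local pattern = string.escapedpattern(needle)

-- without escaping, the plus sign acts as a repetition operator
print(string.find("we claim that 1+1=2 holds",needle))
print(string.find("we claim that 1+1=2 holds",pattern))
\stoptyping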
1265
1266\stopsummary
1267
1268% strings.tabtospace(str,n)
1269% strings.striplong
1270
1271\stopsection
1272
1273\startsection[title=\UTF]
1274
We used to have the \type {slunicode} library available but as most of it is not
used and because it has a somewhat fuzzy state, we will no longer rely on it. In
fact we only used a few functions in the \type {utf} namespace so as a \CONTEXT\
user you'd better stick to what is presented here. You don't have to worry how
they are implemented. Depending on the version of \LUATEX\ it can be that a
library, a native function, or \LPEG\ is used.
1281
1282\startsummary[title={char byte}]
1283
As \UTF\ is a multibyte encoding the term char in fact refers to a \LUA\
string of one up to four 8|-|bit characters.
1286
1287\starttyping
1288local b = utf.byte("å")
1289local c = utf.char(0xE5)
1290\stoptyping
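
Both functions are each other's inverse, which the following sketch checks. It
assumes that the source file is \UTF-8 encoded and uses a precomposed character.

\starttyping
local code = utf.byte("å")   -- a number, here 0xE5
local char = utf.char(code)  -- a string again

print(code == 0xE5, char == "å")
\stoptyping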
1291
The number of places in \CONTEXT\ where we do such conversions is not that large:
it happens mostly in tracing messages.
1294
1295\starttyping
1296logs.report("panic","the character U+%05X is used",utf.byte("æ"))
1297\stoptyping
1298
1299\ShowLuaExampleThree {utf} {byte} {"æ"}
1300\ShowLuaExampleThree {utf} {char} {0xE6}
1301
1302\stopsummary
1303
1304\startsummary[title={sub}]
1305
If you need to take a slice of a \UTF\ encoded string the \type {sub} function
1307can come in handy. This function takes a string and a range defined by two
1308numbers. Negative numbers count from the end of the string.
1309
1310\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",1,7}
1311\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",0,7}
1312\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",0,9}
1313\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",4}
1314\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",0}
1315\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",0,0}
1316\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",4,4}
1317\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",4,0}
1318\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",-3,0}
1319\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",0,-3}
1320\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",-5,-3}
1321\ShowLuaExampleThree {utf} {sub} {"123456àáâãäå",-3}
1322
1323\stopsummary
1324
1325\startsummary[title={len}]
1326
1327There are probably not that many people that can instantly see how many bytes the
1328string in the following example takes:
1329
1330\starttyping
1331local l = utf.len("ÀÁÂÃÄÅàáâãäå")
1332\stoptyping
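
A quick way to find out is to compare the byte length that \LUA\ reports with
the character count. This sketch assumes precomposed characters, in which case
each of them takes two bytes.

\starttyping
local str = "ÀÁÂÃÄÅàáâãäå"

print(#str)         -- length in bytes
print(utf.len(str)) -- length in characters
\stoptyping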
1333
Programming languages use \ASCII\ mostly so there each character takes one byte.
In \CJK\ scripts however, you end up with much longer sequences. If you ever did
some typesetting of such scripts you have noticed that the number of characters
on a page is less than in the case of a Latin script. As information is coded
in fewer characters, effectively the source of a Latin or \CJK\ document will not
differ that much.
1340
1341\ShowLuaExampleThree {utf} {len} {"ÒÓÔÕÖòóôõö"}
1342
1343\stopsummary
1344
1345\startsummary[title={values characters}]
1346
There are two iterators that deal with \UTF. In \LUATEX\ these are extensions to
the \type {string} library but for consistency we've moved them to the \type {utf}
namespace.
1350
1351The following function loops over the \UTF\ characters in a string and returns
1352the \UNICODE\ number in \type {u}:
1353
1354\starttyping
1355for u in utf.values(str) do
1356    ... -- u is a number
1357end
1358\stoptyping
1359
The next one returns a string \type {c} that has one or more bytes, as \UTF\
characters can take up to 4 bytes.
1362
1363\starttyping
1364for c in utf.characters(str) do
1365    ... -- c is a string
1366end
1367\stoptyping
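
Both iterators can be tried on the same input. A sketch with a made|-|up sample
string:

\starttyping
local str = "één"

for c in utf.characters(str) do
    print(c,#c)                      -- the character and its size in bytes
end

for u in utf.values(str) do
    print(string.format("U+%05X",u)) -- the code point in hexadecimal
end
\stoptyping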
1368
1369\stopsummary
1370
1371\startsummary[title={ustring xstring tocodes}]
1372
1373These functions are mostly useful for logging where we want to see the \UNICODE\
1374number.
1375
1376\ShowLuaExampleThree {utf} {ustring} {0xE6}
1377\ShowLuaExampleThree {utf} {ustring} {"ù"}
1378\ShowLuaExampleThree {utf} {xstring} {0xE6}
1379\ShowLuaExampleThree {utf} {xstring} {"à"}
1380\ShowLuaExampleThree {utf} {tocodes} {"ùúü"}
1381\ShowLuaExampleThree {utf} {tocodes} {"àáä",""}
1382\ShowLuaExampleThree {utf} {tocodes} {"òóö","+"}
1383
1384\stopsummary
1385
1386\startsummary[title={split splitlines totable}]
1387
The \type {split} function splits a sequence of \UTF\ characters into a table
with one character per slot. The \type {splitlines} does the same but each slot
has a line instead. The \type {totable} function is similar to \type {split}, but
the latter strips an optionally present \UTF\ bom.
1392
1393\ShowLuaExampleThree {utf} {split} {"òóö"}
1394
1395\stopsummary
1396
1397\startsummary[title={count}]
1398
This function counts the number of times that a given substring occurs in a
string. The pattern can be a string or an \LPEG\ pattern.
1401
1402\ShowLuaExampleThree {utf} {count} {"òóöòóöòóö","ö"}
1403\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.P("á") + lpeg.P("à")}
1404
1405\stopsummary
1406
1407\startsummary[title={remapper replacer substituter}]
1408
1409With \type {remapper} you can create a remapping function that remaps a given
1410string using a (hash) table.
1411
1412\starttyping
1413local remap = utf.remapper { a = 'd', b = "c", c = "b", d = "a" }
1414
1415print(remap("abcd 1234 abcd"))
1416\stoptyping
1417
A remapper checks each character against the given mapping table. Its cousin
\type {replacer} is more efficient and skips non matches. The \type {substituter}
function only does a quick check first and avoids building a string with no
replacements. That one is much faster when you expect only a few replacements.

The \type {replacer} and \type {substituter} functions take a table as argument
and an indexed as well as a hashed one is acceptable. In fact you can even do
things like this:
1426
1427\starttyping
1428local rep = utf.replacer { [lpeg.patterns.digit] = "!" }
1429\stoptyping
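
Using the result is assumed to work the same as with the remapper, so one just
calls the returned function with a string:

\starttyping
local rep = utf.replacer { [lpeg.patterns.digit] = "!" }

print(rep("room 101"))
\stoptyping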
1430
1431\stopsummary
1432
1433\startsummary[title={is_valid}]
1434
This function returns false if the argument is not a valid \UTF\ string. As \LUATEX\
1436is pretty strict with respect to the input, this function is only useful when
1437dealing with external files.
1438
1439\starttyping
1440function checkfile(filename)
1441  local data = io.loaddata(filename)
1442  if data and data ~= "" and not utf.is_valid(data) then
1443    logs.report("error","file %q contains invalid utf",filename)
1444  end
1445end
1446\stoptyping
1447
1448\stopsummary
1449
1450% not that relevant:
1451%
1452% -- utf.filetype
1453% -- string.toutf
1454
1455\stopsection
1456
1457\startsection[title=Numbers and bits]
1458
In the \type {number} namespace we collect some helpers that deal with numbers as
well as bits. Starting with \LUA\ 5.2 a library \type {bit32} is available but the
language itself doesn't provide bit operations via operators: the library uses
functions to manipulate numbers up to 2\high{32}. In the latest \LUATEX\ you can
use the new bit related operators.
1464
1465% For advanced bit manipulations you should use the \type {bit32} library, otherwise
1466% it's best to stick to the functions described here.
1467%
1468% \startsummary[title={hasbit setbit clearbit}]
1469%
1470% As bitsets are numbers you will also use numbers to qualify them. So, if you want to
1471% set bits 1, 4 and 8, you can to that using the following specification:
1472%
1473% \starttyping
1474% local b = 1 + 4 + 8 -- 0x1 + 0x4 + 0x8
1475% local b = 13        -- or 0xC
1476% \stoptyping
1477%
1478% However, changing one bit by adding a number to an existing doesn't work out that well
1479% if that number already has that bit set. Instead we use:
1480%
1481% \starttyping
1482% local b = number.setbit(b,0x4)
1483% \stoptyping
1484%
1485% In a similar fashion you can turn of a bit:
1486%
1487% \starttyping
1488% local b = number.clearbit(b,0x4)
1489% \stoptyping
1490%
1491% Testing for a bit(set) is done as follows:
1492%
1493% \starttyping
1494% local okay = number.hasbit(b,0x4)
1495% \stoptyping
1496%
1497% \stopsummary
1498%
1499% \startsummary[title={bit}]
1500%
1501% Where the previously mentioned helpers work with numbers representing one or more
1502% bits, it is sometimes handy to work with positions. The \type {bit} function
1503% returns the associated number value.
1504%
1505% \ShowLuaExampleThree {number} {bit} {5}
1506%
1507% \stopsummary
1508
1509\startsummary[title={tobitstring}]
1510
There is no format option to go from number to bits in terms of zeros and ones so
we provide a helper: \type {tobitstring}.
1513
1514\ShowLuaExampleThree {number} {tobitstring} {2013}
1515\ShowLuaExampleThree {number} {tobitstring} {2013,3}
1516\ShowLuaExampleThree {number} {tobitstring} {2013,1}
1517
1518\stopsummary
1519
1520% \startsummary[title={bits}]
1521%
1522% If you ever want to convert a bitset into a table containing the set bits you can
1523% use this function.
1524%
1525% \ShowLuaExampleTwo {number} {bits} {11}
1526%
1527% \stopsummary
1528%
1529% \startsummary[title={toset}]
1530%
1531% A string or number can be split into digits with \type {toset}. Beware, this
1532% function does not return a function but multiple numbers
1533%
1534% \starttyping
1535% local a, b, c, d = number.toset("1001")
1536% \stoptyping
1537%
1538% The returned values are either numbers or \type {nil} when an valid digit is
1539% seen.
1540%
1541% \ShowLuaExampleSeven {number} {toset} {100101}
1542% \ShowLuaExampleSeven {number} {toset} {"100101"}
1543% \ShowLuaExampleSeven {number} {toset} {"21546"}
1544%
1545% \stopsummary
1546
1547\startsummary[title={valid}]
1548
1549This function can be used to check or convert a number, for instance in user
1550interfaces.
1551
1552\ShowLuaExampleThree {number} {valid} {12}
1553\ShowLuaExampleThree {number} {valid} {"34"}
1554\ShowLuaExampleThree {number} {valid} {"ab",56}
1555
1556\stopsummary
1557
1558\stopsection
1559
1560\startsection[title=\LPEG\ patterns]
1561
For \LUATEX\ and \CONTEXT\ \MKIV\ the \type {lpeg} library came at the right
moment as we can use it in lots of places. An in|-|depth discussion makes no
sense as it's easier to look into \type {l-lpeg.lua}, so we stick to an overview.
\footnote {If you search the web for \type {lua lpeg} you will end up at the
official documentation and tutorial.} Most functions return an \type {lpeg}
object that can be used in a match. In time critical situations it's more
efficient to use the match on a predefined pattern than to create the pattern
anew each time. Patterns are cached so there is no penalty in predefining a
pattern. So, in the following example, the \type {splitter} that splits at the
asterisk will only be created once.
1572
1573\starttyping
1574local splitter_1 = lpeg.splitat("*")
1575local splitter_2 = lpeg.splitat("*")
1576
1577local n, m = lpeg.match(splitter_1,"2*4")
1578local n, m = lpeg.match(splitter_2,"2*4")
1579\stoptyping
1580
1581\startsummary[title={[lua] match print P R S V C Cc Cs ...}]
1582
1583The \type {match} function does the real work. Its first argument is a \type
1584{lpeg} object that is created using the functions with the short uppercase names.
1585
1586\starttyping
1587local P, R, C, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct
1588
1589local pattern = Ct((P("[") * C(R("az")^0) * P(']') + P(1))^0)
1590
1591local words = lpeg.match(pattern,"a [first] and [second] word")
1592\stoptyping
1593
1594In this example the words between square brackets are collected in a table. There
1595are lots of examples of \type {lpeg} in the \CONTEXT\ code base.
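
To see what ended up in the table, one can loop over the result. This sketch
repeats the pattern so that it can be run on its own.

\starttyping
local P, R, C, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct

local pattern = Ct((P("[") * C(R("az")^0) * P("]") + P(1))^0)
local words   = lpeg.match(pattern,"a [first] and [second] word")

for index, word in ipairs(words) do
    print(index,word) -- 1 first, 2 second
end
\stoptyping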
1596
1597\stopsummary
1598
1599\startsummary[title={anywhere}]
1600
1601\starttyping
1602local p = anywhere(pattern)
1603\stoptyping
1604
1605\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct((lpeg.anywhere("->")/"!")^0), "oeps->what->more"}
1606
1607\stopsummary
1608
1609\startsummary[title={splitter splitat firstofsplit secondofsplit}]
1610
1611The \type {splitter} function returns a pattern where each match gets an action
1612applied. The action can be a function, table or string.
1613
1614\starttyping
1615local p = splitter(pattern, action)
1616\stoptyping
1617
The \type {splitat} function returns a pattern that will return the split off
parts. Unless the second argument is \type {true} the splitter keeps splitting.
1620
1621\starttyping
1622local p = splitat(separator,single)
1623\stoptyping
1624
1625When you need to split off a prefix (for instance in a label) you can use:
1626
1627\starttyping
1628local p = firstofsplit(separator)
1629local p = secondofsplit(separator)
1630\stoptyping
1631
1632The first function returns the original when there is no match but the second
1633function returns \type {nil} instead.
1634
1635\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct(lpeg.splitat("->",false)), "oeps->what->more"}
1636\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct(lpeg.splitat("->",false)), "oeps"}
1637\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct(lpeg.splitat("->",true)), "oeps->what->more"}
1638\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct(lpeg.splitat("->",true)), "oeps"}
1639
1640\ShowLuaExampleThree {lpeg} {match} {lpeg.firstofsplit(":"), "before:after"}
1641\ShowLuaExampleThree {lpeg} {match} {lpeg.firstofsplit(":"), "whatever"}
1642\ShowLuaExampleThree {lpeg} {match} {lpeg.secondofsplit(":"), "before:after"}
1643\ShowLuaExampleThree {lpeg} {match} {lpeg.secondofsplit(":"), "whatever"}
1644
1645\stopsummary
1646
1647\startsummary[title={split checkedsplit}]
1648
1649The next two functions have counterparts in the \type {string} namespace. They
1650return a table with the split parts. The second function omits empty parts.
1651
1652\starttyping
1653local t = split       (separator,str)
1654local t = checkedsplit(separator,str)
1655\stoptyping
1656
1657\ShowLuaExampleTwo {lpeg} {split}        {",","a,b,c"}
1658\ShowLuaExampleTwo {lpeg} {split}        {",",",a,,b,c,"}
1659\ShowLuaExampleTwo {lpeg} {checkedsplit} {",",",a,,b,c,"}
1660
1661\stopsummary
1662
1663\startsummary[title={stripper keeper replacer}]
1664
1665These three functions return patterns that manipulate a string. The \type
1666{replacer} gets a mapping table passed.
1667
1668\starttyping
1669local p = stripper(str or pattern)
1670local p = keeper  (str or pattern)
1671local p = replacer(mapping)
1672\stoptyping
1673
1674\ShowLuaExampleThree {lpeg} {match} {lpeg.stripper(lpeg.R("az")), "[-a-b-c-d-]"}
1675\ShowLuaExampleThree {lpeg} {match} {lpeg.stripper("ab"), "[-a-b-c-d-]"}
1676\ShowLuaExampleThree {lpeg} {match} {lpeg.keeper(lpeg.R("az")), "[-a-b-c-d-]"}
1677\ShowLuaExampleThree {lpeg} {match} {lpeg.keeper("ab"), "[-a-b-c-d-]"}
1678\ShowLuaExampleThree {lpeg} {match} {lpeg.replacer{{"a","p"},{"b","q"}}, "[-a-b-c-d-]"}
1679
1680\stopsummary
1681
1682\startsummary[title={balancer}]
1683
One of the nice things about \type {lpeg} is that it can handle all kinds of
balanced input. So, a function is provided that returns a balancer pattern:
1686
1687\starttyping
1688local p = balancer(left,right)
1689\stoptyping
1690
1691\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct((lpeg.C(lpeg.balancer("{","}"))+1)^0),"{a} {b{c}}"}
1692\ShowLuaExampleTwo {lpeg} {match} {lpeg.Ct((lpeg.C(lpeg.balancer("((","]"))+1)^0),"((a] ((b((c]]"}
1693
1694\stopsummary
1695
1696\startsummary[title={counter}]
1697
The \type {counter} function returns a function that counts how often the given
pattern matches in a string. The \type {count} function differs from its
counterpart living in the \type {string} namespace in that it deals with \UTF\
and accepts strings as well as patterns.
1702
1703\starttyping
1704local fnc = counter(lpeg.P("á") + lpeg.P("à"))
1705local len = fnc("äáàa")
1706\stoptyping
1707
1708\stopsummary
1709
1710\startsummary[title={UP US UR}]
1711
1712In order to make working with \UTF-8 input somewhat more convenient a few helpers
1713are provided.
1714
1715\starttyping
1716local p = lpeg.UP(utfstring)
1717local p = lpeg.US(utfstring)
1718local p = lpeg.UR(utfpair)
1719local p = lpeg.UR(first,last)
1720\stoptyping
1721
1722\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.UP("áà")}
1723\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.US("àá")}
1724\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.UR("")}
1725\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.UR("àá")}
1726\ShowLuaExampleThree {utf} {count} {"äáàa",lpeg.UR(0x0000,0xFFFF)}

\stopsummary
1727
1728\startsummary[title={patterns}]
1729
1730The following patterns are available in the \type {patterns} table in the \type
1731{lpeg} namespace:
1732
1733\startluacode
1734context.startalignment { "flushleft" }
1735local done = false
1736for k, v in table.sortedpairs(lpeg.patterns) do
1737    if done then
1738        context.space()
1739    else
1740        done = true
1741    end
1742    context.type(k)
1743end
1744context.stopalignment()
1745\stopluacode
1746
1747There will probably be more of them in the future.
1748
1749\stopsummary
1750
1751\stopsection
1752
1753\startsection[title=IO]
1754
The \type {io} library is extended with a couple of functions and variables as
well, but first we mention a few predefined functions.
1757
1758\startsummary[title={[lua] open popen...}]
1759
The \type {io} library deals with input from and output to the console and
files.
1762
1763\starttyping
1764local f  = io.open(filename)
1765\stoptyping
1766
1767When the call succeeds \type {f} is a file object. You close this file
1768with:
1769
1770\starttyping
1771f:close()
1772\stoptyping
1773
Reading from a file is done with \type {f:read(...)} and writing to a file with
\type {f:write(...)}. In order to write to a file, a second argument has to be
given when opening it, often \type {wb} for writing (binary) data. Although there
are more efficient ways, you can use the \type {f:lines()} iterator to process a
file line by line.
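
The following sketch puts these calls together using only stock \LUA. The
filename \type {demo-io.txt} is a made up scratch file that is removed at the
end.

\starttyping
-- "demo-io.txt" is a hypothetical scratch file in the current directory
local filename = "demo-io.txt"

-- open for writing in binary mode and write three lines
local f = assert(io.open(filename, "wb"))
f:write("first\n", "second\n", "third\n")
f:close()

-- read the whole file in one go
f = assert(io.open(filename, "rb"))
local data = f:read("*a")
f:close()

-- process the file line by line
local lines = { }
f = assert(io.open(filename, "r"))
for line in f:lines() do
    lines[#lines+1] = line
end
f:close()

print(#data, #lines)

os.remove(filename)
\stoptyping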
1779
1780You can open a process with \type {io.popen} but dealing with this one depends a
1781bit on the operating system.
1782
1783\stopsummary
1784
1785\startsummary[title={fileseparator pathseparator}]
1786
1787The value of the following two strings depends on the operating system that is
1788used.
1789
1790\starttyping
1791io.fileseparator
1792io.pathseparator
1793\stoptyping
1794
1795\ShowLuaExampleFive {io} {fileseparator}
1796\ShowLuaExampleFive {io} {pathseparator}
1797
1798\stopsummary
1799
1800\startsummary[title={loaddata savedata}]
1801
1802These two functions save you some programming. The first function loads a whole
1803file in a string. By default the file is loaded in binary mode, but when the
1804second argument is \type {true}, some interpretation takes place (for instance
1805line endings). In practice the second argument can best be left alone.
1806
1807\starttyping
1808io.loaddata(filename,textmode)
1809\stoptyping
1810
1811Saving the data is done with:
1812
1813\starttyping
1814io.savedata(filename,str)
1815io.savedata(filename,tab,joiner)
1816\stoptyping
1817
1818When a table is given, you can optionally specify a string that
1819ends up between the elements that make the table.
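
A round trip could look as follows. This is only a sketch that assumes a
\CONTEXT\ or \type {mtxrun} environment; the filename is a made up scratch
file.

\starttyping
-- "demo-data.txt" is a hypothetical scratch file
local filename = "demo-data.txt"

-- a table is written with the given joiner between its entries
io.savedata(filename, { "one", "two", "three" }, "\n")

-- the whole file comes back as one string (binary mode by default)
local data = io.loaddata(filename)

print(data)

os.remove(filename)
\stoptyping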
1820
1821\stopsummary
1822
1823\startsummary[title={exists size noflines}]
1824
These three functions don't need much comment.
1826
1827\starttyping
1828io.exists(filename)
1829io.size(filename)
1830io.noflines(fileobject)
1831io.noflines(filename)
1832\stoptyping
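
For instance, the next sketch (again assuming that the \CONTEXT\ libraries are
loaded, with a made up scratch file) checks a file before it asks for its size
and number of lines. Here the line count is taken from a file object.

\starttyping
local filename = "demo-lines.txt" -- hypothetical scratch file

io.savedata(filename, "a\nb\nc\n")

local size, n = 0, 0

if io.exists(filename) then
    size = io.size(filename)      -- size in bytes
    local f = io.open(filename)
    n = io.noflines(f)            -- number of lines, via the file object
    f:close()
end

print(size, n)

os.remove(filename)
\stoptyping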
1833
1834\stopsummary
1835
1836\startsummary[title={characters bytes readnumber readstring}]
1837
1838When I wrote the icc profile loader, I needed a few helpers for reading strings
1839of a certain length and numbers of a given width. Both accept five values of
1840\type {n}: \type {-4}, \type {-2}, \type {1}, \type {2} and \type {4} where the
1841negative values swap the characters or bytes.
1842
1843\starttyping
io.characters(f,n)
1845io.bytes(f,n)
1846\stoptyping
1847
The function \type {readnumber} accepts five sizes: \type {1}, \type {2}, \type
{4}, \type {8}, \type {12}. The string function handles any size and strips zero
bytes from the string.
1851
1852\starttyping
1853io.readnumber(f,size)
1854io.readstring(f,size)
1855\stoptyping
1856
1857Optionally you can give the position where the reading has to start:
1858
1859\starttyping
1860io.readnumber(f,position,size)
1861io.readstring(f,position,size)
1862\stoptyping
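
The two calling conventions can be sketched as follows. The file and its four
bytes of content are made up for the occasion and the example assumes that the
\CONTEXT\ libraries are loaded; it only shows how the arguments are passed.

\starttyping
local filename = "demo-bytes.bin" -- hypothetical scratch file

-- four bytes of test data
io.savedata(filename, string.char(0x00, 0x01, 0x02, 0x03))

local f = io.open(filename, "rb")

-- read a two byte number at the current position (the start of the file)
local first = io.readnumber(f, 2)

-- read a two byte number, starting at byte position 2
local second = io.readnumber(f, 2, 2)

f:close()

print(first, second)

os.remove(filename)
\stoptyping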
1863
1864\stopsummary
1865
1866\startsummary[title={ask}]
1867
1868In practice you will probably make your own variant of the following function,
1869but at least a template is there:
1870
1871\starttyping
1872io.ask(question,default,options)
1873\stoptyping
1874
1875For example:
1876
1877\starttyping
1878local answer = io.ask("choice", "two", { "one", "two" })
1879\stoptyping
1880
1881\stopsummary
1882
1883\stopsection
1884
1885\startsection[title=File]
1886
The file library is one of the larger core libraries that come with
\CONTEXT.
1889
1890\startsummary[title={dirname basename extname nameonly}]
1891
1892We start with a few filename manipulators.
1893
1894\starttyping
1895local path   = file.dirname(name,default)
1896local base   = file.basename(name)
1897local suffix = file.extname(name,default) -- or file.suffix
1898local name   = file.nameonly(name)
1899\stoptyping
1900
1901\ShowLuaExampleThree {file} {dirname}  {"/data/temp/whatever.cld"}
1902\ShowLuaExampleThree {file} {dirname}  {"c:/data/temp/whatever.cld"}
1903\ShowLuaExampleThree {file} {basename} {"/data/temp/whatever.cld"}
1904\ShowLuaExampleThree {file} {extname}  {"c:/data/temp/whatever.cld"}
1905\ShowLuaExampleThree {file} {nameonly} {"/data/temp/whatever.cld"}
1906
1907\stopsummary
1908
1909\startsummary[title={addsuffix replacesuffix}]
1910
1911These functions are used quite often:
1912
1913\starttyping
1914local filename = file.addsuffix(filename, suffix, criterium)
1915local filename = file.replacesuffix(filename, suffix)
1916\stoptyping
1917
The first one adds a suffix unless one is present. When \type {criterium} is
\type {true} no checking is done and the suffix is always appended. The second
function replaces the current suffix or adds one when there is none.
1921
1922\ShowLuaExampleThree {file} {addsuffix} {"whatever","cld"}
1923\ShowLuaExampleThree {file} {addsuffix} {"whatever.tex","cld"}
1924\ShowLuaExampleThree {file} {addsuffix} {"whatever.tex","cld",true}
1925
1926\ShowLuaExampleThree {file} {replacesuffix} {"whatever","cld"}
1927\ShowLuaExampleThree {file} {replacesuffix} {"whatever.tex","cld"}
1928
1929\stopsummary
1930
1931\startsummary[title={is_writable is_readable}]
1932
1933These two test the nature of a file:
1934
1935\starttyping
1936file.is_writable(name)
1937file.is_readable(name)
1938\stoptyping
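
A typical use is to guard a read or write operation. The following sketch
assumes that the \CONTEXT\ libraries are loaded; the filename is a made up
scratch file.

\starttyping
local name = "demo-access.txt" -- hypothetical scratch file

io.savedata(name, "some content")

-- only load the file when we are allowed to read it
local data = file.is_readable(name) and io.loaddata(name) or ""

-- only overwrite the file when we are allowed to write it
if file.is_writable(name) then
    io.savedata(name, data .. " and more")
end

os.remove(name)
\stoptyping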
1939
1940\stopsummary
1941
1942\startsummary[title={splitname join collapsepath}]
1943
1944Instead of splitting off individual components you can get them all in one go:
1945
1946\starttyping
1947local drive, path, base, suffix = file.splitname(name)
1948\stoptyping
1949
1950The \type {drive} variable is empty on operating systems other than \MSWINDOWS.
1951Such components are joined with the function:
1952
1953\starttyping
1954file.join(...)
1955\stoptyping
1956
The given snippets are joined using the \type {/} as this is
rather platform independent. Some checking takes place in order
to make sure that no funny paths result from this. There is
also \type {collapsepath} that does some cleanup on a path
with relative components, like \type {..}.
1962
1963\ShowLuaExampleSix   {file} {splitname}    {"a:/b/c/d.e"}
1964\ShowLuaExampleThree {file} {join}         {"a","b","c.d"}
1965\ShowLuaExampleThree {file} {collapsepath} {"a/b/../c.d"}
1966\ShowLuaExampleThree {file} {collapsepath} {"a/b/../c.d",true}
1967
1968\stopsummary
1969
1970\startsummary[title={splitpath joinpath}]
1971
By default splitting an execution path specification is done using the operating
system dependent separator, but you can force one as well:
1974
1975\starttyping
1976file.splitpath(str,separator)
1977\stoptyping
1978
1979The reverse operation is done with:
1980
1981\starttyping
1982file.joinpath(tab,separator)
1983\stoptyping
1984
1985Beware: in the following examples the separator is system dependent so
1986the outcome depends on the platform you run on.
1987
1988\ShowLuaExampleTwo   {file} {splitpath} {"a:b:c"}
1989\ShowLuaExampleTwo   {file} {splitpath} {"a;b;c"}
1990\ShowLuaExampleThree {file} {joinpath}  {{"a","b","c"}}
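
When you pass the separator explicitly the outcome no longer depends on the
platform. A small sketch, assuming that the \CONTEXT\ libraries are loaded:

\starttyping
-- force the separator instead of relying on the system dependent default
local parts = file.splitpath("a:b:c", ":")
local path  = file.joinpath(parts, ":")

print(#parts, path)
\stoptyping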
1991
1992\stopsummary
1993
1994\startsummary[title={robustname}]
1995
1996In workflows filenames with special characters can be a pain so the following
1997function replaces characters other than letters, digits, periods, slashes and
1998hyphens by hyphens.
1999
2000\starttyping
2001file.robustname(str,strict)
2002\stoptyping
2003
2004\ShowLuaExampleThree {file} {robustname} {"We don't like this!"}
2005\ShowLuaExampleThree {file} {robustname} {"We don't like this!",true}
2006
2007\stopsummary
2008
2009\startsummary[title={readdata writedata}]
2010
These two functions are duplicates of the \type {loaddata} and \type {savedata}
functions in the \type {io} library.
2013
2014\stopsummary
2015
2016\startsummary[title={copy}]
2017
2018There is not much to comment on this one:
2019
2020\starttyping
2021file.copy(oldname,newname)
2022\stoptyping
2023
2024\stopsummary
2025
2026\startsummary[title={is_qualified_path is_rootbased_path}]
2027
2028A qualified path has at least one directory component while a rootbased path is
2029anchored to the root of a filesystem or drive.
2030
2031\starttyping
2032file.is_qualified_path(filename)
2033file.is_rootbased_path(filename)
2034\stoptyping
2035
2036\ShowLuaExampleThree {file} {is_qualified_path} {"a"}
2037\ShowLuaExampleThree {file} {is_qualified_path} {"a/b"}
2038\ShowLuaExampleThree {file} {is_rootbased_path} {"a/b"}
2039\ShowLuaExampleThree {file} {is_rootbased_path} {"/a/b"}
2040
2041\stopsummary
2042
2043\stopsection
2044
2045\startsection[title=Dir]
2046
2047The \type {dir} library uses functions of the \type {lfs} library that is linked
2048into \LUATEX.
2049
2050\startsummary[title={current}]
2051
2052This returns the current directory:
2053
2054\starttyping
2055dir.current()
2056\stoptyping
2057
2058\stopsummary
2059
2060\startsummary[title={glob globpattern globfiles}]
2061
2062% not yet documented: dir.collectpattern(path,patt,recurse,result) -- collects tree
2063
2064The \type {glob} function collects files with names that match a given pattern.
The pattern can have wildcards: \type {*} (one or more characters), \type {?}
2066(one character) or \type {**} (one or more directories). You can pass the
2067function a string or a table with strings. Optionally a second argument can be
2068passed, a table that the results are appended to.
2069
2070\starttyping
2071local files = dir.glob(pattern,target)
2072local files = dir.glob({pattern,...},target)
2073\stoptyping
2074
2075The target is optional and often you end up with simple calls like:
2076
2077\starttyping
2078local files = dir.glob("*.tex")
2079\stoptyping
2080
There is a more extensive version where you start at a path and apply an
action to each file that matches the pattern. Optionally you can force
recursion.
2084
2085\starttyping
2086dir.globpattern(path,patt,recurse,action)
2087\stoptyping
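
The next sketch collects all \TEX\ files below the current path. It assumes that
the pattern is an ordinary \LUA\ string pattern that is matched against the full
name and that the action gets that name passed; the \type {found} table is just
an illustration.

\starttyping
local found = { }

-- collect all files below the current path whose name ends with ".tex"
dir.globpattern(".", "%.tex$", true, function(name)
    found[#found+1] = name
end)

for i=1,#found do
    print(found[i])
end
\stoptyping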
2088
The \type {globfiles} function collects matches in a table that is returned at
the end. You can pass an existing table as last argument. The first argument is
the starting path, the second argument controls whether subdirectories are
analyzed and the third argument has to be a function that gets a name passed and
is supposed to return \type {true} or \type {false}. This function determines
what gets collected.
2095
2096\starttyping
2097dir.globfiles(path,recurse,func,files)
2098\stoptyping
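
As an illustration, the following sketch collects the \LUA\ files found below the
current path. The helper \type {islua} is a made up name and the example assumes
that the \CONTEXT\ libraries are loaded.

\starttyping
-- keep only the files with suffix "lua"
local function islua(name)
    return file.extname(name) == "lua"
end

-- start at the current path and also look into subdirectories
local files = dir.globfiles(".", true, islua)

print(#files)
\stoptyping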
2099
2100\stopsummary
2101
2102\startsummary[title={makedirs}]
2103
With \type {makedirs} you can create the given directory. If more than one
name is given they are concatenated.
2106
2107\starttyping
2108dir.makedirs(name,...)
2109\stoptyping
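
Both calls in the next sketch are meant to end up with the same nested
directory. The directory names are made up and the directories are left behind,
so you might want to remove them afterwards.

\starttyping
-- one name that already contains the whole path
dir.makedirs("demo-dirs/one/two")

-- several names that get combined into one path
dir.makedirs("demo-dirs", "one", "two")

-- check the result with the lfs library
print(lfs.attributes("demo-dirs/one/two", "mode") == "directory")
\stoptyping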
2110
2111\stopsummary
2112
2113\startsummary[title={expandname}]
2114
2115This function tries to resolve the given path, including relative paths.
2116
2117\starttyping
2118dir.expandname(str)
2119\stoptyping
2120
2121\ShowLuaExampleThree {dir} {expandname} {"."}
2122
2123\stopsummary
2124
2125\stopsection
2126
2127\startsection[title=URL]
2128
2129\startsummary[title={split hashed construct}]
2130
2131This is a specialized library. You can split an \type {url} into its components.
2132An \URL\ is constructed like this:
2133
2134\starttyping
2135foo://example.com:2010/alpha/beta?gamma=delta#epsilon
2136\stoptyping
2137
2138\starttabulate[|T|T|]
2139\NC scheme    \NC foo://           \NC \NR
2140\NC authority \NC example.com:2010 \NC \NR
2141\NC path      \NC /alpha/beta      \NC \NR
2142\NC query     \NC gamma=delta      \NC \NR
2143\NC fragment  \NC epsilon          \NC \NR
2144\stoptabulate
2145
2146A string is split into a hash table with these keys using the following function:
2147
2148\starttyping
2149url.hashed(str)
2150\stoptyping
2151
2152or in strings with:
2153
2154\starttyping
2155url.split(str)
2156\stoptyping
2157
The hash variant is more tolerant than the split. In the hash
there is also a key \type {original} that holds the original \URL\
and the boolean \type {noscheme} indicates if there is a
scheme at all.
2162
2163The reverse operation is done with:
2164
2165\starttyping
2166url.construct(hash)
2167\stoptyping
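
A sketch of a round trip, assuming that the \CONTEXT\ libraries are loaded, is
given below. It only uses the keys mentioned above and makes no claim about the
exact string that comes out of the reconstruction.

\starttyping
local str  = "foo://example.com:2010/alpha/beta?gamma=delta#epsilon"
local hash = url.hashed(str)

-- the individual components
print(hash.scheme, hash.authority, hash.path, hash.query, hash.fragment)

-- the untouched input and the scheme indicator
print(hash.original, hash.noscheme)

-- and back to a string again
print(url.construct(hash))
\stoptyping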
2168
2169\startasciimode
2170\ShowLuaExampleTwo {url} {hashed} {"foo://example.com:2010/alpha/beta?gamma=delta#epsilon"}
2171\ShowLuaExampleTwo {url} {hashed} {"alpha/beta"}
2172\ShowLuaExampleTwo {url} {split}  {"foo://example.com:2010/alpha/beta?gamma=delta#epsilon"}
2173\ShowLuaExampleTwo {url} {split}  {"alpha/beta"}
2174\stopasciimode

\stopsummary
2175
2176\startsummary[title={hasscheme addscheme filename query}]
2177
There are a couple of helpers and their names speak for themselves:
2179
2180\starttyping
2181url.hasscheme(str)
2182url.addscheme(str,scheme)
2183url.filename(filename)
2184url.query(str)
2185\stoptyping
2186
2187\ShowLuaExampleThree {url} {hasscheme} {"http://www.pragma-ade.com/cow.png"}
2188\ShowLuaExampleThree {url} {hasscheme} {"www.pragma-ade.com/cow.png"}
2189\ShowLuaExampleThree {url} {addscheme} {"www.pragma-ade.com/cow.png","http://"}
2190\ShowLuaExampleThree {url} {addscheme} {"www.pragma-ade.com/cow.png"}
2191\ShowLuaExampleThree {url} {filename}  {"http://www.pragma-ade.com/cow.png"}
2192\ShowLuaExampleTwo   {url} {query}     {"a=b&c=d"}

\stopsummary
2193
2194\stopsection
2195
2196\startsection[title=OS]
2197
2198\startsummary[title={[lua luatex] env setenv getenv}]
2199
2200In \CONTEXT\ normally you will use the resolver functions to deal with the
2201environment and files. However, a more low level interface is still available.
2202You can query and set environment variables with two functions. In addition there
2203is the \type {env} table as interface to the environment. This threesome replaces
2204the built in functions.
2205
2206\starttyping
2207os.setenv(key,value)
2208os.getenv(key)
2209os.env[key]
2210\stoptyping
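
The three interfaces can be combined as in the following sketch, which assumes
the overloaded functions that come with \CONTEXT. The variable name \type
{DEMOVALUE} is made up.

\starttyping
-- DEMOVALUE is a hypothetical variable name
os.setenv("DEMOVALUE", "some value")

local one = os.getenv("DEMOVALUE") -- via the function
local two = os.env["DEMOVALUE"]    -- via the table

print(one, two)
\stoptyping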
2211
2212\stopsummary
2213
2214\startsummary[title={[lua] execute}]
2215
There are several functions for running programs. One comes directly from \LUA,
the others come with \LUATEX. All of them are overloaded in \CONTEXT\ in
order to get more control.
2219
2220\starttyping
2221os.execute(...)
2222\stoptyping
2223
2224\stopsummary
2225
2226\startsummary[title={[luatex] spawn exec}]
2227
2228Two other runners are:
2229
2230\starttyping
2231os.spawn(...)
2232os.exec (...)
2233\stoptyping
2234
2235The \type {exec} variant will transfer control from the current process to the
2236new one and not return to the current job. There is a more detailed explanation
2237in the \LUATEX\ manual.
2238
2239\stopsummary
2240
2241\startsummary[title={resultof launch}]
2242
2243The following function runs the command and returns the result as string.
2244Multiple lines are combined.
2245
2246\starttyping
2247os.resultof(command)
2248\stoptyping
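
For example, assuming that an \type {echo} command is available (as it is in
the usual shells) and that the \CONTEXT\ libraries are loaded:

\starttyping
-- run a command and catch what it writes to the console
local result = os.resultof("echo hello")

-- the output comes back as one string, possibly with a trailing newline
print(result)
\stoptyping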
2249
2250The next one launches a file assuming that the operating system knows
2251what application to use.
2252
2253\starttyping
2254os.launch(str)
2255\stoptyping
2256
2257\stopsummary
2258
2259\startsummary[title={type name platform libsuffix binsuffix}]
2260
2261There are a couple of strings that reflect the current machinery: \type {type}
2262returns either \type {windows} or \type {unix}. The variable \type {name} is more
2263detailed: \type {windows}, \type {msdos}, \type {linux}, \type {macosx}, etc. If
2264you also want the architecture you can consult \type {platform}.
2265
2266\starttyping
2267local t = os.type
2268local n = os.name
2269local p = os.platform
2270\stoptyping
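
If you do need them, a test could look like the following sketch. The
environment variables and fallback paths are just examples and not something
that \CONTEXT\ itself prescribes.

\starttyping
-- choose a platform specific default; the directory names are just examples
local tempdir
if os.type == "windows" then
    tempdir = os.getenv("TEMP") or "c:/temp"
else
    tempdir = os.getenv("TMPDIR") or "/tmp"
end

print(os.name, os.platform, tempdir)
\stoptyping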
2271
2272These three variables as well as the next two are used internally and normally
2273they are not needed in your applications as most functions that matter are aware
2274of what platform specific things they have to deal with.
2275
2276\starttyping
2277local s = os.libsuffix
2278local b = os.binsuffix
2279\stoptyping
2280
These are strings, not functions.
2282
2283\ShowLuaExampleFive {os} {type}
2284\ShowLuaExampleFive {os} {name}
2285\ShowLuaExampleFive {os} {platform}
2286\ShowLuaExampleFive {os} {libsuffix}
2287\ShowLuaExampleFive {os} {binsuffix}
2288
2289\stopsummary
2290
2291\startsummary[title={[lua] time}]
2292
2293The built in time function returns a number. The accuracy is
2294implementation dependent and not that large.
2295
2296\ShowLuaExampleThree {os} {time} {}
2297
2298\stopsummary
2299
2300\startsummary[title={[luatex] times gettimeofday}]
2301
Although \LUA\ has a built in \type {os.time} function, we normally will use the
one provided by \LUATEX\ as it is more precise:
2304
2305\starttyping
2306os.gettimeofday()
2307\stoptyping
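
A common use is measuring how long something takes. The loop in the next sketch
is only a stand in for real work and the example assumes that the function
returns the time in seconds as a floating point number.

\starttyping
local start = os.gettimeofday()

-- some work to measure
local sum = 0
for i=1,1000000 do
    sum = sum + i
end

local elapsed = os.gettimeofday() - start

print(string.format("%0.6f seconds", elapsed))
\stoptyping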
2308
2309% % This one is gone in luametatex:
2310%
2311% There is also a more extensive variant:
2312%
2313% \starttyping
2314% os.times()
2315% \stoptyping
2316%
2317% This one is platform dependent and returns a table with \type {utime} (use time),
2318% \type {stime} (system time), \type {cutime} (children user time), and \type
2319% {cstime} (children system time).
2320
\ShowLuaExampleThree {os} {gettimeofday} {}
%ShowLuaExampleTwo   {os} {times}        {}

\stopsummary
2325
2326\startsummary[title={runtime}]
2327
2328More interesting is:
2329
2330\starttyping
2331os.runtime()
2332\stoptyping
2333
2334which returns the time spent in the application so far.
2335
2336\ShowLuaExampleThree {os} {runtime} {}
2337
2338Sometimes you need to add the timezone to a verbose time and the following
2339function does that for you.
2340
2341\starttyping
2342os.timezone(delta)
2343\stoptyping
2344
2345\ShowLuaExampleThree {os} {timezone} {}
2346\ShowLuaExampleThree {os} {timezone} {1}
2347\ShowLuaExampleThree {os} {timezone} {-1}
2348
2349\stopsummary
2350
2351\startsummary[title={uuid}]
2352
2353A version 4 UUID can be generated with:
2354
2355\starttyping
2356os.uuid()
2357\stoptyping
2358
2359The generator is good enough for our purpose.
2360
2361\ShowLuaExampleThree {os} {uuid} {}
2362
2363\stopsummary
2364
2365\stopsection
2366
2367\stopchapter
2368
2369\stopcomponent
2370