% lowlevel-grouping.tex / size: 13 Kb / last modification: 2023-12-21 09:43
% language=us runpath=texruns:manuals/lowlevel

\environment lowlevel-style

\startdocument
  [title=grouping,
   color=middlecyan]

\startsectionlevel[title=Introduction]

This is a rather short explanation. I decided to write it after presenting the
other topics at the 2019 \CONTEXT\ meeting where there was a question about
grouping.

% \stopsectionlevel

\startsectionlevel[title=\PASCAL]

In a language like \PASCAL, the language that \TEX\ has been written in, or
\MODULA, its successor, there is no concept of grouping like in \TEX. But we can
find keywords that suggest this:

\starttyping
for i := 1 to 10 do begin ... end
\stoptyping

This language probably inspired some of the syntax of \TEX\ and \METAPOST. For
instance an assignment in \METAPOST\ uses \type {:=} too. However, the \type
{begin} and \type {end} don't really group but define a block of statements. You
can have local variables in a procedure or function but the block is just a way
to pack a sequence of statements.

\stopsectionlevel

\startsectionlevel[title=\TEX]

In \TEX\ macros (or source code) the following can occur:

\starttyping[option=TEX]
\begingroup
    ...
\endgroup
\stoptyping

as well as:

\starttyping[option=TEX]
\bgroup
    ...
\egroup
\stoptyping

Here we really group in the sense that assignments to variables inside a group
are forgotten afterwards. All assignments are local to the group unless they are
explicitly done global:

\starttyping[option=TEX]
\scratchcounter=1
\def\foo{foo}
\begingroup
    \scratchcounter=2
    \global\globalscratchcounter=2
    \gdef\foo{FOO}
\endgroup
\stoptyping

Here \type {\scratchcounter} is still one after the group is left but its global
counterpart is now two. The \type {\foo} macro is also changed globally.

Although you can use both sets of commands to group, you cannot mix them, so this
will trigger an error:

\starttyping[option=TEX]
\bgroup
\endgroup
\stoptyping

The bottom line is: if you want a value to persist after the group, you need to
explicitly change its value globally. This makes a lot of sense from the
perspective of \TEX.

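A common way to apply this rule is to do the work locally and export only the
outcome. The next lines are a minimal sketch of that idea, where \type
{\MyResult} is a hypothetical name that is not defined elsewhere in this
document:

\starttyping[option=TEX]
\begingroup
    \scratchcounter=\numexpr 6*7\relax   % local, forgotten at \endgroup
    \xdef\MyResult{\the\scratchcounter}  % global and expanded right now
\endgroup
\MyResult % still available here and gives 42
\stoptyping
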
\stopsectionlevel

\startsectionlevel[title=\METAPOST]

The \METAPOST\ language also has a concept of grouping but in this case it works
more like in a programming language.

\starttyping[option=MP]
begingroup ;
    n := 123 ;
endgroup ;
\stoptyping

In this case the value of \type {n} is 123 after the group is left, unless you do
this (for numerics there is actually no need to declare them):

\starttyping[option=MP]
begingroup ;
    save n ; numeric n ; n := 123 ;
endgroup ;
\stoptyping

Given the use of \METAPOST\ (read: \METAFONT) this makes a lot of sense: often
you use macros to simplify code and you do want variables to change. Grouping in
this language serves other purposes, like hiding what is between these commands
and letting the last expression become the result. In a \type {vardef} grouping
is implicit.

So, in \METAPOST\ all assignments are global, unless a variable is explicitly
saved inside a group.

\stopsectionlevel

\startsectionlevel[title=\LUA]

In \LUA\ all assignments are global unless a variable is defined local:

\starttyping[option=LUA]
local x = 1
local y = 1
for i = 1, 10 do
    local x = 2
    y = 2
end
\stoptyping

Here the value of \type {x} after the loop is still one but \type {y} is now two.
Because in \LUATEX\ we mix \TEX, \METAPOST\ and \LUA\ you can easily mix up these
concepts. Another mixup is using \type {:=}, \type {endfor} or \type {fi} in
\LUA\ after having done some \METAPOST\ coding, or using \type {end} instead of
\type {endfor} in \METAPOST, which can make the library wait for more input
without triggering an error. Proper syntax highlighting in an editor clearly
helps.

\stopsectionlevel

\startsectionlevel[title=\CCODE]

The \LUA\ language is a mix between \PASCAL\ (which is one reason why I like it)
and \CCODE.

\starttyping
int x = 1 ;
int y = 1 ;
for (int i = 1; i <= 10; i++) {
    int x = 2 ; /* a new x that is local to the block */
    y = 2 ;     /* changes the outer y */
}
\stoptyping

The semicolon is also used in \PASCAL\ but there it is a separator and not a
statement end, while in \METAPOST\ it does end a statement (expression).

\stopsectionlevel

\stopsectionlevel

\startsectionlevel[title=Kinds of grouping]

Explicit grouping is accomplished by the two grouping primitives:

\starttyping[option=TEX]
\begingroup
    \sl render slanted here
\endgroup
\stoptyping

However, often you will find this being used:

\starttyping[option=TEX]
{\sl render slanted here}
\stoptyping

This is not only more compact but also avoids the \type {\endgroup} gobbling
following spaces when used inline. The next code is equivalent but also suffers
from the gobbling:

\starttyping[option=TEX]
\bgroup
    \sl render slanted here
\egroup
\stoptyping

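The gobbling follows from the way \TEX\ reads its input: spaces after a control
word like \type {\endgroup} are skipped, while a space after a right brace is
kept. A minimal sketch of the difference when grouping inline:

\starttyping[option=TEX]
one \begingroup\sl two\endgroup three   % typesets: one twothree
one \begingroup\sl two\endgroup\ three  % an explicit space repairs it
one {\sl two} three                     % typesets: one two three
\stoptyping
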
The \type {\bgroup} and \type {\egroup} commands are not primitives but aliases
(made by \type {\let}) to the left and right curly brace. These two characters
have so-called category codes that signal that they can be used for grouping. The
{\em can be} here suggests that there are other purposes and indeed there are,
for instance in:

\starttyping[option=TEX]
\toks 0 = {abs}
\hbox {def}
\stoptyping

In the case of a token list assignment the curly braces fence the assignment, so
scanning stops when a matching right brace is found. The following are all valid:

\starttyping[option=TEX]
\toks 0 = {a{b}s}
\toks 0 = \bgroup a{b}s}
\toks 0 = {a{\bgroup b}s}
\toks 0 = {a{\egroup b}s}
\toks 0 = \bgroup a{\bgroup b}s}
\toks 0 = \bgroup a{\egroup b}s}
\stoptyping

They have in common that the final fence has to be a right brace. That the first
one can be an alias is due to the fact that the scanner searches for a brace
equivalent when it looks for the value. Because the equal sign is optional, there
is some lookahead involved, which involves expansion and possibly push back,
while once scanning for the content starts just tokens are collected, with a fast
check for nested and final braces.

In the case of the box, all these specifications are valid:

\starttyping[option=TEX]
\hbox {def}
\hbox \bgroup def\egroup
\hbox \bgroup def}
\hbox \bgroup d{e\egroup f}
\hbox {def\egroup
\stoptyping

This is because now the braces and their equivalents act as grouping symbols, so
as long as they match we're fine. There is a pitfall here: you cannot mix and
match different kinds of grouping, so each of the next lines issues an error:

\starttyping[option=TEX]
\bgroup xxx\endgroup   % error
\begingroup xxx\egroup % error
\stoptyping

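The restriction matters most when a group is opened in one macro and closed in
another, because the closing macro has to know which kind was used. A sketch
with two hypothetical macros (the names are made up for this illustration):

\starttyping[option=TEX]
\def\StartMyBox{\hbox\bgroup\sl} % opens the box and its group with \bgroup
\def\StopMyBox {\egroup}         % has to match: \endgroup would be an error

\StartMyBox def\StopMyBox
\stoptyping
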
This can make it somewhat hard to write generic grouping macros without trickery
that is not always obvious to the user. Fortunately it can be hidden in macros
like the helper \typ {\groupedcommand}. In \LUAMETATEX\ we have a clean way out
of this dilemma:

\starttyping[option=TEX]
\beginsimplegroup xxx\endsimplegroup
\beginsimplegroup xxx\endgroup
\beginsimplegroup xxx\egroup
\stoptyping

When you start a group with \typ {\beginsimplegroup} you can end it in the three
ways shown above. This means that the user (or calling macro) doesn't need to
take into account what kind of grouping was used to start with.

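As a sketch of how this helps, assume a hypothetical macro \type
{\StartMySlanted} that opens a group on behalf of the user. Because it starts
with \typ {\beginsimplegroup}, whoever closes the group is free to choose the
closing command. This only works in \LUAMETATEX:

\starttyping[option=TEX]
\def\StartMySlanted{\beginsimplegroup\sl} % hypothetical helper

\StartMySlanted one\endsimplegroup
\StartMySlanted two\endgroup
\StartMySlanted three\egroup
\stoptyping
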
When we are in math mode things are different. First of all, grouping with \typ
{\begingroup} and \typ {\endgroup} in some cases works as expected, but because
the math input is converted into a list that gets processed later some settings
can become persistent, like changes in style or family. It is better to use \typ
{\beginmathgroup} and \typ {\endmathgroup} as they restore some properties. We
also just mention the \type {\frozen} prefix that can be used to freeze
assignments to some math specific parameters inside a group.

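A sketch of the difference, under the assumption that the math style is among
the restored properties (the text above names style and family changes as the
typical culprits):

\starttyping[option=TEX]
$ a \begingroup     \scriptstyle b \endgroup     c $ % the style can leak into c
$ a \beginmathgroup \scriptstyle b \endmathgroup c $ % the style stays inside
\stoptyping
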
\stopsectionlevel

\startsectionlevel[title=Hooks]

In addition to the original \type {\aftergroup} primitive we have some more
hooks. They can best be demonstrated with an example:

\startbuffer
\begingroup \bf
    %
    \aftergroup   A \aftergroup   1
    \atendofgroup B \atendofgroup 1
    %
    \aftergrouped   {A2}
    \atendofgrouped {B2}
    %
    test
\endgroup
\stopbuffer

\typebuffer[option=TEX]

These collectors are accumulative. Watch how the bold is applied to what we
inject before the group ends.

\getbuffer

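The classic use of \type {\aftergroup} is to schedule a single token that does
its work once the group has been closed. A minimal sketch, with \type {\MyNote}
as a hypothetical macro:

\starttyping[option=TEX]
\def\MyNote{ (done)}       % hypothetical macro
\begingroup \bf
    \aftergroup\MyNote     % saved now, inserted right after \endgroup
    bold text%
\endgroup
\stoptyping

Here the note is typeset after the group has ended and therefore not in bold.
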
\stopsectionlevel

\startsectionlevel[title=Local versus global]

When \TEX\ enters a group and an assignment is made the current value is stored
on the save stack, and at the end of the group the original value is restored. In
\LUAMETATEX\ this mechanism is made a bit more efficient by avoiding redundant
stack entries. This is also why the next feature can give unexpected results when
not used wisely.

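To make the save stack a bit more concrete, here is a small sketch with nested
groups. It uses the \type {\scratchdimen} register and only illustrates the
classic save and restore behaviour, not the optimization:

\starttyping[option=TEX]
\scratchdimen 1pt
\begingroup
    \scratchdimen 2pt       % 1pt goes on the save stack
    \begingroup
        \scratchdimen 3pt   % 2pt goes on the save stack
    \endgroup               % 2pt is restored
    \the\scratchdimen       % 2.0pt
\endgroup                   % 1pt is restored
\the\scratchdimen           % 1.0pt
\stoptyping
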
Now consider the following example:

\startbuffer
\newdimension\MyDimension

\starttabulate[||||]
    \NC         \MyDimension10pt \the\MyDimension
    \NC \advance\MyDimension10pt \the\MyDimension
    \NC \advance\MyDimension10pt \the\MyDimension \NC \NR
    \NC         \MyDimension10pt \the\MyDimension
    \NC \advance\MyDimension10pt \the\MyDimension
    \NC \advance\MyDimension10pt \the\MyDimension \NC \NR
\stoptabulate
\stopbuffer

\typebuffer[option=TEX]  \getbuffer

The reason why we get the same values is that each cell is a group and therefore
the value gets restored as we move on. We can use the \type {\global} prefix to
get around this:

\startbuffer
\starttabulate[||||]
    \NC \global        \MyDimension10pt \the\MyDimension
    \NC \global\advance\MyDimension10pt \the\MyDimension
    \NC \global\advance\MyDimension10pt \the\MyDimension \NC \NR
    \NC \global        \MyDimension10pt \the\MyDimension
    \NC \global\advance\MyDimension10pt \the\MyDimension
    \NC \global\advance\MyDimension10pt \the\MyDimension \NC \NR
\stoptabulate
\stopbuffer

\typebuffer[option=TEX]  \getbuffer

Instead of using a global assignment and increment we can also use the following:

\startbuffer
\constrained\MyDimension\zeropoint
\starttabulate[||||]
    \NC \retained        \MyDimension10pt \the\MyDimension
    \NC \retained\advance\MyDimension10pt \the\MyDimension
    \NC \retained\advance\MyDimension10pt \the\MyDimension \NC \NR
    \NC \retained        \MyDimension10pt \the\MyDimension
    \NC \retained\advance\MyDimension10pt \the\MyDimension
    \NC \retained\advance\MyDimension10pt \the\MyDimension \NC \NR
\stoptabulate
\stopbuffer

\typebuffer[option=TEX]  \getbuffer

So what is the difference with the global approach? Say we have these two buffers:

\startbuffer
\startbuffer[one]
    \global\MyDimension\zeropoint
    \framed {
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
    }
    \framed {
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
    }
\stopbuffer

\startbuffer[two]
    \global\MyDimension\zeropoint
    \framed {
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \getbuffer[one]
    }
    \framed {
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \framed {\global\advance\MyDimension10pt \the\MyDimension}
        \getbuffer[one]
    }
\stopbuffer
\stopbuffer

\typebuffer[option=TEX] \getbuffer

Typesetting the second buffer gives us:

\startlinecorrection
\getbuffer[two]
\stoplinecorrection

When we want to have these entities independent and not use different variables,
the global settings bleeding from one into the other entity is messy. Therefore
we can use this:

\startbuffer
\startbuffer[one]
    \constrained\MyDimension\zeropoint
    \framed {
        \framed {\retained        \MyDimension10pt \the\MyDimension}
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
    }
    \framed {
        \framed {\retained        \MyDimension10pt \the\MyDimension}
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
    }
\stopbuffer

\startbuffer[two]
    \constrained\MyDimension\zeropoint
    \framed {
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
        \getbuffer[one]
    }
    \framed {
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
        \framed {\retained\advance\MyDimension10pt \the\MyDimension}
        \getbuffer[one]
    }
\stopbuffer
\stopbuffer

\typebuffer[option=TEX] \getbuffer

Now we get this:

\startlinecorrection
\getbuffer[two]
\stoplinecorrection

The \type {\constrained} prefix makes sure that we have a stack entry, without
being clever with respect to the current value. Then the \type {\retained} prefix
can do its work reliably and avoid pushing the old value on the stack. Without
the constraint it gets a bit unpredictable because then it all depends on where
further up the chain the value was put on the stack. Of course one can argue that
we should not have the \quotation {save stack redundant entries optimization} but
that's not going to be removed.

\stopsectionlevel

\startsectionlevel[title=Files]

Although it doesn't really fit in this chapter, here are some hooks into processing
files:

\starttyping[option=TEX]
Hello World!\atendoffiled         {\writestatus{FILE}{ATEOF B #1}}\par
Hello World!\atendoffiled         {\writestatus{FILE}{ATEOF A #1}}\par
Hello World!\atendoffiled reverse {\writestatus{FILE}{ATEOF C #1}}\par
Hello World!\begingroup\sl \atendoffiled {\endgroup}\par
\stoptyping

Inside a file you can register tokens that will be expanded when the file ends.
You can also do that beforehand using a variant of the \type {\input} primitive:

\starttyping[option=TEX]
\eofinput {\writestatus{FILE}{DONE}} {thatfile.tex}
\stoptyping

This feature is mostly there for consistency with the hooks into groups and
paragraphs but also because \type {\everyeof} is kind of useless given that one
never knows beforehand if a file loads another file. The hooks mentioned above
are bound to the current file.

\stopsectionlevel

\stopdocument

458