% language=us runpath=texruns:manuals/evenmore

% This feature was done mid May 2020 with Alien Chatter (I really need to find the
% original cd's) in the background.

\environment evenmore-style

\startcomponent evenmore-parameters

\startchapter[title=Parameters]

When \TEX\ reads input it either does something directly, like setting a
register, loading a font, turning a character into a glyph node, packaging a box,
or it sort of collects tokens and stores them somehow, in a macro (definition),
in a token register, or someplace temporary to inject them into the input later.
Here we'll be discussing macros, which have a special token list containing the
preamble defining the arguments and a body doing the real work. For instance when
you say:

\starttyping[option=TEX]
\def\foo#1#2{#1 + #2 + #1 + #2}
\stoptyping

the macro \type {\foo} is stored in such a way that it knows how to pick up the
two arguments and when expanding the body, it will inject the collected arguments
each time a reference like \type {#1} or \type {#2} is seen. In fact, quite
often, \TEX\ pushes a list of tokens (like an argument) in the input stream and
then detours in taking tokens from that list. Because \TEX\ does all its memory
management itself the price of all that copying is not that high, although during
a long and more complex run the individual tokens that make the forward linked
list of tokens get scattered in token memory and memory access is still the
bottleneck in processing.

A somewhat simplified view of how a macro like this gets stored is the following:

\starttyping
hash entry "foo" with property "macro call" =>

    match (# property stored)
    match (# property stored)
    end of match

    match reference 1
    other character +
    match reference 2
    other character +
    match reference 1
    other character +
    match reference 2
\stoptyping
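
As a quick check of this simplified picture, one can ask \TEX\ to serialize the
stored macro and then expand it once. This is only a minimal sketch: the
arguments \type {a} and \type {b} are arbitrary, and the exact rendering of the
serialized meaning can differ a little between engines.

\starttyping[option=TEX]
\def\foo#1#2{#1 + #2 + #1 + #2}

\meaning\foo % typesets: macro:#1#2->#1 + #2 + #1 + #2
\foo{a}{b}   % expands to: a + b + a + b
\stoptyping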

When a macro gets expanded, the scanner first collects all the passed arguments
and then pushes those (in this case two) token lists on the parameter stack. Keep
in mind that due to nesting many kinds of stacks play a role. When the body gets
expanded and a reference is seen, the argument that it refers to gets injected
into the input, so imagine that we have this definition:

\starttyping[option=TEX]
\def\foo#1#2{\ifdim\dimen0=0pt #1\else #2\fi}
\stoptyping

and we say:

\starttyping[option=TEX]
\foo{yes}{no}
\stoptyping

then it's as if we had typed:

\starttyping[option=TEX]
\ifdim\dimen0=0pt yes\else no\fi
\stoptyping

So, you'd better not have something in the arguments that messes up the condition
parser! From the perspective of an expansion machine it all makes sense. But it
also means that when arguments are not used, they still get parsed and stored.
Imagine using this one:

\starttyping[option=TEX]
\def\foo#1{\iffalse#1\oof#1\oof#1\oof#1\oof#1\fi}
\stoptyping

When \TEX\ sees that the condition is false it will enter a fast scanning mode
where it only looks at condition related tokens, so even if \type {\oof} is not
defined this will work ok:

\starttyping[option=TEX]
\foo{!}
\stoptyping

But when we say this:

\starttyping[option=TEX]
\foo{\else}
\stoptyping

It will bark! This is because each \type {#1} reference will be resolved, so we
effectively have

\starttyping[option=TEX]
\def\foo#1{\iffalse\else\oof\else\oof\else\oof\else\oof\else\fi}
\stoptyping

which is not good. On the other hand, since no expansion takes place in quick
parsing mode, this will work:

\starttyping[option=TEX]
\def\oof{\else}
\foo\oof
\stoptyping

which actually is:

\starttyping[option=TEX]
\def\foo#1{\iffalse\oof\oof\oof\oof\oof\oof\oof\oof\oof\fi}
\stoptyping

So, a reference to an argument effectively is just a replacement. As long as you
keep that in mind, and realize that while \TEX\ is skipping \quote {if} branches
nothing gets expanded, you're okay.
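
The following sketch illustrates both halves of that rule. The names \type
{\test} and \type {\notdefinedanywhere} are made up for this example, and the
latter is assumed to be undefined.

\starttyping[option=TEX]
\def\test#1{\iffalse#1\fi}

\test{\notdefinedanywhere} % fine: the token is skipped, never expanded

\def\oof{\else}
\test{\oof}                % fine too: \oof is skipped as-is, no \else is seen
\stoptyping

In both calls the argument is still collected and stored before the body is
processed; it is only the expansion that never happens.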

Most users will associate the \type {#} character with macro arguments or
preambles in low level alignments, but since most macro packages provide a higher
level set of table macros the latter is less well known. But, as often with
characters in \TEX, you can do magic things:

\starttyping[option=TEX]
\catcode`?=\catcode`#

\def\foo #1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\def\foo ?1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\def\foo ?1?2#3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\stoptyping

Here the question mark also indicates a macro argument. However, when expanded
we see this as result:

\starttyping
macro:#1#2?3->?1?2?3 =>123
macro:?1#2?3->?1?2?3 =>123
macro:?1?2#3->#1#2#3 =>123
\stoptyping

The last used argument signal character (officially called a match character,
here we have two that fit that category, \type {#} and \type {?}) is used in the
serialization! Now, there is an interesting aspect here. When \TEX\ stores the
preamble, as in our first example:

\starttyping
    match (# property stored)
    match (# property stored)
    end of match
\stoptyping

the property is stored, so in the later example we get:

\starttyping
    match (# property stored)
    match (# property stored)
    match (? property stored)
    end of match
\stoptyping

But in the macro body the number is stored instead, because we need it as
reference to the parameter, so when that bit gets serialized \TEX\ (or more
accurately: \LUATEX, which is what we're using here) doesn't know what specific
signal was used. When the preamble is serialized it does keep track of the last
so|-|called match character. This is why we see this inconsistency in rendering.

A simple solution would be to store the used signal for the match argument, which
probably only takes a few lines of extra code (using a nine integer array instead
of a single integer), and use that instead. I'm willing to see that as a bug in
\LUATEX\ but when I ran into it I was playing with something else: adding the
ability to prevent storing unused arguments. But the resulting confusion can make
one wonder why we do not always serialize the match character as \type {#}.

It was then that I noticed that the preamble stored the match tokens and not the
number and that \TEX\ in fact assumes that no mixture is used. And, after
prototyping that in itself trivial change I decided that in order to properly
serialize this new feature it also made sense to always serialize the match token
as \type {#}. I simply prefer consistency over confusion and so I caught two
flies in one stroke. The new feature is indicated with a \type {#0} parameter:

\startbuffer

\bgroup
\catcode`?=\catcode`#

\def\foo ?1?0?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1#0?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo #1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1?2#3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\egroup
\stopbuffer

\typebuffer[option=TEX]

\start
\getbuffer
\stop

So, what is the rationale behind this new \type{#0} variant? Quite often you
don't want to do something with an argument at all. This happens when a macro
acts upon for instance a first argument and then expands another macro that
follows up but only deals with one of many arguments and discards the rest. Then
it makes no sense to store unused arguments. Keep in mind that in order to use it
more than once an argument does need to be stored, because the parser only looks
forward. In principle there could be some optimization in case the tokens come
from macros but we leave that for now. So, when we don't need an argument, we can
avoid storing it and just skip over it. Consider the following:

\startbuffer
\def\foo     #1{\ifnum#1=1 \expandafter\fooone\else\expandafter\footwo\fi}
\def\fooone#1#0{#1}
\def\footwo#0#2{#2}
\foo{1}{yes}{no}
\foo{0}{yes}{no}
\stopbuffer

\typebuffer[option=TEX]

We get:

\getbuffer
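
The same idea gives cheap \quote {gobbling} helpers. The sketch below assumes an
engine that accepts \type {#0} (\LUAMETATEX, according to this chapter); the
macro names are invented for the occasion and are not part of \CONTEXT.

\starttyping[option=TEX]
\def\dropone     #0{}     % skip one argument, store nothing
\def\keepthird#0#0#3{#3}  % skip two arguments, keep the third

\dropone{ignored}kept     % leaves: kept
\keepthird{a}{b}{c}       % leaves: c
\stoptyping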

Just for the record, tracing of a macro shows that indeed there is no argument
stored:

\starttyping[option=TEX]
\def\foo#1#0#3{....}
\foo{11}{22}{33}
\foo #1#0#3->....
#1<-11
#2<-
#3<-33
\stoptyping

Now, you can argue, what is the benefit of not storing tokens? As mentioned
above, the \TEX\ engines do their own memory management. \footnote {An added
benefit is that dumping and undumping is relatively efficient too.} This has
large benefits in performance especially when one keeps in mind that tokens get
allocated and are recycled constantly (take only lookahead and push back).

However, even if this means that storing a couple of unused arguments doesn't put
much of a dent in performance, it does mean that a token sits somewhere in memory
and that this bit of memory needs to get accessed. Again, this is no big deal on
a computer where a \TEX\ job can take one core and basically is the only process
fighting for \CPU\ cache usage. But less memory access might be more relevant in
a scenario of multiple virtual machines running on the same hardware or multiple
\TEX\ processes on one machine. I didn't carefully measure that so I might be
wrong here. Anyway, it's always good to avoid moving around data when there is no
need for it.

Just to temper expectations with respect to performance, here are some examples:

\starttyping[option=TEX]
\catcode`!=9 % ignore this character
\firstoftwoarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\secondoftwoarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\secondoffourarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\stoptyping

In \CONTEXT\ we define these macros as follows:

\starttyping[option=TEX]
\def\firstoftwoarguments      #1#2{#1}
\def\secondoftwoarguments     #1#2{#2}
\def\secondoffourarguments#1#2#3#4{#2}
\stoptyping

The performance of 2 million expansions is the following (times in seconds,
probably half or less on a more modern machine):

\starttabulate[||||]
\BC macro                          \BC total \BC step        \NC \NR
\NC \type {\firstoftwoarguments}   \NC 0.245 \NC 0.000000123 \NC \NR
\NC \type {\secondoftwoarguments}  \NC 0.251 \NC 0.000000126 \NC \NR
\NC \type {\secondoffourarguments} \NC 0.390 \NC 0.000000195 \NC \NR
\stoptabulate

But we could use this instead:

\starttyping[option=TEX]
\def\firstoftwoarguments      #1#0{#1}
\def\secondoftwoarguments     #0#2{#2}
\def\secondoffourarguments#0#2#0#0{#2}
\stoptyping

which gives:

\starttabulate[||||]
\BC macro                          \BC total \BC step        \NC \NR
\NC \type {\firstoftwoarguments}   \NC 0.229 \NC 0.000000115 \NC \NR
\NC \type {\secondoftwoarguments}  \NC 0.236 \NC 0.000000118 \NC \NR
\NC \type {\secondoffourarguments} \NC 0.323 \NC 0.000000162 \NC \NR
\stoptabulate

So, no impressive differences, especially when one considers that when that many
expansions happen in a run, getting the document itself rendered plus expanding
real arguments (not something defined to be ignored) will take way more time
compared to this. I always test an extension like this on the test suite
\footnote {Currently some 1600 files that take 24 minutes plus or minus 30
seconds to process on a high end 2013 laptop. The 260 page manual with lots of
tables, verbatim and \METAPOST\ images takes around 11 seconds. A few
milliseconds more or less don't really show here. I only time these runs because
I want to make sure that there are no dramatic consequences.} as well as the
\LUAMETATEX\ manual (which takes about 11 seconds) and although one can notice a
little gain, it makes more sense not to play music on the same machine as we run
the \TEX\ job, if gaining milliseconds is that important. But, as said, it's more
about unnecessary memory access than about \CPU\ cycles.
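
For those who want to repeat such a measurement, a loop like the following is
probably enough. It is a sketch that assumes the \CONTEXT\ helper \type
{\testfeatureonce} is available, which runs its second argument the given number
of times and reports the elapsed time on the console; it is not necessarily how
the numbers in the tables were obtained, and the results will differ per machine.

\starttyping[option=TEX]
\catcode`!=9 % ignore this character

\testfeatureonce{2000000}{\secondoffourarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}}
\stoptyping

The step column in the tables is simply the total divided by the number of
expansions, so $0.390 / 2\,000\,000 \approx 0.000000195$, and the gain for the
four argument case is about $(0.390 - 0.323) / 0.390 \approx 17\%$.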

This extension is downward compatible and its overhead can be neglected. Okay,
the serialization now always uses \type {#} but it was inconsistent before, so
I'm willing to sacrifice that (and I'm pretty sure no \CONTEXT\ user cares or
will even notice). Also, it's only in \LUAMETATEX\ (for now) so that other macro
packages don't suffer from this patch. The few cases where \CONTEXT\ can benefit
from it are easy to isolate for \MKIV\ and \LMTX\ so we can support \LUATEX\ and
\LUAMETATEX.

I mentioned \LUATEX\ and how it serializes, but for the record, let's see how
\PDFTEX, which is very close to original \TEX\ in terms of source code, does it.
If we have this input:

\starttyping[option=TEX]
\catcode`D=\catcode`#
\catcode`O=\catcode`#
\catcode`N=\catcode`#
\catcode`-=\catcode`#
\catcode`K=\catcode`#
\catcode`N=\catcode`#
\catcode`U=\catcode`#
\catcode`T=\catcode`#
\catcode`H=\catcode`#

\def\dek D1O2N3-4K5N6U7T8H9{#1#2#3 #4#6#7#8#9}

{\meaning\dek \tracingall \dek don{}knuth}
\stoptyping

The meaning gets typeset as:

\starttyping
macro:D1O2N3-4K5N6U7T8H9->H1H2H3 H4H6H7H8H9don nuth
\stoptyping

while the tracing reports:

\starttyping
\dek D1O2N3-4K5N6U7T8H9->H1H2H3 H4H6H7H8H9
D1<-d
O2<-o
N3<-n
-4<-
K5<-k
N6<-n
U7<-u
T8<-t
H9<-h
\stoptyping

The reason for the difference, as mentioned, is that the tracing uses the
template and therefore uses the stored match token, while the meaning uses the
reference match tokens that carry the number and at that time has no access to
the original match token. Keeping track of that for the sake of tracing would not
make sense anyway. So, traditional \TEX, which is what \PDFTEX\ is very close to,
uses the last used match token, the \type {H}. Maybe this example can convince
you that dropping that bit of log related compatibility is not that much of a
problem. I just tell myself that I turned an unwanted side effect into a new
feature.
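
A reduced version of this experiment, with just two match characters, shows the
same effect. This is a sketch for a traditional engine such as \PDFTEX\ running
plain \TEX; the macro name \type {\two} is arbitrary.

\starttyping[option=TEX]
\catcode`?=\catcode`#

\def\two ?1#2{?1#2}

\message{\meaning\two} % macro:?1#2->#1#2
\stoptyping

The preamble shows each match character as it was typed, while both references
in the body use \type {#}, the last match character that the serializer saw.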

\subject{A few side notes}

The fact that characters can be given a special meaning is one of the charming
properties of \TEX. Take these two cases:

\starttyping[option=TEX]
\bgroup\catcode`\&=4 &\egroup
\bgroup\catcode`\!=4 !\egroup
\stoptyping

In both lines there is now an alignment character used outside an alignment. And,
in both cases the error message is similar:

\starttyping
! Misplaced alignment tab character &
! Misplaced alignment tab character !
\stoptyping

So, indeed the right character is shown in the message. But, as soon as you ask
for help, there is a difference: in the first case the help is specific for a tab
character, but in the second case a more generic explanation is given. Just try
it.

The reason is an explicit check for the ampersand being used as tab character.
Such is the charm of \TEX. I'll probably opt for a trivial change to be
consistent here, although in \CONTEXT\ the ampersand is just an ampersand so no
user will notice.

There are a few more places where, although in principle any character can serve
any purpose, there are hard coded assumptions, like \type {$} being used for
math, so a missing dollar is reported, even if math started with another
character being used to enter math mode. This makes sense because there is no
urgent need to keep track of what specific character was used for entering math
mode. An even stronger argument could be that \TEX ies expect dollars to be used
for that purpose. Of course this works fine:

\starttyping[option=TEX]
\catcode`€=\catcode`$
€ \sqrt{x^3} €
\stoptyping

But when we forget an \type {€} we get messages like:

\starttyping
! Missing $ inserted
\stoptyping

or more generic:

\starttyping
! Extra }, or forgotten $
\stoptyping

which is definitely a confirmation of \quotation {America first}. Of course we
can compromise in display math because this is quite okay:

\starttyping[option=TEX]
\catcode`€=\catcode`$
$€ \sqrt{x^3} €$
\stoptyping

unless of course we forget the last dollar in which case we are told that

\starttyping
! Display math should end with $$
\stoptyping

so no matter what, the dollar wins. Given how ugly the Euro sign looks I can live
with this, although I always wonder what character would have been taken if \TEX\
had been developed in another country.

\stopchapter

\stopcomponent