onandon-fences.tex /size: 16 Kb    last modification: 2023-12-21 09:43
% language=us

% This feature has been removed because we have different control now in the
% reworked engine so this chapter cannot be processed any longer.

\endinput

\startcomponent onandon-fences

\environment onandon-environment

% avoid context defaults:
%
% \mathitalicsmode   \plusone   % default in context
% \mathdelimitersmode\plusseven % optional in context

\def\UseMode#1{\appendtoks\mathdelimitersmode#1\to\everymathematics}

\startchapter[title={Tricky fences}]

Occasionally one of my colleagues notices some suboptimal rendering and asks me
to have a look at it. Now, one can argue about \quotation {what is right} and
indeed there is not always a best answer to it. Such questions can even be a
nuisance; let's think of the following scenario. You have a project where \TEX\
is practically the only solution. Let it be an \XML\ rendering project, which
means that there are some boundary conditions. Speaking in 2017, we find that in
most cases a project starts out with the assumption that everything is possible.

Often such a project starts with a folio in mind and therefore with decent
tagging to match the educational and esthetic design. When rendering is mostly
automatic and involves too many variants to check every result, some safeguards
are used (an example will be given below). Then different authors, editors and
designers come into play and their expectations, also about what is best, often
conflict. Add to that rendering for the web and for devices, and additional
limitations show up: features get dropped and even more cases need to be
compensated for (the quality standards for paper are often much higher). But all
that defeats the earlier attempts to do well because suddenly it has to match the
lesser format. This in turn makes investing in improving rendering very
inefficient (read: a bottomless pit because it never gets paid and there is no
way to gain back the investment). Quite often it is spacing that triggers
discussions and questions about which rendering is best. And inconsistency
dominates these questions.

So, in case you wonder why I bother with subtle aspects of rendering as discussed
below, the answer is that it is not so much professional demand but users (like
my colleagues or those on the mailing lists) that make me look into it. Often
something that looks trivial takes days to sort out (even for someone who knows
his way around the macro language, fonts and the inner workings of the engine).
And one can be sure that more cases will pop up.

All this being said, let's move on to a recent example. In \CONTEXT\ we support
\MATHML\ although in practice we're forced to a mix of that standard and
\ASCIIMATH. When we're lucky, we even get a mix with good old \TEX-encoded math.
One problem with an automated flow and processing (other than raw \TEX) is that
one can get anything and therefore we need to play it safe. This means for
instance that you can get input like this:

\starttyping
f(x) + f(1/x)
\stoptyping

or in more structured \TEX\ speak:

\startbuffer
$f(x) + f(\frac{1}{x})$
\stopbuffer

\typebuffer

Using \TeX\ Gyre Pagella, this renders as: {\UseMode\zerocount\inlinebuffer}, and
when seeing this a \TEX\ user will resort to:

\startbuffer
$f(x) + f\left(\frac{1}{x}\right)$
\stopbuffer

\typebuffer

which gives: {\UseMode\zerocount \inlinebuffer}. So, in order to be robust we can
always use the \type {\left} and \type {\right} commands, can't we?

\startbuffer
$f(x) + f\left(x\right)$
\stopbuffer

\typebuffer

which gives {\UseMode\zerocount \inlinebuffer}, but let's blow up this result a
bit, showing some additional tracing from left to right, now in Latin Modern:

\startbuffer[blownup]
\startcombination[nx=3,ny=2,after=\vskip3mm]
    {\scale[scale=4000]{\hbox{$f(x)$}}}
        {just characters}
    {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics$f(x)$}}}
        {just characters}
    {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics \showmakeup$f(x)$}}}
        {just characters}
    {\scale[scale=4000]{\hbox{$f\left(x\right)$}}}
        {using delimiters}
    {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics$f\left(x\right)$}}}
        {using delimiters}
    {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics \showmakeup$f\left(x\right)$}}}
        {using delimiters}
\stopcombination
\stopbuffer

\startlinecorrection
\UseMode\zerocount
\switchtobodyfont[modern]\getbuffer[blownup]
\stoplinecorrection

When we visualize the glyphs and kerns we see that there's a space instead of a
kern when we use delimiters. This is because the delimited sequence is processed
as a subformula and injected as a so|-|called inner object and as such gets
spaced according to the ordinal (for the $f$) and inner (\quotation {fenced} with
delimiters $x$) spacing rules. Such a difference normally will go unnoticed but,
as we mentioned, authors, editors and designers are involved, so there's a good
chance that at some point one will magnify a \PDF\ preview and suddenly notice
that the distance between the $f$ and $($ is a bit on the large side for simple
unstacked cases, something that in print is likely to go unnoticed. So, even when
we don't know how to solve this, we do need to have an answer ready.

When I was confronted with this example of rendering I started wondering if there
was a way out. It makes no sense to hard code a negative space before a fenced
subformula because sometimes you don't want that, especially not when there's
nothing before it. So, after some messing around I decided to have a look at the
engine instead. I wondered if we could just give the non|-|scaled fence case the
same treatment as the character sequence.

Unfortunately here we run into the somewhat complex way the rendering takes
place. Keep in mind that it is quite natural from the perspective of \TEX\
because normally a user will explicitly use \type {\left} and \type {\right} as
needed, while in our case the fact that we automate and therefore want a generic
solution interferes (as usual in such cases).

Once read in, the sequence \type {f(x)} can be represented as a list:

\starttyping
list = {
 {
  id = "noad", subtype = "ord", nucleus = {
   {
    id = "mathchar", fam = 0, char = "U+00066",
   },
  },
 },
 {
  id = "noad", subtype = "open", nucleus = {
   {
    id = "mathchar", fam = 0, char = "U+00028",
   },
  },
 },
 {
  id = "noad", subtype = "ord", nucleus = {
   {
    id = "mathchar", fam = 0, char = "U+00078",
   },
  },
 },
 {
  id = "noad", subtype = "close", nucleus = {
   {
    id = "mathchar", fam = 0, char = "U+00029",
   },
  },
 },
}
\stoptyping

The sequence \type {f \left( x \right)} is also a list but now it is a tree (we
leave out some unset keys):

\starttyping
list = {
 {
  id = "noad", subtype = "ord", nucleus = {
   {
    id = "mathchar", fam = 0, char = "U+00066",
   },
  },
 },
 {
  id = "noad", subtype = "inner", nucleus = {
   {
    id = "submlist", head = {
     {
      id = "fence", subtype = "left", delim = {
       {
        id = "delim", small_fam = 0, small_char = "U+00028",
       },
      },
     },
     {
      id = "noad", subtype = "ord", nucleus = {
       {
        id = "mathchar", fam = 0, char = "U+00078",
       },
      },
     },
     {
      id = "fence", subtype = "right", delim = {
       {
        id = "delim", small_fam = 0, small_char = "U+00029",
       },
      },
     },
    },
   },
  },
 },
}
\stoptyping
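The two structures above can also be made visible during a run. The following is
only a sketch and not part of the original experiment: it relies on the
traditional \type {\showlists}, \type {\showboxbreadth}, \type {\showboxdepth}
and \type {\tracingonline} primitives, it assumes that the run may pause at each
\type {\showlists}, and the exact report differs per engine.

\starttyping
\starttext
    \tracingonline  1   % also report on the console
    \showboxbreadth 100 % don't truncate long lists
    \showboxdepth   100 % don't truncate nested lists
    $f(x)\showlists$            % four noads in a row
    $f\left(x\right)\showlists$ % an ordinal followed by one inner noad
\stoptext
\stoptyping

In the second report one would expect the fences and the \type {x} to show up
one level deeper, below the inner noad.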

So, the formula \type {f(x)} is just four characters and stays that way, but with
some inter|-|character spacing applied according to the rules of \TEX\ math. The
sequence \typ {f \left( x \right)} however becomes two components: the \type {f}
is an ordinal noad,\footnote {Noads are the mathematical building blocks.
Eventually they become nodes, the building blocks of paragraphs and boxed
material.} and \typ {\left( x \right)} becomes an inner noad with a list as a
nucleus, which gets processed independently. The way the code is written, this is
(roughly) what happens:

\startitemize
\startitem
    A formula starts; normally this is triggered by one or two dollar signs.
\stopitem
\startitem
    The \type {f} becomes an ordinal noad and \TEX\ goes~on.
\stopitem
\startitem
    A fence is seen with a left delimiter and an inner noad is injected.
\stopitem
\startitem
    That noad has a sub|-|math list that takes the left delimiter up to a
    matching right one.
\stopitem
\startitem
    When all is scanned a routine is called that turns a list of math noads into
    a list of nodes.
\stopitem
\startitem
    So, we start at the beginning, the ordinal \type {f}.
\stopitem
\startitem
    Before moving on, a check is made whether this character needs to be kerned
    with the next one (but here we have an ordinal|-|inner combination).
\stopitem
\startitem
    Then we encounter the subformula (including fences) which triggers a nested
    call to the math typesetter.
\stopitem
\startitem
    The result eventually gets packaged into an hlist and we're back one level up
    (here after the ordinal \type {f}).
\stopitem
\startitem
    Processing a list happens in two passes and, to cut it short, it's the second
    pass that deals with choosing fences and spacing.
\stopitem
\startitem
    Each time a (sub)list is processed, a second pass over that list happens.
\stopitem
\startitem
    So, now \TEX\ will inject the right spaces between pairs of noads.
\stopitem
\startitem
    In our case that is between an ordinal and an inner noad, which is quite
    different from a sequence of ordinals.
\stopitem
\stopitemize
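The effect of the last two steps can be imitated by hand with the traditional
class primitives. This is a sketch for illustration only, not something the
experiment itself does. It assumes default inter|-|atom spacing, where an ordinal
followed by an inner atom gets a thin space in text and display style, while an
ordinal followed by an open atom gets none.

\starttyping
$f(x)$              % ord open ord close : no space after the f
$f\mathinner{(x)}$  % ord inner          : thin space after the f
$f\left(x\right)$   % ord inner          : the same thin space
$f{\left(x\right)}$ % ord ord            : the braces wrap the inner atom
\stoptyping

The last line is the classic manual workaround, but it only makes sense when one
knows in advance that the fences will not grow, and that is exactly what an
automated flow does not know.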

It's these fences that demand a two-pass approach because we need to know the
height and depth of the subformula. Anyway, do you see the complication? In our
inner formula the fences are not scaled, but this is not communicated back in the
sense that the inner noad can become an ordinal one, as in the simple \type {f(}
pair. The information is not only lost, it is not even considered useful and the
only way to somehow bubble it up in the processing so that it can be used in the
spacing requires an extension. And even then we have a problem: the kerning that
we see between \type {f(} is also lost. It must be noted that this kerning is
optional and triggered by setting \type {\mathitalicsmode=1}. One reason for this
is that fonts approach italic correction differently, and cheat with the
combination of natural width and italic correction.

Now, because such a workaround is definitely conflicting with the inner workings
of \TEX, our experimenting demands another variable be created: \type
{\mathdelimitersmode}. It might be a prelude to more manipulations but for now we
stick to this one case. How messy it really is can be demonstrated when we render
our example with Cambria.

\startlinecorrection
\UseMode\zerocount
\switchtobodyfont[cambria]\getbuffer[blownup]
\stoplinecorrection

If you look closely you will notice that the parentheses are moved up a bit. Also
notice the more accurate bounding boxes. Just to be sure we also show Pagella:

\startlinecorrection
\UseMode\zerocount
\switchtobodyfont[pagella]\getbuffer[blownup]
\stoplinecorrection

When we really want the unscaled variant to be somewhat compatible with the
fenced one we now need to take into account:

\startitemize[packed]
\startitem
    the optional axis|-|and|-|height|/|depth related shift of the fence (bit 1)
\stopitem
\startitem
    the optional kern between characters (bit 2)
\stopitem
\startitem
    the optional space between math objects (bit 4)
\stopitem
\stopitemize
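Because the three options are bits, a mode value is simply their sum. The lines
below merely spell out some combinations. They assume an engine that still
provides \type {\mathdelimitersmode}, which according to the comment at the top
of this file is no longer the case in the reworked engine.

\starttyping
\mathdelimitersmode=1 % bit 1        : only the fence shift
\mathdelimitersmode=2 % bit 2        : only the kern between characters
\mathdelimitersmode=4 % bit 4        : only the space between math objects
\mathdelimitersmode=6 % bits 2 and 4 : kern and space, but not the shift
\mathdelimitersmode=7 % all bits     : 1 + 2 + 4
\stoptyping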

Each option can be set (which is handy for testing) but here we will set them
all, so, when \type {\mathdelimitersmode=7}, we want Cambria to come out as
follows:

\startlinecorrection
\UseMode\plusseven
\switchtobodyfont[cambria]\getbuffer[blownup]
\stoplinecorrection

When this mode is set the following happens:

\startitemize
\startitem
    We keep track of the scaling and when we use the normal size this is
    registered in the noad (we had space in the data structure for that).
\stopitem
\startitem
    This information is picked up by the caller of the routine that does the
    subformula and stored in the (parent) inner noad (again, we had space for
    that).
\stopitem
\startitem
    Kerns between a character (ordinal) and subformula (inner) are kept,
    which can be bad for other cases but is probably less bad than the
    problem we try to solve here.
\stopitem
\startitem
    When the fences are unscaled the inner property temporarily becomes
    an ordinal one when we apply the inter|-|noad spacing.
\stopitem
\stopitemize

Hopefully this is good enough but anything more fancy would demand drastic
changes in one of the most sensitive mechanisms of \TEX. It might not always work
out right, so for now I consider it an experiment, which means that it can be
kept around, rejected or improved.

In case one wonders if such an extension is truly needed, one should also take
into account that automated typesetting (also of math) is probably one of the
areas where \TEX\ can shine for a while. And while we can deal with much by using
\LUA, this is one of the cases where the interwoven and integrated parsing,
converting and rendering of the math machinery makes it hard. It also fits into a
further opening up of the inner workings by means of modes.

\startbuffer[simple]
\dontleavehmode
\scale
    [scale=3000]
    {\ruledhbox
        {\showglyphs
         \showfontkerns
         \showfontitalics
         $f(x)$}}
\stopbuffer

\startbuffer[fenced]
\dontleavehmode
\scale
    [scale=3000]
    {\ruledhbox
        {\showglyphs
         \showfontkerns
         \showfontitalics
         $f\left(x\right)$}}
\stopbuffer

\def\TestMe#1%
  {\bTR
       \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\zerocount\getbuffer[simple] \eTD
       \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\zerocount\getbuffer[fenced] \eTD
       \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\plusseven\getbuffer[simple] \eTD
       \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\plusseven\getbuffer[fenced] \eTD
   \eTR
   \bTR
       \bTD[align=middle,nx=2] \type{\mathdelimitersmode=0} \eTD
       \bTD[align=middle,nx=2] \type{\mathdelimitersmode=7} \eTD
   \eTR
   \bTR
       \bTD[align=middle,nx=4] \switchtobodyfont[#1]\bf #1 \eTD
   \eTR}

\startbuffer
\bTABLE[frame=off]
    \TestMe{modern}
    \TestMe{cambria}
    \TestMe{pagella}
\eTABLE
\stopbuffer

Another objection to such a solution can be that we should not alter the engine
too much. However, fences already are an exception and treated specially (tests
and jumps in the program) so adding this fits reasonably well into that part of
the design.

In the following examples we demonstrate the results for Latin Modern, Cambria
and Pagella when \type {\mathdelimitersmode} is set to zero or seven. First we
show the case where \type {\mathitalicsmode} is disabled:

\startlinecorrection
    \mathitalicsmode\zerocount\getbuffer
\stoplinecorrection

When we enable \type {\mathitalicsmode} we get:

\startlinecorrection
    \mathitalicsmode\plusone  \getbuffer
\stoplinecorrection

So is this all worth the effort? I don't know, but at least I got the picture and
hopefully now you have it too. It might also lead to some more modes in future
versions of \LUATEX.

\startbuffer[simple]
\dontleavehmode
\scale
    [scale=2000]
    {\ruledhbox
        {\showglyphs
         \showfontkerns
         \showfontitalics
         $f(x)$}}
\stopbuffer

\startbuffer[fenced]
\dontleavehmode
\scale
    [scale=2000]
    {\ruledhbox
        {\showglyphs
         \showfontkerns
         \showfontitalics
         $f\left(x\right)$}}
\stopbuffer

\def\TestMe#1%
  {\bTR
       \dostepwiserecurse{0}{7}{1}{
           \bTD[align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode##1\getbuffer[simple] \eTD
        }
   \eTR
   \bTR
       \dostepwiserecurse{0}{7}{1}{
           \bTD[align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode##1\getbuffer[fenced] \eTD
        }
   \eTR
   \bTR
       \dostepwiserecurse{0}{7}{1}{
           \bTD[align=middle]
              \tttf
              \ifcase##1\relax
              \or ns       % 1
              \or    it    % 2
              \or ns it    % 3
              \or       or % 4
              \or ns    or % 5
              \or    it or % 6
              \or ns it or % 7
              \fi
           \eTD
       }
   \eTR
   \bTR
       \bTD[align=middle,nx=8] \switchtobodyfont[#1]\bf #1 \eTD
   \eTR}

\startbuffer
\bTABLE[frame=off,distance=2mm]
    \TestMe{modern}
    \TestMe{cambria}
    \TestMe{pagella}
\eTABLE
\stopbuffer

\startlinecorrection
\getbuffer
\stoplinecorrection

In \CONTEXT, a regular document can specify \type {\setupmathfences
[method=auto]}, but in \MATHML\ or \ASCIIMATH\ this feature is enabled by default
(so that we can test it).
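
A minimal sketch of such a regular document could look as follows. Only the
\type {\setupmathfences} line comes from the text above. The formula is just an
illustration and the setup presumably has no effect on an engine that lacks this
feature.

\starttyping
\setupmathfences[method=auto]

\starttext
    $f(x) + f\left(x\right)$
\stoptext
\stoptyping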

The table above summarizes all the modes (assuming italics mode is enabled).

\stopcomponent