1% language=us
2
3\usemodule[fnt-23]
4\usemodule[fnt-25]
5
6\startcomponent mk-math
7
8\environment mk-environment
9
10\chapter{Unicode math}
11
12{\em I assume that the reader is somewhat familiar with math in
13\TEX. Although in \CONTEXT\ we try to support the concepts and
14symbols used in the \TEX\ community we have our own way of
15implementing math. The fact that \CONTEXT\ is not used extensively
for conventional math journals permits us to rigorously
17re|-|implement mechanisms. Of course the user interfaces mostly
18remain the same.}
19
20\subject{introduction}
21
The \LUATEX\ project entered a new stage when, at the end of 2008 and
the beginning of 2009, math got opened up. Although \TEX\ can handle
math pretty well, we had a few wishes that we hoped to fulfill in
25the process. That \TEX's math machinery is a rather independent
26subsystem is reflected in the fact that after parsing there is an
27intermediate list of so called noads (math elements), which then
28gets converted into a node list (glyphs, kerns, penalties, glue and
29more). This conversion can be intercepted by a callback and a
30macro package can do whatever it likes with the list of noads as
31long as it returns a proper list.
32
33Of course \CONTEXT\ does support math and that is visible in its
34code base:
35
36\startitemize
37
38\item Due to the fact that we need to be able to switch to
39alternative styles the font system is quite complex and in
\CONTEXT\ \MKII\ math font definitions (and changes) account for
50\% of the time involved. In \MKIV\ we can use a more efficient
42model.
43
44\item Because some usage of \CONTEXT\ demands the mix of several
45completely different encoded math fonts there is a dedicated math
46encoding subsystem in \MKII. In \MKIV\ we will use \UNICODE\
47exclusively.
48
49\item Some constructs (and symbols) are implemented in a way that
50we find suboptimal. In the perspective of \UNICODE\ in \MKIV\ we
51aim at all symbols being real characters. This is possible because
52all important constructs (like roots, accents and delimiters) are
53supported by the engine.
54
55\item In order to fit vertical spacing around math (think for
56instance of typesetting on a grid) in \MKII\ we have ended up with
57rather messy and suboptimal code. \footnote {This is because
58spacing before and after formulas has to cooperate with spacing of
59structural components that surround it.} The expectation is that
60we can improve that.
61
62\stopitemize
63
64In the following sections I will discuss a few of the
65implementation details of the font related issues in \MKIV. Of
66course a few years from now the actual solutions we implemented
67might look different but the principles remain the same. Also, as
68with other components of \LUATEX\ Taco and I worked in parallel on
69the code and its usage, which made both our tasks easier.
70
71\subject{transition}
72
73In \TEX, math typesetting uses a special concept called families.
Each math component (number, letter, symbol, etc.) is a member of a
75family. Because we have three sizes (text, script and
76scriptscript) this results in a family||size matrix of defined
77fonts. Because the number of glyphs in a font was limited to 256,
78in practice it meant that we had quite some font definitions. The
79minimum number of families was~4 (roman, italic, symbol, and
80extension) but in practice several more could be active (sans,
81bold, mono|-|spaced, more symbols, etc.) for specific alphabets or
82extra symbols (for instance \AMS\ set A and B). The total number
83of families in traditional \TEX\ is limited to 16, and one easily
84hits this maximum. In that case, some 16 times 3 fonts are defined
85for one size of which in practice only a few are really used in the
86typesetting.
87
88A potential source of confusion is bold math. Bold in math can
89either mean having some bold letters, or having the whole formula
90in bold. In practice this means that for a complete bold formula
91one has to define the whole lot using bold fonts. A complication
92is that the math symbols (etc) are kind of bound to families and
93so we end up with either redefining symbols, or reusing the
94families (which is easier and faster). In any case there is a
95performance issue involved due to the rather massive switch from
96normal to bold.
97
98In \UNICODE\ all alphabets that make sense as well as all math
99symbols are part of the definition although unfortunately some
100alphabets have their letters spread over the \UNICODE\ vector and
101not in a range (like blackboard). This forces all applications
102that want to support math to implement similar hacks to deal with
103it.
104
105In \MKIV\ we will assume that we have \UNICODE\ aware math fonts,
106like \OPENTYPE. The font that sets the standard is Microsoft
107Cambria. The upcoming (I'm writing this in January 2009) \TEX Gyre
fonts will comply with this standard but they're not there yet
109and so we have a problem. The way out is to define virtual fonts
110and now that \LUATEX\ math is extended to cover all of \UNICODE\
111as well as provides access to the (intermediate) math lists this
112has become feasible. This also permits us to test \LUATEX\
113with both Cambria and Latin Modern Virtual Math.
114
115The advantage is that we can stick to just one family for all
116shapes which simplifies the underlying \TEX\ code enormously.
First of all we need to define far fewer fonts (which is partially
118compensated by loading them as part of the virtual font) and all
119math aspects can now be dealt with using the character data
120tables.
121
122One tricky aspect of the new approach is that the Latin Modern
123fonts have design sizes, so we have to define several virtual
124fonts. On the other hand, fonts like Cambria have alternative
script and scriptscript shapes which are controlled by the \type
126{ssty} feature, a gsub alternate that provides some alternative
127sizes for a couple of hundred characters that matter.
128
129\starttabulate[|l|l|l|]
130\NC text         \NC \type {lmmi12 at 12pt} \NC \type {cambria at 12pt with ssty=no} \NC \NR
131\NC script       \NC \type {lmmi8  at  8pt} \NC \type {cambria at  8pt with ssty=1}  \NC \NR
132\NC scriptscript \NC \type {lmmi6  at  6pt} \NC \type {cambria at  6pt with ssty=2}  \NC \NR
133\stoptabulate
134
So Cambria does not so much have design sizes as shapes optimized
136relative to the text variant: in the following example we see text
137in red, script in green and scriptscript in blue.
138
139\startbuffer
140\definefontfeature[math][analyze=false,script=math,language=dflt]
141
142\definefontfeature[text]        [math][ssty=no]
143\definefontfeature[script]      [math][ssty=1]
144\definefontfeature[scriptscript][math][ssty=2]
145\stopbuffer
146
147\typebuffer \getbuffer
148
149Let us first look at Cambria:
150
151\startbuffer
152\startoverlay
153    {\definedfont[name:cambriamath*scriptscript at 150pt]\mkblue  X}
154    {\definedfont[name:cambriamath*script       at 150pt]\mkgreen X}
155    {\definedfont[name:cambriamath*text         at 150pt]\mkred   X}
156\stopoverlay
157\stopbuffer
158
159\typebuffer \startlinecorrection \getbuffer \stoplinecorrection
160
161When we compare them scaled down as happens in real script and
162scriptscript we get:
163
164\startbuffer
165\startoverlay
166    {\definedfont[name:cambriamath*scriptscript at 120pt]\mkblue  X}
167    {\definedfont[name:cambriamath*script       at  80pt]\mkgreen X}
168    {\definedfont[name:cambriamath*text         at  60pt]\mkred   X}
169\stopoverlay
170\stopbuffer
171
172\typebuffer \startlinecorrection \getbuffer \stoplinecorrection
173
174Next we see (scaled) Latin Modern:
175
176\startbuffer
177\startoverlay
178    {\definedfont[LMRoman8-Regular  at 150pt]\mkblue  X}
179    {\definedfont[LMRoman10-Regular at 150pt]\mkgreen X}
180    {\definedfont[LMRoman12-Regular at 150pt]\mkred   X}
181\stopoverlay
182\stopbuffer
183
184\typebuffer \startlinecorrection \getbuffer \stoplinecorrection
185
186In practice we will see:
187
188\startbuffer
189\startoverlay
190    {\definedfont[LMRoman8-Regular  at 120pt]\mkblue  X}
191    {\definedfont[LMRoman10-Regular at  80pt]\mkgreen X}
192    {\definedfont[LMRoman12-Regular at  60pt]\mkred   X}
193\stopoverlay
194\stopbuffer
195
196\typebuffer \startlinecorrection \getbuffer \stoplinecorrection
197
198Both methods probably work out well although you need to keep in
199mind that the \OPENTYPE\ \type {ssty} feature is not so much a
200design size related feature.
201
202An \OPENTYPE\ font can have a specification for the script and
203scriptscript size. By default we listen to this specification instead
204of the one imposed by the bodyfont environment. When you turn on
205tracing
206
207\starttyping
\enabletrackers[otf.math]
209\stoptyping
210
211you will get messages like:
212
213\starttyping
asked scriptscript size: 458752, used: 471859.2 (102.86 %)
asked script size: 589824, used: 574095.36 (97.33 %)
216\stoptyping
217
218The differences between the defaults and the font recommendations
219are not that large so by default we listen to the font specification.
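
The percentages in such a trace are simply the used size divided by the
asked size. The following \LUA\ sketch reproduces the numbers shown above.
It assumes a text size of 12pt and scale factors of 73\% and 60\%; these
values are inferred from the trace itself and are not taken from the actual
\MKIV\ code, and \type {compare} is a hypothetical helper.

\starttyping
-- Sketch only: the 12pt text size and the 73/60 percent factors are
-- assumptions inferred from the trace, not values read from a font.
local sp = 65536                  -- scaled points per point

local textsize = 12 * sp          -- 786432sp

local asked = {                   -- what the bodyfont environment asks for
    script       = 9 * sp,        -- 589824sp
    scriptscript = 7 * sp,        -- 458752sp
}

local percent = {                 -- what the font is assumed to recommend
    script       = 73,
    scriptscript = 60,
}

-- returns the used size (in sp) and the used/asked ratio (in percent)
local function compare(style)
    local used = textsize * percent[style] / 100
    return used, 100 * used / asked[style]
end

for _, style in ipairs { "scriptscript", "script" } do
    local used, ratio = compare(style)
    print(string.format("%s: asked %d, used %.2f (%.2f %%)",
        style, asked[style], used, ratio))
end
\stoptyping

Under these assumptions the scriptscript size comes out a bit larger than
asked (7.2pt instead of 7pt) and the script size a bit smaller (8.76pt
instead of 9pt).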
220
221\usetypescript[cambria] \start \setupbodyfont[cambria] \stop
222
223\definefontfeature[math-script]      [math-script]      [mathsize=no]
224\definefontfeature[math-scriptscript][math-scriptscript][mathsize=no]
225
226\definetypeface [cambria-ns] [rm] [serif] [cambria] [default]
227\definetypeface [cambria-ns] [tt] [mono]  [modern]  [default]
228\definetypeface [cambria-ns] [mm] [math]  [cambria] [default]
229
230\usetypescript[cambria-ns] \start \setupbodyfont[cambria-ns] \stop
231
232\startlinecorrection
233\scale
234  [width=\textwidth]
235  {\backgroundline
236     [darkgray]
237     {\startoverlay
238       {\white\switchtobodyfont   [cambria]$\sum_{i=0}^n$}
239       {\mkred\switchtobodyfont[cambria-ns]$\sum_{i=0}^n$}
240     \stopoverlay
241     \startoverlay
242       {\white\switchtobodyfont   [cambria]$\int_{i=0}^n$}
243       {\mkred\switchtobodyfont[cambria-ns]$\int_{i=0}^n$}
244     \stopoverlay
245     \startoverlay
246       {\white\switchtobodyfont   [cambria]$\log_{i=0}^n$}
247       {\mkred\switchtobodyfont[cambria-ns]$\log_{i=0}^n$}
248     \stopoverlay
249     \startoverlay
250       {\white\switchtobodyfont   [cambria]$\cos_{i=0}^n$}
251       {\mkred\switchtobodyfont[cambria-ns]$\cos_{i=0}^n$}
252     \stopoverlay
253     \startoverlay
254       {\white\switchtobodyfont   [cambria]$\prod_{i=0}^n$}
255       {\mkred\switchtobodyfont[cambria-ns]$\prod_{i=0}^n$}
256     \stopoverlay}}
257\stoplinecorrection
258
259\definefontfeature[math-script]      [math-script]      [mathsize=yes]
260\definefontfeature[math-scriptscript][math-scriptscript][mathsize=yes]
261
262In this overlay the white text is scaled according to the
263specification in the font, while the red text is scaled according
264to the bodyfont environment (12/7/5 points).
265
266\subject{going virtual}
267
268The number of math fonts (used) in the \TEX\ community is
269relatively small and of those only Latin Modern (which builds upon
270Computer Modern) has design sizes. This means that the amount of
271\UNICODE\ compliant virtual math fonts that we have to make is not
272that large. We could have used an already present virtual
273composition mechanism but instead we made a handy helper function
274that does a more efficient job. This means that a definition looks
275(a bit simplified) as follows:
276
277\starttyping
mathematics.make_font ( "lmroman10-math", {
  { name="lmroman10-regular", features="virtualmath", main=true },
  { name="lmmi10", vector="tex-mi", skewchar=0x7F },
  { name="lmsy10", vector="tex-sy", skewchar=0x30, parameters=true } ,
  { name="lmex10", vector="tex-ex", extension=true } ,
  { name="msam10", vector="tex-ma" },
  { name="msbm10", vector="tex-mb" },
  { name="lmroman10-bold", vector="tex-bf" } ,
  { name="lmmib10", vector="tex-bi", skewchar=0x7F } ,
  { name="lmsans10-regular", vector="tex-ss", optional=true },
  { name="lmmono10-regular", vector="tex-tt", optional=true },
} )
290\stoptyping
291
292For the \TEX Gyre Pagella it looks this way:
293
294\starttyping
mathematics.make_font ( "px-math", {
  { name="texgyrepagella-regular", features="virtualmath", main=true },
  { name="pxr", vector="tex-mr" } ,
  { name="pxmi", vector="tex-mi", skewchar=0x7F },
  { name="pxsy", vector="tex-sy", skewchar=0x30, parameters=true } ,
  { name="pxex", vector="tex-ex", extension=true } ,
  { name="pxsya", vector="tex-ma" },
  { name="pxsyb", vector="tex-mb" },
} )
304\stoptyping
305
306As you can see, it is possible to add alphabets, given that there is
a suitable vector that maps glyph indices onto \UNICODE\ slots. It is good
308to know that this function only defines the way such a font is
309constructed. The actual construction is delayed till the font is
310needed.
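
The idea of delayed construction can be sketched in a few lines of \LUA.
This is not the \MKIV\ implementation: \type {define}, \type {instance} and
the two tables are hypothetical names, and the actual loading and mapping of
glyphs is only hinted at in a comment.

\starttyping
-- Sketch only: 'recipes', 'built', 'define' and 'instance' are
-- hypothetical names, not part of ConTeXt.
local recipes, built = { }, { }

local function define(name, parts)
    recipes[name] = parts         -- cheap: we only remember the recipe
end

local function instance(name)
    local font = built[name]
    if not font then
        local parts = assert(recipes[name], "unknown virtual font")
        font = { name = name, fonts = { } }
        for i = 1, #parts do
            -- a real builder would load each font here and map its
            -- glyphs onto unicode slots using the given vector
            font.fonts[i] = parts[i].name
        end
        built[name] = font        -- construct once, reuse afterwards
    end
    return font
end

define("px-math", {
    { name = "texgyrepagella-regular", main = true },
    { name = "pxmi", vector = "tex-mi" },
})
\stoptyping

Nothing is constructed by the call to \type {define}; only the first call
to \type {instance("px-math")} does the work, and later calls return the
same table.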
311
312Such a virtual font is used in typescripts (the building blocks of
313typeface definitions in \CONTEXT) as follows:
314
315\starttyping
\starttypescript [math] [palatino] [name]
  \definefontsynonym [MathRoman] [pxmath@px-math]
  \loadmapfile[original-youngryu-px.map]
\stoptypescript
320\stoptyping
321
322If you're familiar with the way fonts are defined in \CONTEXT, you will
323notice that we no longer need to define MathItalic, MathSymbol and
324additional symbol fonts. Of course users don't have to deal with
325these issues themselves. The \type {@} triggers the virtual
326font builder.
327
You can imagine that in \MKII\ switching to another font style or size
involves initializing (or at least checking) some 30 to 40
font definitions when it comes to math (the number of used
families times 3, the number of math sizes). And even if we take
332into account that fonts are loaded only once, this checking and
333enabling takes time. Keep in mind that in \CONTEXT\ we can have
334several math font sets active in one document which comes at a
335price.
336
337In \MKIV\ we use one family (at three sizes). Of course we need to
338load the font (and more than one in the case of virtual variants)
339but when switching bodyfont sizes we only need to enable one
340(already defined) math font. And that really saves time. This is
one of the areas where we gain back time that we lose elsewhere
342by extending core functionality using \LUA\ (like \OPENTYPE\
343support).
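
The bookkeeping behind these numbers is simple arithmetic. In the following
\LUA\ sketch the family counts 10 and 13 are only illustrative values that
match the \quote {30 to 40} mentioned before; they are not measured.

\starttyping
-- Sketch only: the family counts passed in are illustrative.
local sizes = 3 -- text, script and scriptscript

local function definitions(families)
    return families * sizes
end

print(definitions(16))                  -- traditional maximum: 48
print(definitions(10), definitions(13)) -- MkII in practice: 30 and 39
print(definitions(1))                   -- one family in MkIV: 3
\stoptyping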
344
345\subject{dimensions}
346
347By setting font related dimensions you can control the way \TEX\
348positions math elements relative to each other. Math fonts have a
349few more dimensions than regular text fonts. But \OPENTYPE\ math
fonts like Cambria have quite a few more. There is a nice booklet
351published by Microsoft, \quote {Mathematical Typesetting}, where
352dealing with math is discussed in the perspective of their word
353processor and \TEX. In the booklet some of the parameters are
354discussed and since many of them are rather special it makes no
355sense (yet) to elaborate on them here. \footnote {Googling on
356\quote {Ulrich Vieth}, \quote {TeX} and \quote {conferences} might
357give you some hits on articles on these matters.} Figuring out
358their meaning was quite a challenge.
359
360I am the first to admit that the current code in \MKIV\ that deals
361with math parameters is somewhat messy. There are several reasons
362for this:
363
364\startitemize[packed]
\item We can pass parameters as a \type {MathConstants} table in the
366      \TFM\ table that we pass to the core engine.
367\item We can use some named parameters, like \type {x_height} and
368      pass those in the \type {parameters} table.
369\item We can use the traditional font dimension numbers in the
370      \type {parameters} table, but since they overlap for symbol and
      extensible fonts, that is asking for trouble.
372\stopitemize
373
374Because in \MKIV\ we create virtual fonts at run|-|time and use just
375one family, we fill the \type {MathConstants} table for
376traditional fonts as well. Future versions may use the upcoming
377mechanisms of font parameter sets at the macro level. These can be
378defined for each of the sizes (display, text, script and
379scriptscript, and the last three in cramped form as well) but
380since a font only carries one set, we currently use a compromise.
381
382\subject{tracing}
383
384One of the nice aspects of the opened up math machinery is that it
385permits us to get a more detailed look at what happens. It also
386fits nicely in the way we always want to visualize things in
387\CONTEXT\ using color, although most users are probably unaware of
388many such features because they don't need them as I do.
389
390\startbuffer
391\enabletrackers[math.analyzing]
392\ruledhbox{$a = \sqrt{b^2 + \sin{c} - {1 \over \gamma}}$}
393\disabletrackers[math.analyzing]
394\stopbuffer
395
396\typebuffer \startbaselinecorrection \getbuffer \stopbaselinecorrection
397
398This tracker option colors characters depending on their nature and the
399fact that they are remapped. The tracker also was handy during development
400of \LUATEX\ especially for checking if attributes migrated right in
401constructed symbols.
402
403For over a year I had been using a partial \UNICODE\ math
404implementation in some projects but for serious math the vectors
405needed to be completed. In order to help the \quote {math
406department} of the \CONTEXT\ development team (Aditya Mahajan,
407Mojca Miklavec, Taco Hoekwater and myself) we have some extra
408tracing options, like
409
410\startbuffer
411\showmathfontcharacters[list=0x0007B]
412\stopbuffer
413
414\typebuffer
415
416\start \blank \getbuffer \blank \stop
417
418The simple variant with no arguments would have extended this
419document with many pages of such descriptions.
420
421Another handy command (defined in module \type{fnt-25}) is the following:
422
423\starttyping
\ShowCompleteFont{name:cambria}{9pt}{1}
\ShowCompleteFont{dummy@lmroman10-math}{10pt}{1}
426\stoptyping
427
For Cambria, for instance, this will generate between 50 and 100
429pages of character tables.
430
431\startbuffer[mathtest]
432$abc \bf abc \bi abc$
433$\mathscript abcdefghijklmnopqrstuvwxyz %
434  1234567890 ABCDEFGHIJKLMNOPQRSTUVWXYZ$
435$\mathfraktur abcdefghijklmnopqrstuvwxyz %
436  1234567890 ABCDEFGHIJKLMNOPQRSTUVWXYZ$
437$\mathblackboard abcdefghijklmnopqrstuvwxyz %
438  1234567890 ABCDEFGHIJKLMNOPQRSTUVWXYZ$
439$\mathscript abc IRZ \mathfraktur abc IRZ %
440  \mathblackboard abc IRZ \ss abc IRZ 123$
441\stopbuffer
442
443If you look at the following samples you can imagine how coloring
the characters and replacements helped in figuring out the alphabets.
445We use the following input (stored in a buffer):
446
447\typebuffer [mathtest]
448
449For testing Cambria we say:
450
451\starttyping
\usetypescript[cambria]
\switchtobodyfont[cambria,11pt]
\enabletrackers[math.analyzing]
\getbuffer[mathtest] % the input shown before
\disabletrackers[math.analyzing]
457\stoptyping
458
459And we get:
460
461\usetypescript[cambria] % global
462
463\startlines
\switchtobodyfont[cambria,11pt]
465\enabletrackers[math.analyzing]
466\getbuffer[mathtest] % the input shown before
467\disabletrackers[math.analyzing]
468\stoplines
469
470For the virtualized Latin Modern we say:
471
472\starttyping
\usetypescript[modern]
\switchtobodyfont[modern,11pt]
\enabletrackers[math.analyzing]
\getbuffer[mathtest] % the input shown before
\disabletrackers[math.analyzing]
478\stoptyping
479
480This gives:
481
482\usetypescript[modern] % global
483
484\startlines
485\switchtobodyfont[modern,11pt]
486\enabletrackers[math.analyzing]
487\getbuffer[mathtest]
488\disabletrackers[math.analyzing]
489\stoplines
490
491These two samples demonstrate that Cambria has a rather complete
492repertoire of shapes which is no surprise because it is a recent
493font that also serves as a showcase for \UNICODE\ and \OPENTYPE\
494driven math.
495
Commands like \type {\mathscript} set an attribute. When we post|-|process
497the noad list and encounter this attribute, we remap the characters to
498the desired variant. Of course this happens selectively. So, a capital~A
499(\type {0x0041}) becomes a capital script~A (\type {0x1D49C}). Of course
500this solution is rather \CONTEXT\ specific and there are other ways to
501achieve the same goal (like using more families and switching family).
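
A minimal \LUA\ sketch of such a selective remapping for the capital script
alphabet is given below. The helper \type {remap_script} is hypothetical
and not the \CONTEXT\ code; the code points are those of \UNICODE. It also
illustrates the earlier remark about alphabets not being in one range: some
script capitals were already present in the Letterlike Symbols block, so the
regular range has holes.

\starttyping
-- Sketch only: 'remap_script' is a hypothetical helper.
local base = 0x1D49C - 0x41 -- offset from A to MATHEMATICAL SCRIPT CAPITAL A

-- script capitals that live in the Letterlike Symbols block
local exceptions = {
    [0x42] = 0x212C, -- B
    [0x45] = 0x2130, -- E
    [0x46] = 0x2131, -- F
    [0x48] = 0x210B, -- H
    [0x49] = 0x2110, -- I
    [0x4C] = 0x2112, -- L
    [0x4D] = 0x2133, -- M
    [0x52] = 0x211B, -- R
}

local function remap_script(unicode)
    if unicode >= 0x41 and unicode <= 0x5A then
        return exceptions[unicode] or (unicode + base)
    end
    return unicode -- anything else is left untouched in this sketch
end

print(string.format("0x%05X", remap_script(0x41))) -- 0x1D49C
print(string.format("0x%05X", remap_script(0x52))) -- 0x0211B
\stoptyping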
502
503\subject{special cases}
504
505Because we now are operating in the \UNICODE\ domain, we run into
506problems if we keep defining some of the math symbols in the
507traditional \TEX\ way. Even with the \AMS\ fonts available we
508still end up with some characters that are represented by
509combining others. Take for instance $\neq$ which is composed of
510two characters. Because in \MKIV\ we want to have all
511characters in their pure form we use a virtual replacement for
512them. In \MKIV\ speak it looks like this:
513
514\starttyping
local function negate(main,unicode,basecode)
    local characters = main.characters
    local basechar = characters[basecode]
    local ht, wd = basechar.height, basechar.width
    -- the new character inherits the metrics of the base character
    characters[unicode] = {
        width    = wd,
        height   = ht,
        depth    = basechar.depth,
        italic   = basechar.italic,
        kerns    = basechar.kerns,
        commands = {
            { "slot", 1, basecode }, -- typeset the base character
            { "push" },              -- remember the current position
            { "down",    ht/5},      -- shift down ...
            { "right", - wd/2},      -- ... and back to the left
            { "slot", 1, 0x2215 },   -- overlay the slash
            { "pop" },               -- return to the remembered position
        }
    }
end
535\stoptyping
536
537In case you're curious, there are indeed kerns, in this case the
538kerns with the Greek Delta.
539
540Another thing we need to handle is positioning of accents on top
541of slanted (italic) shapes. For this \TEX\ uses a special
542character in its fonts (set with \type{\skewchar}). Any character
543can have in its kerning table a kern towards this special
544character. From this kern we
545can calculate the \type {top_accent} variable that we can pass for
546each character. This variable lives at the same level as \type
547{width}, \type {height}, \type {depth} and \type {italic} and is
548calculated as: $w/2 + k$, so it defines the horizontal anchor. A
549nice side effect is that (in the \CONTEXT\ font management
550subsystem) this saves us passing information associated with
551specific fonts such as the skew character.
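
As a small worked sketch of this formula in \LUA: the table layout and the
numbers below are made up for illustration (think of scaled points), and
\type {top_accent} is a hypothetical helper, not the code used in \MKIV.

\starttyping
-- Sketch only: a made up character table with a kern towards the
-- skew character of the font.
local skewchar = 0x7F

local characters = {
    [0x66] = { width = 320000, kerns = { [skewchar] = 54000 } },
    [0x6F] = { width = 300000 }, -- no kern towards the skew character
}

-- horizontal anchor for an accent: w/2 + k
local function top_accent(char)
    local kerns = char.kerns
    local k = kerns and kerns[skewchar] or 0
    return char.width / 2 + k
end

print(top_accent(characters[0x66])) -- 160000 + 54000
print(top_accent(characters[0x6F])) -- 150000, plain centering
\stoptyping

Without a kern the accent is simply centered; the kern shifts the anchor to
the right for slanted shapes.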
552
553A couple of concepts are unique to \TEX, like having \type {\hat}
554and \type {\widehat} where the wide one has sizes. In \OPENTYPE\ and
555\UNICODE\ we don't have this distinction so we need special
556trickery to simulate this. We do so by adding extra code points in
a private \UNICODE\ space which in turn results in them being
558defined automatically and the relevant first size variant being
559used for \type {\hat}. For some users this might still be too wide
560but at least it's better than a wrongly positioned \ASCII\ variant.
561In the future we might use this private space for similar cases.
562
563Arrows, horizontal extenders and radicals also fall in the
564category \quote {troublesome} if only because they use special
565dimensions to get the desired effect. Fortunately \OPENTYPE\ math
566is modeled after \TEX, so in \LUATEX\ we introduce a couple
567of new constructs to deal with this. One such simplification at
568the macro level is in the definition of \type {\root}. Here we use
569the new \type {\Uroot} primitive. The placement related parameters
570are those used by traditional \TEX, but when they are available the
571\OPENTYPE\ parameters are applied. The simplified
572plain definitions are now:
573
574\starttyping
\def\rootradical{\Uroot 0 "221A }

\def\root#1\of{\rootradical{#1}}

\def\sqrt{\rootradical{}}
580\stoptyping
581
582The successive sizes of the root will be taken from the font in the
same way as traditional \TEX\ does it. In that sense \LUATEX\ is not
584doing anything differently, it only has more parameters to control
585the process. The definition of \type {\sqrt} in \CONTEXT\ permits
586an optional first argument that sets the degree.
587
588\startbuffer
589\showmathfontcharacters[list=0x221A]
590\stopbuffer
591
592\start \blank \getbuffer \blank \stop
593
594Note that we've collected all characters in family~0 (simply
595because that is what \TEX\ defaults characters to) and that we use
596the formal \UNICODE\ slots. When we use the Latin Modern fonts we
597just remap traditional slots to the right ones.
598
599Another neat trick is used when users choose among the bigger variants
600of some characters. The traditional approach is to create a box of a
601certain size and create a fake delimited variant which is then used.
602
603\starttyping
\definemathcommand [big]  {\choosemathbig\plusone  }
\definemathcommand [Big]  {\choosemathbig\plustwo  }
\definemathcommand [bigg] {\choosemathbig\plusthree}
\definemathcommand [Bigg] {\choosemathbig\plusfour }
608\stoptyping
609
610Of course this can become a primitive operation and we might decide
611to add such a primitive later on so we won't bother you with more
612details.
613
Attributes are also used to make life easier for authors who have
615to enter lots of pairs. Compare:
616
617\startbuffer
618\setupmathematics[autopunctuation=no]
619
620$ (a,b) = (1.20,3.40) $
621\stopbuffer
622
623\typebuffer \begingroup \getbuffer \endgroup
624
625with:
626
627\startbuffer
628\setupmathematics[autopunctuation=yes]
629
630$ (a,b) = (1.20,3.40) $
631\stopbuffer
632
633\typebuffer \begingroup \getbuffer \endgroup
634
635So we don't need to use this any more:
636
637\starttyping
$ (a{,}b) = (1{.}20{,}3{.}40) $
639\stoptyping
640
641Features like this are implemented on top of an experimental math
642manipulation framework that is part of \MKIV. When the math
643font system is stable we will rework the rest of math support
644and implement additional manipulating frameworks.
645
646\subject{control}
647
648As with all other character related issues, in \MKIV\ everything
649is driven by a character table (consider it a database).
650Quite some effort went into getting that one right and although by
651now math is represented well, more data will be added in due time.
652
653In \MKIV\ we no longer have huge lists of \TEX\ definitions for
654math related symbols. Everything is initialized using the mentioned
655table: normal symbols, delimiters, radicals, whether or not with name.
656Take for instance the square root:
657
658\start \blank \showmathfontcharacters[list=0x221A] \blank \stop
659
660
661Its entry is:
662
663\starttyping
[0x221A] = {
    adobename = "radical",
    category = "sm",
    cjkwd = "a",
    description = "SQUARE ROOT",
    direction = "on",
    linebreak = "ai",
    mathclass = "radical",
    mathname = "surd",
    unicodeslot = 0x221A,
}
675\stoptyping
676
677The fraction symbol also comes in sizes. This symbol is not to be
confused with the negation symbol \type {0x2215} (which in \TEX\ is
known as \type {\not}).
680
681\start \blank \showmathfontcharacters[list=0x2044] \blank \stop
682
683\starttyping
[0x2044] = {
    adobename = "fraction",
    category = "sm",
    contextname = "textfraction",
    description = "FRACTION SLASH",
    direction = "cs",
    linebreak = "is",
    mathspec = {
        { class = "binary", name = "slash" },
        { class = "close", name = "solidus" },
    },
    unicodeslot = 0x2044,
}
697\stoptyping
698
699However, since most users don't have this symbol visualized in
700their word processor, they expect the same behaviour from the
701regular slash. This is why we find a reference to the real symbol
702in its definition.
703
704\start \blank \showmathfontcharacters[list=0x002F] \blank \stop
705
706The definition is:
707
708\starttyping
[0x002F] = {
    adobename = "slash",
    category = "po",
    cjkwd = "na",
    contextname = "textslash",
    description = "SOLIDUS",
    direction = "cs",
    linebreak = "sy",
    mathsymbol = 0x2044,
    unicodeslot = 0x002F,
}
720\stoptyping
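
How such a reference might be followed can be sketched in \LUA\ with a two
entry excerpt of the table. The lookup helper \type {mathentry} is
hypothetical and only illustrates the idea; it is not the function that
\MKIV\ uses.

\starttyping
-- Sketch only: an excerpt of the two entries shown above and a
-- hypothetical lookup helper.
local characters = {
    [0x002F] = { description = "SOLIDUS", mathsymbol = 0x2044 },
    [0x2044] = {
        description = "FRACTION SLASH",
        mathspec = {
            { class = "binary", name = "slash" },
            { class = "close",  name = "solidus" },
        },
    },
}

-- returns the entry that carries the math properties of a character
local function mathentry(unicode)
    local entry  = characters[unicode]
    local symbol = entry and entry.mathsymbol
    return symbol and characters[symbol] or entry
end

print(mathentry(0x002F).description) -- FRACTION SLASH
\stoptyping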
721
722One problem left is that currently we have only one class per
723character (apart from the delimiter and radical usage which have
724their own definitions). Future releases of \CONTEXT\ will provide
725support for math dictionaries (as in \OPENMATH\ and \MATHML~3). At
726that point we will also have a \type {mathdict} entry.
727
728There is another issue with character mappings, one that will
729seldom reveal itself to the user, but might confuse macro writers
730when they see an error message.
731
732In traditional \TEX, and therefore also in the Latin Modern fonts,
733a chain from small to large character goes in two steps: the
734normal size is taken from one family and the larger variants from
735another. The larger variant then has a pointer to an even larger
736one and so on, until there is no larger variant or an extensible
737recipe is found. The default family is number~0. It is for this
738reason that some of the definition primitives expect a small and
739large family part.
740
741However, in order to support \OPENTYPE\ in \LUATEX\ the
742alternative method no longer assumes this split. After all, we no
743longer have a situation where the 256 limit forces us to take the
744smaller variant from one font and the larger sequence from another
745(so we need two family||slot pairs where each family eventually
746resolves to a font).
747
748It is for that reason that the new \type {\U...} primitives expect
749only one family specification: the small symbol, which then has a
750pointer to a larger variant when applicable. However deep down in
751the engine, there is still support for the multiple family
752solution (after all, we don't want to drop compatibility). As a
753result, in error messages you can still find references
754(defaulting to~0) to large specifications, even if you don't use
755them. In that case you can simply ignore the large symbol (0,0),
756since it is not used when the small symbol provides a link.
757
758\subject{extensibles}
759
760In \TEX\ fences can be told to become larger automatically. In
761traditional \TEX\ a character can have a linked list of next
762larger shapes ending in a description of how to compose even
763larger variants.
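
Walking such a list can be sketched as follows in \LUA. The fields \type
{next} and \type {extensible}, the slot numbers and the heights are all
made up for this illustration, and \type {choose} is a hypothetical helper
that only mimics the principle, not the engine's actual algorithm.

\starttyping
-- Sketch only: a made up chain of three shapes, the last one carrying
-- a recipe for composing arbitrary sizes.
local characters = {
    [1] = { height = 10, next = 2 },
    [2] = { height = 15, next = 3 },
    [3] = { height = 20, extensible = true },
}

local function choose(slot, wanted)
    local char = characters[slot]
    while char do
        if char.height >= wanted then
            return slot, "variant"     -- a ready made shape fits
        elseif char.extensible then
            return slot, "extensible"  -- compose a larger one
        elseif not char.next then
            return slot, "largest"     -- nothing bigger available
        end
        slot = char.next
        char = characters[slot]
    end
end

print(choose(1, 12)) -- 2  variant
print(choose(1, 50)) -- 3  extensible
\stoptyping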
764
765A parenthesis in Cambria has the following list:
766
767\start
768    \switchtobodyfont[cambria,10pt]
769    \showmathfontcharacters[list=0x00028]
770\stop
771
772In Latin Modern we have:
773
774\start
775    \switchtobodyfont[modern,10pt]
776    \showmathfontcharacters[list=0x00028]
777\stop
778
779Of course \LUATEX\ is downward compatible with respect to this
780feature, but the internal representation is now closer to what
781\OPENTYPE\ math provides (which is not that far from how \TEX\
782works simply because it's inspired by \TEX). Because Cambria has
783different parameters we get slightly different results. In the
784following list of pairs, you see Cambria on the left and Latin
785Modern on the right.
786Both start with stepwise larger shapes, followed by a more gradual
787growth. The thresholds for a next step are driven by parameters
788set in the \OPENTYPE\ font or by \TEX's default.
789
790\start
791\lineskip1ex
792\dostepwiserecurse{5}{140}{5} {
793    \dontleavehmode \ruledhbox \bgroup
794        \setbox0=\vbox{\vss\hbox{\switchtobodyfont[cambria,10pt]$\left\{ \vcenter{\hbox{\darkgray\vrule height \recurselevel pt width 5pt}} \right\}$}\vss}%
795        \setbox2=\vbox{\vss\hbox{\switchtobodyfont[modern, 10pt]$\left\{ \vcenter{\hbox{\darkgray\vrule height \recurselevel pt width 5pt}} \right\}$}\vss}%
796        \ifdim\ht0>\ht2
797            \setbox2\vbox to \htdp0{\vss\box2\vss}%
798        \else
799            \setbox0\vbox to \htdp2{\vss\box0\vss}%
800        \fi
801        \box0\box2
802    \egroup \quad
803}
804\par \stop
805
In traditional \TEX\ horizontal extensibles are not really present. Accents
are chosen from a linked list of variants and don't have an extensible
specification. This is because most such accents grow in two dimensions and
the only extensible||like accents are rules and braces. However, \UNICODE\
has a few more, and also for reasons of symmetry we decided to add horizontal
extensibles too. Take:
812
813\startbuffer
814$ \overbrace {a+1} \underbrace {b+2} \doublebrace {c+3} $ \par
815$ \overparent{a+1} \underparent{b+2} \doubleparent{c+3} $ \par
816\stopbuffer
817
818\typebuffer
819
820This gives:
821
822\getbuffer
823
Contrary to Cambria, Latin Modern Math, which is just like
Computer Modern Math, has no ready||made overbrace glyphs. Keep in
mind that we're dealing here with fonts that have only 256 slots
and that the traditional font mechanism has the same limitation.
For this reason, the (extensible) braces are traditionally made
from snippets, as is demonstrated below.
830
831\startbuffer
832\hbox\bgroup
833  \ruledhbox{\getglyph{lmex10}{\char"7A}}
834  \ruledhbox{\getglyph{lmex10}{\char"7B}}
835  \ruledhbox{\getglyph{lmex10}{\char"7C}}
836  \ruledhbox{\getglyph{lmex10}{\char"7D}}
837  \ruledhbox{\getglyph{lmex10}{\char"7A\char"7D\char"7C\char"7B}}
838  \ruledhbox{\getglyph{name:cambriamath}{\char"23DE}}
839  \ruledhbox{\getglyph{lmex10}{\char"7C\char"7B\char"7A\char"7D}}
840  \ruledhbox{\getglyph{name:cambriamath}{\char"23DF}}
841\egroup
842\stopbuffer
843
844\typebuffer
845
846This gives:
847
848\startlinecorrection
849\getbuffer
850\stoplinecorrection
851
The four snippets have the height and depth of the rule that will
connect them. Since we want a single interface for all fonts we
will no longer use macro based solutions. First of all, fonts like
Cambria don't have the snippets, and using active character
trickery (so that we can adapt the meaning to the font) is not an
attractive option either. This leaves virtual glyphs.
858
859It took us a bit of experimenting to get the right virtual definition because
860it is a multi||step process:
861
862\startitemize[packed]
863\item The right \UNICODE\ character (\type {0x23DE}) points to a character that has
864      no glyph itself but only horizontal extensibles.
865\item The snippets that make up the extensible don't have the right dimensions
866      (as they define the size of the connecting rule), so we need to make them
867      virtual themselves and give them a size that matches \LUATEX's expectations.
868\item Each virtual snippet contains a reference to the physical snippet and moves
869      it up or down as well as fixes its size.
\item The second and fifth snippet are actually not real glyphs but rules. Their
      dimensions are derived from the snippets and they are shifted up or down too.
872\stopitemize
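
The items above can be sketched as a \LUA\ font table. The fields
\type {horiz_variants}, \type {extender} and \type {commands}, as
well as the virtual commands \type {down}, \type {slot} and \type
{rule}, are part of the \LUATEX\ font interface. Everything else is
an assumption made for this illustration: the private slots, the
font index \type {id} and the dimensions (in arbitrary units). It is
not the definition that \CONTEXT\ uses.

\starttyping
-- Hypothetical private slots for the virtual snippets and rule.
local first, left = 0xFE001, 0xFE002
local right, last = 0xFE003, 0xFE004
local rule        = 0xFE005

local id        = 1   -- assumed index of the font with snippets
local thickness = 0.4 -- made up dimensions
local shift     = 1.2

local characters = { }

-- first item: the unicode slot only carries a horizontal recipe,
-- in which the second and fifth part are the repeatable rules
characters[0x23DE] = {
    horiz_variants = {
        { glyph = first, extender = 0 },
        { glyph = rule,  extender = 1 },
        { glyph = left,  extender = 0 },
        { glyph = right, extender = 0 },
        { glyph = rule,  extender = 1 },
        { glyph = last,  extender = 0 },
    },
}

-- second and third item: a virtual snippet has its own size,
-- refers to a physical snippet and moves it up
characters[first] = {
    width    = 4.5,
    height   = shift + thickness,
    depth    = 0,
    commands = {
        { "down", -shift },
        { "slot", id, 0x7A },
    },
}

-- fourth item: the rule is virtual too and is moved up as well
characters[rule] = {
    width    = 1,
    height   = shift + thickness,
    depth    = 0,
    commands = {
        { "down", -shift },
        { "rule", thickness, 1 }, -- height, width
    },
}
\stoptyping

The remaining three snippets would be defined like the first one,
each with a reference to its own physical glyph.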
873
874You might wonder if this is worth the trouble. Well, it is if you take into
875account that all upcoming math fonts will be organized like Cambria.
876
877\subject{math kerning}
878
While reading Microsoft's orange booklet, it became clear that
\OPENTYPE\ provides advanced kerning possibilities and we decided
to put it on the agenda for \LUATEX.
882
883It is possible to define a ladder||like boundary for each corner
884of a character where the ladder more or less follows the shape of
885a character. In theory this means that when we attach a
886superscript to a base character we can use two such ladders to
887determine the optimal spacing between them.
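
A ladder can be thought of as a list of steps, each with a height
and a kern. The following \LUA\ sketch uses the \type {mathkern}
field and the corner names that \LUATEX\ provides in its font
tables, but the numbers and the helpers \type {kernat} and \type
{scriptkern} are made up. Adding the two kerns at one height is a
simplification and not necessarily what the engine does.

\starttyping
-- Made up ladders: the top right corner of a base character
-- and the bottom left corner of a script character.
local base = {
    mathkern = {
        top_right = {
            { height = 100, kern =  20 },
            { height = 300, kern = -10 },
            { height = 500, kern = -40 },
        },
    },
}

local script = {
    mathkern = {
        bottom_left = {
            { height = 200, kern = -15 },
        },
    },
}

-- Return the kern of the highest step that lies below the given
-- height. Below the first step the first kern is used. The steps
-- are assumed to be sorted by height.
local function kernat(character,corner,height)
    local ladder = character.mathkern and character.mathkern[corner]
    if not ladder or #ladder == 0 then
        return 0
    end
    local kern = ladder[1].kern
    for i=1,#ladder do
        local step = ladder[i]
        if height > step.height then
            kern = step.kern
        end
    end
    return kern
end

-- Combine the two ladders at the height where the corners meet.
local function scriptkern(base,script,height)
    return kernat(base,  "top_right",  height)
         + kernat(script,"bottom_left",height)
end

print(scriptkern(base,script,350))
\stoptyping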
888
889Let's have a look at a few characters, the upright~f and its
890italic cousin.
891
892\startcombination[2*1]
893  {\ShowGlyphShape{name:cambria-math}{40bp}{0x66}}    {U+00066}
  {\ShowGlyphShape{name:cambria-math}{40bp}{0x1D453}} {U+1D453}
895\stopcombination
896
The ladders on the right can be used to position a superscript or
subscript. The scripts are positioned in the normal way, but the
ladder of the base character, as well as the bounding box and/or
left ladders of the scripts, can be used to fine tune the
positioning.
901
Should we use this information? I made this visualizer for
checking the anchoring and cursive features of some Arabic fonts
and then it made sense to add some of the information related to
math as well. \footnote {Taco extended the visualizer for his
presentation at Bachotek 2009 so you might run into variants.} The
orange booklet shows quite advanced ladders, but when looking at
the 3500 shapes in Cambria, it quickly becomes clear that in
practice there is not that much detail in the specification.
Nevertheless, because without this feature the result is not
acceptable, \LUATEX\ gracefully supports it.
912
913\usetypescript[cambria-y]
914
915\startbuffer
916$V^a_a V^a V_a V^1_2 V^1 V_2 f^a f_a f^a_a$\par
917$V^f_f V^f V_f V^1_2 V^1 V_2 f^f f_f f^f_f$\par
918$T^a_a T^a T_a T^1_2 T^1 T_2 f^a f_f f^a_f$\par
919$T^f_f T^f T_f T^1_2 T^1 T_2 f^f f_a f^f_a$\par
920\stopbuffer
921
922\startlinecorrection
923\startcombination[3*1]
924    {\framed[align=normal]{\switchtobodyfont[modern]\getbuffer}}    {latin modern}
925    {\framed[align=normal]{\switchtobodyfont[cambria-y]\getbuffer}} {cambria without kerning}
926    {\framed[align=normal]{\switchtobodyfont[cambria]\getbuffer}}   {cambria with kerning}
927\stopcombination
928\stoplinecorrection
929
930% \ShowGlyphShape{name:cambria-math} {40bp}{0x1D43F}
931% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D444}
932% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D447}
933% \ShowGlyphShape{name:cambria-math}{100bp}{0x2112}
934% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D432}
935% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D43D}
936% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D44A}
937% \ShowGlyphShape{name:cambria-math}{100bp}{0x1D45D}
938
939\subject{faking glyphs}
940
A previous section already discussed virtual shapes. In the
process of replacing all shapes that are missing in Latin Modern
and are composed from snippets instead, we ran into the dots. As
they are a nice demonstration of something that, although somewhat
of a hack, survived 30 years without problems, we show the
definition used in \CONTEXT\ \MKII:
947
948% ldots = 2026
949% vdots = 22EE
950% cdots = 22EF
951% ddots = 22F1
952% udots = 22F0
953
954\startbuffer
955\def\PLAINldots{\ldotp\ldotp\ldotp}
956\def\PLAINcdots{\cdotp\cdotp\cdotp}
957
958\def\PLAINvdots
959  {\vbox{\forgetall\baselineskip.4\bodyfontsize\lineskiplimit\zeropoint\kern.6\bodyfontsize\hbox{.}\hbox{.}\hbox{.}}}
960
961\def\PLAINddots
962  {\mkern1mu%
963   \raise.7\bodyfontsize\ruledvbox{\kern.7\bodyfontsize\hbox{.}}%
964   \mkern2mu%
965   \raise.4\bodyfontsize\relax\ruledhbox{.}%
966   \mkern2mu%
967   \raise.1\bodyfontsize\ruledhbox{.}%
968   \mkern1mu}
969\stopbuffer
970
971\getbuffer \typebuffer
972
973This permitted us to say:
974
975\starttyping
976\definemathcommand [ldots] [inner]   {\PLAINldots}
977\definemathcommand [cdots] [inner]   {\PLAINcdots}
978\definemathcommand [vdots] [nothing] {\PLAINvdots}
979\definemathcommand [ddots] [inner]   {\PLAINddots}
980\stoptyping
981
982However, in \MKIV\ we use virtual shapes instead.
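
To give an idea of what such a virtual shape could look like, here
is a sketch of horizontal dots built from three periods. It is not
the definition that \CONTEXT\ uses: the font index \type {id}, the
dimensions of the period and the gap are made up. Only the virtual
commands \type {slot} and \type {right} are taken from the
\LUATEX\ font interface.

\starttyping
local id     = 1   -- assumed index of the font with the period
local gap    = 1.6 -- made up distance between the dots
local period = { width = 2.8, height = 1.1, depth = 0 } -- made up

local characters = { }

-- Each slot command typesets a period and moves right by its
-- width, so only the gaps need an explicit move.
characters[0x2026] = {
    width    = 3*period.width + 2*gap,
    height   = period.height,
    depth    = period.depth,
    commands = {
        { "slot", id, 0x2E }, { "right", gap },
        { "slot", id, 0x2E }, { "right", gap },
        { "slot", id, 0x2E },
    },
}
\stoptyping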
983
984\definemathcommand [xldots] [inner]   {\PLAINldots}
985\definemathcommand [xcdots] [inner]   {\PLAINcdots}
986\definemathcommand [xvdots] [nothing] {\PLAINvdots}
987\definemathcommand [xddots] [inner]   {\PLAINddots}
988
989The following lines show the virtual shapes in red. In each
990triplet we see the original, the virtual and the overlaid
991character.
992
993\startlinecorrection
994\switchtobodyfont[modern,17.3pt]%
995\dontleavehmode
996\ruledhbox{$\xldots$}%
997\ruledhbox{$\ldots$}%
998\ruledhbox{\startoverlay{$\xldots$}{$\red\ldots$}\stopoverlay}%
999\quad
1000\ruledhbox{$\xcdots$}%
1001\ruledhbox{$\cdots$}%
1002\ruledhbox{\startoverlay{$\xcdots$}{$\red\cdots$}\stopoverlay}%
1003\quad
1004\ruledhbox{$\xvdots$}%
1005\ruledhbox{$\vdots$}%
1006\ruledhbox{\startoverlay{$\xvdots$}{$\red\vdots$}\stopoverlay}%
1007\quad
1008\ruledhbox{$\xddots$}%
1009\ruledhbox{$\ddots$}%
1010\ruledhbox{\startoverlay{$\xddots$}{$\red\ddots$}\stopoverlay}%
1011\quad
1012\ruledhbox{$\xddots$}%
1013\ruledhbox{$\udots$}%
1014\ruledhbox{\startoverlay{$\xddots$}{$\red\udots$}\stopoverlay}%
1015\stoplinecorrection
1016
As you can see here, the virtual variants are rather close to the
originals. At 12pt there are no real differences; at other sizes
we (somehow) get slightly different results, but the difference is
hardly visible. Watch the special spacing above the shapes. It is
probably needed for getting the spacing right in matrices (where
they are used).
1023
1024\stopcomponent
1025