mk-zapfino.tex /size: 20 Kb    last modification: 2023-12-21 09:43
1% language=us
2
3\startcomponent mk-zapfino
4
5\environment mk-environment
6
7\nonknuthmode
8
9\definefontfeature
10   [SampleFont]
11   [language=dflt,
12    script=latn,
13    calt=yes,
14    clig=yes,
15    rlig=yes,
16    tlig=yes,
17    mode=node]
18
19\font\Sample=ZapfinoExtraLTPro*SampleFont at 24pt
20
21\def\SampleChar#1{\dontleavehmode\struttedbox{\Sample\fontchar{#1}}}
22\def\SampleText#1{\dontleavehmode\struttedbox{\Sample#1}}
23
24\doifmodeelse {tug} {
25
26    \title{Zapfing fonts}
27
28    \subject{by Hans Hagen \& Taco Hoekwater}
29
30    This is Chapter~XII from \notabene {\CONTEXT, from \MKII\ to \MKIV}, a document
31    that describes our explorations, experiments and decisions made while
32    we develop \LUATEX. This text has not been copy-edited.
33
34    \blank[3*big]
35
36} {
37
38    \chapter{Zapfing fonts}
39
40}
41
42\subject {remark}
43
44{\it The actual form of the tables shown here might have changed
45in the meantime. However, since this document describes the
46stepwise development of \LUATEX\ and \CONTEXT\ \MKIV\ we don't
47update the following information. The rendering might differ from
48earlier rendering simply because the code used to process this
49chapter evolves.}
50
51\subject {features}
52
53In previous chapters we've seen support for \OPENTYPE\ features creep into \LUATEX\ and
54\CONTEXT\ \MKIV. However, it may not have been clear that so far we were just feeding
55the traditional \TEX\ machinery with the right data: ligatures and kerns. Here we will
56show what so-called features can do for you. Not much \LUA\ code will be shown, if
57only because relatively complex code is needed to handle this kind of trickery with
58acceptable performance.
59
60In order to support features in their full glory more is needed than \TEX's ligature
61and kern mechanisms: we need to manipulate the node list. As a result, we now have a
62second mechanism built into \MKIV\ and users can choose which method they prefer. The
63first method, called \type {base}, is less powerful and less complete
64than the one named \type {node}. Eventually \CONTEXT\ will use the node method by
65default.
66
67There are two variants of features: substitutions and positioning. Here we
68concentrate on substitutions, of which there are several. Positioning is used, for
69instance, for the specialized kerning needed when typesetting Arabic.
70
71One character representation can be replaced by one or more fixed alternatives, or by
72alternatives chosen from a list (substitutions or alternates). Multiple characters
73can be replaced by one character (substitutions, alternates or a ligature). The
74replacements can depend on preceding and|/|or following glyphs, in which case we say that
75the replacement is driven by rules. Rules can deal with single glyphs, combinations of
76glyphs, classes (defined in the font) of glyphs and|/|or ranges of glyphs.
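
The three kinds of replacement can be pictured with a few lines of \LUA. This is
only a sketch: the table layout and the function names are made up for this
illustration and are not how \MKIV\ stores things, and the mapping from \type {h}
to \type {h.2} is an assumed example, not a rule taken from the font.

\starttyping
-- illustration only: hypothetical tables, not the MkIV data structures
local single     = { h = "h.2" }                             -- one glyph -> one fixed glyph
local alternates = { h = { "h.2", "h.3", "h.4", "h.5" } }    -- one glyph -> one out of a list
local ligatures  = { ["f f i"] = "f_f_i", ["f t"] = "f_t" }  -- several glyphs -> one glyph

local function substitute(name)
    return single[name] or name
end

local function alternate(name, n)
    local list = alternates[name]
    return (list and list[n]) or name
end

local function ligature(names)
    -- returns nil when the sequence is not a known ligature
    return ligatures[table.concat(names, " ")]
end

print(substitute("h"))             -- h.2
print(alternate("h", 3))           -- h.4
print(ligature { "f", "f", "i" })  -- f_f_i
\stoptyping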
77
78Because the available documentation of \OPENTYPE\ is rather minimalistic and because
79most fonts are relatively simple, you can imagine that figuring out how to
80implement support for fonts with advanced features is not entirely trivial
81and involves some trial and error. What also complicates things is that features can
82interfere. Yet another complicating factor is that the order in which rules are applied
83matters: an earlier rule may obscure a later one. Such fonts don't ship with manuals and
84examples of correct output are not part of the purchase.
85
86We like testing \LUATEX's \OPENTYPE\ support with Palatino Regular and Palatino Sans and
87good old \TYPEONE\ support with Optima Nova. So it makes sense to test advanced
88features with Zapfino Pro. This font has many features, which happen to have been
89implemented by Adam Twardoch, a well-known font expert who is familiar with the \TEX\
90community. We had the feeling that when \LUATEX\ can support Zapfino Pro, designed by
91Hermann Zapf and enhanced by Adam, we have reached a crucial point in the development.
92
93The first thing that you will observe when using this font is that the files are larger
94than normal, especially the cached versions in \MKIV. This made me extend some of the
95serialization code that we use for caching font data so that it could handle huge
96tables better but at the cost of some speed. Once we could handle the data conveniently
97and as a side effect look into the font data with an editor, it became clear that
98implementing the \type {calt} and \type {clig} features would take a bit
99of coding.
100
101\subject{example}
102
103Before we discuss some details, we will show two of the test texts that \CONTEXT\
104users normally use when testing layouts or new features, a quote from E.R.\ Tufte and
105one from Hermann Zapf. The \TEX\ code shows how features are set in \CONTEXT.
106
107\startbuffer
108\definefontfeature
109   [zapfino]
110   [language=nld,script=latn,mode=node,
111    calt=yes,clig=yes,liga=yes,rlig=yes,tlig=yes]
112
113\definefont
114    [Zapfino]
115    [ZapfinoExtraLTPro*zapfino at 24pt]
116    [line=40pt]
117\Zapfino
118\input tufte \par
119\stopbuffer
120
121\typebuffer  \blank[disable] \start \getbuffer \stop
122
123You don't even have to look too closely in order to notice that characters are
124represented by different glyphs, depending on the context in which they appear.
125
126\startbuffer
127\definefontsynonym
128  [Zapfino]
129  [ZapfinoExtraLTPro]
130  [features=zapfino]
131\definedfont
132  [Zapfino at 24pt]
133\setupinterlinespace
134  [line=40pt]
135\input zapf \par
136\stopbuffer
137
138\typebuffer \blank[disable] \start \getbuffer \stop
139
140\subject{obeying rules}
141
142When we were testing node based feature support, the only way to check this was to
143identify the rules that lead to certain glyphs. The more unique glyphs are good
144candidates for this. For instance:
145
146\startitemize[packed]
147\item there is a special glyph representing \SampleChar{c_slash_o}
148\item in the input stream this is the character sequence \type{c/o}
149\item so there must be a rule that tells us that this sequence becomes that ligature
150\stopitemize
151
152As said, in this case, the replacement glyph is supposed to be a ligature and indeed
153there is such a ligature: \type {c_slash_o}. Of course, this replacement will only
154take place when the sequence is surrounded by spaces.
155
156However, when testing this, we were not looking at this rule but at the (randomly
157chosen) rule that was meant to intercept the alternative \type {h.2} followed
158by \type {z.4}. Interestingly, this did resolve to a ligature, but
159the shape associated with this ligature was an~\type {h}, which is not right.
160Actually, a few more such rules turned out to be wrong. It took a bit of
161an effort to reach this conclusion because of the mentioned interferences
162of features and rules. At that time, the rule entry (in raw \LUATEX\ table
163format) looked as follows:
164
165\starttyping
166[44] = {
167    ["format"] = "coverage",
168    ["rules"] = {
169        [1] = {
170            ["coverage"] = {
171                ["ncovers"] = {
172                    [1] = "h.2",
173                    [2] = "z.4",
174                }
175            },
176            ["lookups"] = {
177                [1] = {
178                    ["lookup_tag"] = "L084",
179                    ["seq"] = 0,
180                }
181            }
182        }
183    },
184    ["script_lang_index"] = 1,
185    ["tag"] = "calt",
186    ["type"] = "chainsub"
187}
188\stoptyping
189
190Instead of reinventing the wheel, we used the \FONTFORGE\ libraries for reading the
191\OPENTYPE\ font files. Therefore the \LUATEX\ table resembles the internal \FONTFORGE\
192data structures. Currently we show the version~1 format.
193
194Here \type {ncovers} means that when the current character has shape \SampleChar
195{h.2} (\type{h.2}) and the next one is \SampleChar{z.4} (\type{z.4}) (a sequence)
196then we need to apply the lookup internally tagged \type {L084}. Such a rule
197can be more extensive, for instance instead of \type {h.2} one can have a list of
198characters, and there can be \type {bcovers} and \type {fcovers} as well, which means
199that preceding or following characters need to be taken into account.
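
A rule of this kind can be tested against a run of glyph names as sketched below.
For the sake of the example we assume that each cover is a set of glyph names,
that \type {bcovers} is stored nearest glyph first, and that the glyphs sit in a
plain \LUA\ array instead of a node list. The helpers \type {set} and \type
{matches} are hypothetical and none of this is taken from the actual \MKIV\ code.

\starttyping
-- turn a list of glyph names into a set for quick membership tests
local function set(list)
    local t = { }
    for i = 1, #list do
        t[list[i]] = true
    end
    return t
end

local rule = {
    bcovers = { },                             -- what must precede (nothing here)
    ncovers = { set { "h.2" }, set { "z.4" } }, -- the sequence itself
    fcovers = { },                             -- what must follow (nothing here)
    lookup  = "L084",
}

-- does the rule match the glyphs starting at position i?
local function matches(glyphs, i, rule)
    local n = rule.ncovers
    for k = 1, #n do
        local g = glyphs[i + k - 1]
        if not g or not n[k][g] then
            return false
        end
    end
    local b = rule.bcovers
    for k = 1, #b do
        local g = glyphs[i - k]           -- walk backwards from position i
        if not g or not b[k][g] then
            return false
        end
    end
    local f = rule.fcovers
    for k = 1, #f do
        local g = glyphs[i + #n + k - 1]  -- walk forwards past the sequence
        if not g or not f[k][g] then
            return false
        end
    end
    return true
end

local glyphs = { "a", "h.2", "z.4" }
print(matches(glyphs, 2, rule))  -- true
print(matches(glyphs, 1, rule))  -- false
\stoptyping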
200
201When this rule matches, it resolves to a specification like:
202
203\starttyping
204[6] = {
205    ["flags"] = 0,
206    ["lig"] = {
207        ["char"] = "h",
208        ["components"] = "h.2 z.4",
209    },
210    ["script_lang_index"] = 65535,
211    ["tag"] = "L084",
212    ["type"] = "ligature",
213}
214\stoptyping
215
216Here \type {tag} and \type {script_lang_index} are kind of special and
217are part of a private feature system, i.e.\ they make up the cross reference
218between rules and glyphs. Watch how the components don't match the character,
219which is even more peculiar when we realize that these are the initials of the
220author of the font. It took a couple of Skype sessions and mails before
221we came to the conclusion that this was probably a glitch in the font. So,
222what to do when a font has bugs like this? Should one disable the feature?
223That would be a pity because a font like Zapfino depends on it. On the other
224hand, given the number of rules and given the fact that there are different
225rule sets for some languages, you can imagine that making up the rules and
226checking them is not trivial.
227
228We should realize that Zapfino is an extraordinary case, because it uses
229the \OPENTYPE\ features extensively. We can also be sure that the problems will
230be fixed once they are known, if only because Adam Twardoch (who did the job)
231has exceptionally high standards, but it may take a while before the fix reaches
232the user (who then has to update his or her font). As said, it also takes some
233effort to run into the situation described here so the likelihood of running
234into this rule is small. This also brings to our attention the fact that fonts
235can now contain bugs and updating them makes sense but can break existing
236documents. Since such fonts are copyrighted and not available online, font
237vendors need to find ways to communicate these fixes to their customers.
238
239Can we add some additional checks for problems like this? For a while I
240thought that it was possible by assuming that ligatures have names like
241\type {h.2_z.4} but alas, sequences of glyphs are mapped onto ligatures
242using mappings like the following:
243
244\starttabulate[||||]
245\NC \type{three fraction four.2} \NC \type{threequarters} \NC \SampleChar{threequarters} \NC\NR
246\NC \type{three fraction four}   \NC \type{threequarters} \NC \SampleChar{threequarters} \NC\NR
247\NC \type{d r}                   \NC \type{d_r}           \NC \SampleChar{d_r}           \NC\NR
248\NC \type{e period}              \NC \type{e_period}      \NC \SampleChar{e_period}      \NC\NR
249\NC \type{f i}                   \NC \type{fi}            \NC \SampleChar{fi}            \NC\NR
250\NC \type{f l}                   \NC \type{fl}            \NC \SampleChar{fl}            \NC\NR
251\NC \type{f f i}                 \NC \type{f_f_i}         \NC \SampleChar{f_f_i}         \NC\NR
252\NC \type{f t}                   \NC \type{f_t}           \NC \SampleChar{f_t}           \NC\NR
253\stoptabulate
254
255Some ligatures have no \type {_} in their names and there are also some
256inconsistencies, compare the \type {fl} and \type {f_f_i}. Here font
257history is painfully reflected in inconsistency and no solution can be
258found here.
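
To see why a name based check does not help, consider the following sketch. It
joins the components with an underscore and compares the result with the name of
the ligature; the function \type {expected_name} is a hypothetical helper and the
mappings are a few of those listed above.

\starttyping
-- naive check: is the ligature name equal to its components joined by "_"?
local function expected_name(components)
    return (string.gsub(components, " ", "_"))
end

local mappings = {
    { "three fraction four", "threequarters" },
    { "d r",                 "d_r"           },
    { "f i",                 "fi"            },
    { "f l",                 "fl"            },
    { "f f i",               "f_f_i"         },
}

for _, m in ipairs(mappings) do
    local components, name = m[1], m[2]
    local verdict = expected_name(components) == name and "ok" or "suspect"
    print(components, name, verdict)
end
\stoptyping

Only \type {d_r} and \type {f_f_i} pass. The perfectly valid \type
{threequarters}, \type {fi} and \type {fl} are reported as suspect, so a real
mismatch would drown in false alarms.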
259
260So, in order to get rid of this problem, \MKIV\ implements a method to ignore
261certain rules but then, this only makes sense if one knows how the rules
262are tagged internally. So, in practice this is no solution. However, you can
263imagine that at some point \CONTEXT\ ships with a database of fixes that
264are applied to known fonts with certain version numbers.
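
Such a database could be as simple as the following sketch. Everything in it is
hypothetical: the layout, the function \type {ignored} and especially the version
string \type {x.y}, which is a placeholder and not a real version number of the
font. Only the lookup tag \type {L084} comes from the example above.

\starttyping
-- hypothetical database: font name -> font version -> set of lookup tags to skip
local fixes = {
    ["ZapfinoExtraLTPro"] = {
        ["x.y"] = { L084 = true },  -- "x.y" is a placeholder, not a real version
    },
}

local function ignored(fontname, version, tag)
    local versions = fixes[fontname]
    local tags     = versions and versions[version]
    return (tags and tags[tag]) == true
end

print(ignored("ZapfinoExtraLTPro", "x.y", "L084"))  -- true
print(ignored("ZapfinoExtraLTPro", "x.y", "L085"))  -- false
print(ignored("SomeOtherFont",     "x.y", "L084"))  -- false
\stoptyping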
265
266We also found out that the font table that we used was not good enough for our
267purpose because the exact order in which rules have to be applied was not
268available. Then we noticed that in the meantime \FONTFORGE\ had moved on
269to version~2 and after consulting the author we quickly came to the conclusion
270that it made sense to use the updated representation.
271
272In version~2 the snippet with the previously mentioned rule looks as follows:
273
274\starttyping
275["ks_latn_l_66_c_19"]={
276 ["format"]="coverage",
277 ["rules"]={
278  [1]={
279   ["coverage"]={
280    ["current"]={
281     [1]="h.2",
282     [2]="z.4",
283    }
284   },
285   ["lookups"]={
286    [1]={
287     ["lookup"]="ls_l_84",
288     ["seq"]=0,
289    }
290   }
291  }
292 },
293 ["type"]="chainsub",
294},
295\stoptyping
296
297The main rule table is now indexed by name which is possible because the order
298of rules is specified somewhere else. The key \type {ncovers} has been replaced
299by \type {current}. As long as \LUATEX\ is in beta stage, we have the freedom to
300change such labels as some of them are rather \FONTFORGE\ specific.
301
302This rule is mentioned in a feature specification table. Here specific features are
303associated with languages and scripts. This is just one of the entries concerning
304\type {calt}. You can imagine that it took a while to figure out how best to
305deal with this, but eventually the \MKIV\ code could do the trick. The cryptic
306names are replacements for pointers in the \FONTFORGE\ datastructure. In order to be
307able to use \FONTFORGE\ for font development and analysis, the decision was made to
308stick closely to its idiom.
309
310\starttyping
311 ["gsub"]={
312  ...
313  [67]={
314   ["features"]={
315    [1]={
316     ["scripts"]={
317      [1]={
318       ["langs"]={
319        [1]="AFK ",
320        [2]="DEU ",
321        [3]="NLD ",
322        [4]="ROM ",
323        [5]="TRK ",
324        [6]="dflt",
325       },
326       ["script"]="latn",
327      }
328     },
329     ["tag"]="calt",
330    }
331   },
332   ["name"]="ks_latn_l_66",
333   ["subtables"]={
334    [1]={
335     ["name"]="ks_latn_l_66_c_0",
336    },
337    ...
338    [20]={
339     ["name"]="ks_latn_l_66_c_19",
340    },
341    ...
342   },
343   ["type"]="gsub_context_chain",
344  },
345\stoptyping
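
Given such a table, finding the rule sets that apply to a feature, script and
language boils down to a few nested loops. The following is a minimal sketch that
uses a shortened copy of the entry shown above; the function \type
{subtables_for} is a hypothetical helper, not part of \MKIV. Note that the
language tags are padded to four characters.

\starttyping
-- shortened copy of the entry shown above
local gsub = {
    {
        features = {
            {
                scripts = {
                    { langs = { "DEU ", "NLD ", "dflt" }, script = "latn" },
                },
                tag = "calt",
            },
        },
        name      = "ks_latn_l_66",
        subtables = {
            { name = "ks_latn_l_66_c_0"  },
            { name = "ks_latn_l_66_c_19" },
        },
        type      = "gsub_context_chain",
    },
}

-- collect, in order, the subtable names that apply to a tag, script and language
local function subtables_for(gsub, tag, script, lang)
    local result = { }
    for _, lookup in ipairs(gsub) do
        local wanted = false
        for _, feature in ipairs(lookup.features or { }) do
            if feature.tag == tag then
                for _, s in ipairs(feature.scripts or { }) do
                    if s.script == script then
                        for _, l in ipairs(s.langs or { }) do
                            if l == lang then
                                wanted = true
                            end
                        end
                    end
                end
            end
        end
        if wanted then
            for _, subtable in ipairs(lookup.subtables or { }) do
                result[#result + 1] = subtable.name
            end
        end
    end
    return result
end

print(table.concat(subtables_for(gsub, "calt", "latn", "NLD "), " "))
-- ks_latn_l_66_c_0 ks_latn_l_66_c_19
\stoptyping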
346
347\subject{practice}
348
349The few snapshots of the font table probably don't make much sense if you
350haven't seen the whole table. Well, it certainly helps to see the whole picture,
351but we're talking of a 14 MB file (1.5 MB bytecode). When resolving ligatures,
352we can follow a straightforward approach:
353
354\startitemize[packed]
355\item walk over the nodelist and at each character (glyph node) call a function
356\item this function inspects the character and takes a look at the following ones
357\item when a ligature is identified, the sequence of nodes is replaced
358\stopitemize
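
The steps above can be sketched as follows. To keep the example self contained we
walk over a plain array of glyph names, not a real node list, and we assume
that the ligatures are grouped by their first component with the longest one
first. The table layout and the function \type {ligate} are hypothetical.

\starttyping
-- ligatures grouped by first component, longest sequence first
local ligatures = {
    f = {
        { components = { "f", "f", "i" }, name = "f_f_i" },
        { components = { "f", "i" },      name = "fi"    },
        { components = { "f", "l" },      name = "fl"    },
        { components = { "f", "t" },      name = "f_t"   },
    },
}

local function ligate(glyphs)
    local result, i = { }, 1
    while i <= #glyphs do
        local done = false
        -- inspect the current glyph and look at the following ones
        for _, lig in ipairs(ligatures[glyphs[i]] or { }) do
            local ok = true
            for k = 1, #lig.components do
                if glyphs[i + k - 1] ~= lig.components[k] then
                    ok = false
                    break
                end
            end
            if ok then
                -- replace the whole sequence by the ligature
                result[#result + 1] = lig.name
                i = i + #lig.components
                done = true
                break
            end
        end
        if not done then
            result[#result + 1] = glyphs[i]
            i = i + 1
        end
    end
    return result
end

print(table.concat(ligate { "o", "f", "f", "i", "c", "e" }, " "))  -- o f_f_i c e
print(table.concat(ligate { "f", "i", "f", "t" }, " "))            -- fi f_t
\stoptyping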
359
360Substitutions are not much different but there we look at just one character.
361However, contextual substitutions (and ligatures) are more complex. Here we need
362to loop over a list of rules (dependent on script and language) and this involves
363a sequence as well as preceding and following characters. When we have a hit, the
364sequence will be replaced by another one, determined by a lookup in the character
365table. Since this is a rather time consuming operation, especially because many
366surrounding characters need to be taken into account, you can imagine that we need
367a bit of trickery to get an acceptable performance. Fortunately \LUA\ is pretty fast
368when it comes down to manipulating strings and tables, so we can prepare some handy
369datastructures in advance.
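
One way to prepare such a datastructure is to index the rules by the first glyph
of their sequence, so that at each position only a few candidates have to be
tried. The sketch below is a simplification: each entry in \type {current} is a
single glyph name and not a class, preceding and following characters are left
out, and the second rule is made up for the example. The functions \type
{index_rules} and \type {first_hit} are hypothetical.

\starttyping
local rules = {
    { name = "ks_latn_l_66_c_19", current = { "h.2", "z.4" },  lookup = "ls_l_84" },
    { name = "made_up_rule",      current = { "t", "h", "e" }, lookup = "made_up_lookup" },
}

-- done once per font: group the rules by the first glyph of their sequence;
-- within a group the original order of the rules is kept
local function index_rules(rules)
    local index = { }
    for _, rule in ipairs(rules) do
        local first = rule.current[1]
        local list  = index[first]
        if not list then
            list = { }
            index[first] = list
        end
        list[#list + 1] = rule
    end
    return index
end

-- done for each glyph: only try the rules that can start at this glyph
local function first_hit(index, glyphs, i)
    for _, rule in ipairs(index[glyphs[i]] or { }) do
        local ok = true
        for k = 2, #rule.current do
            if glyphs[i + k - 1] ~= rule.current[k] then
                ok = false
                break
            end
        end
        if ok then
            return rule
        end
    end
    return nil
end

local index = index_rules(rules)
local hit   = first_hit(index, { "h.2", "z.4" }, 1)
print(hit and hit.lookup)                         -- ls_l_84
print(first_hit(index, { "h.2", "x" }, 1))        -- nil
\stoptyping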
370
371When testing the implementation of features one needs to be aware of the fact that
372some appearances are also implemented using the regular ligature mechanisms. Take the
373following definitions:
374
375\startbuffer[a]
376\definefontfeature
377   [none]
378   [language=dflt,script=latn,mode=node,liga=no]
379\definefontfeature
380   [calt]
381   [language=dflt,script=latn,mode=node,liga=no,calt=yes]
382\definefontfeature
383   [clig]
384   [language=dflt,script=latn,mode=node,liga=no,clig=yes]
385\definefontfeature
386   [dlig]
387   [language=dflt,script=latn,mode=node,liga=no,dlig=yes]
388\definefontfeature
389   [liga]
390   [language=dflt,script=latn,mode=node]
391\stopbuffer
392
393\startbuffer[b]
394\starttabulate[||||]
395\NC \type{none } \NC \definedfont[ZapfinoExtraLTPro*none at 24pt]\hbox{on the synthesis}\NC\definedfont[ZapfinoExtraLTPro*none at 24pt]\hbox{winnow the wheat}\NC \NR
396\NC \type{calt } \NC \definedfont[ZapfinoExtraLTPro*calt at 24pt]\hbox{on the synthesis}\NC\definedfont[ZapfinoExtraLTPro*calt at 24pt]\hbox{winnow the wheat}\NC \NR
397\NC \type{clig } \NC \definedfont[ZapfinoExtraLTPro*clig at 24pt]\hbox{on the synthesis}\NC\definedfont[ZapfinoExtraLTPro*clig at 24pt]\hbox{winnow the wheat}\NC \NR
398\NC \type{dlig } \NC \definedfont[ZapfinoExtraLTPro*dlig at 24pt]\hbox{on the synthesis}\NC\definedfont[ZapfinoExtraLTPro*dlig at 24pt]\hbox{winnow the wheat}\NC \NR
399\NC \type{liga } \NC \definedfont[ZapfinoExtraLTPro*liga at 24pt]\hbox{on the synthesis}\NC\definedfont[ZapfinoExtraLTPro*liga at 24pt]\hbox{winnow the wheat}\NC \NR
400\stoptabulate
401\stopbuffer
402
403\typebuffer[a]
404
405This gives:
406
407\start \getbuffer[a] \getbuffer[b] \stop
408
409Here are Adam's recommendations with regard to the \type {dlig} feature:
410\quotation {The \type{dlig} feature is supposed to be used only upon user's
411discretion, usually on single runs, words or even pairs. It makes little
412sense to enable \type {dlig} for an entire sentence or paragraph. That's
413how the \OPENTYPE\ specification envisions it.}
414
415When testing features it helps to use words that look similar so next we will
416show some examples that we used. When we look at these examples, we need to
417understand that when a specific character representation is analyzed, the
418rules can take preceding and following characters into account. The rules
419take characters as well as their shapes, or more precisely: one of their
420shapes since Zapfino has many variants, into account. Since different rules
421are used for languages (okay, this is limited to only a subset of languages
422that use the latin script) not only shapes but also the way words are
423constructed are taken into account. Designing the rules is definitely nontrivial.
424
425When testing the implementation we ran into cases where the initial \type
426{t} showed up wrong, for instance in the Dutch word \type {troef}.
427Because space can be part of the rules, we need to handle the
428cases where words end and start and boxes are then kind of special.
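
One possible way to deal with this, and we stress that it is only an assumption
for the sake of illustration and not necessarily what \MKIV\ does, is to pretend
that a box starts and ends with a space before the rules are applied. The
function \type {with_boundaries} is a hypothetical helper that works on a plain
array of glyph names.

\starttyping
-- pretend that the content of a box is surrounded by spaces
local function with_boundaries(glyphs)
    local padded = { "space" }
    for i = 1, #glyphs do
        padded[#padded + 1] = glyphs[i]
    end
    padded[#padded + 1] = "space"
    return padded
end

print(table.concat(with_boundaries { "t", "r", "o", "e", "f" }, " "))
-- space t r o e f space
\stoptyping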
429
430\definefontfeature
431   [zapfing]
432   [language=dflt,
433    script=latn,
434    calt=yes,
435    clig=yes,
436    rlig=yes,
437    tlig=yes,
438    mode=node]
439
440\font\Zapfing=ZapfinoExtraLTPro*zapfing at 24pt
441
442\startbuffer
443troef troef troef troeftroef troef  \par
444\ruledhbox{troef troef troef troeftroef troef} \par
445\ruledhbox{troef 123} \par
446\ruledhbox{troef} \ruledhbox{troef } \ruledhbox{ troef} \ruledhbox { troef } \par
447\stopbuffer
448
449\typebuffer \start \Zapfing \getbuffer \stop
450
451Unfortunately, this does not work well with punctuation, which is less
452prominent in the rules than space. In our favourite test quote of Tufte, we have
453lots of commas and there it shows up:
454
455\startbuffer
456review review review, review \par
457itemize, review \par
458itemize, review, \par
459\stopbuffer
460
461\typebuffer \start \Zapfing \getbuffer \stop
462
463Of course we can decide to extend the rule base at runtime and this may
464well happen when we experiment more with this font.
465
466The next one was one of our first test lines. Watch the initial and the
467Zapfino ligature.
468
469\startbuffer
470Welcome to Zapfino
471\stopbuffer
472
473\typebuffer \start \Zapfing \getbuffer \stop
474
475For a while there was a bug in the rule handler that resulted in the variant of
476the \type {y} that has a very large descender. Incidentally the word \type
477{synthesize} is also a good test case for the \type {the} pattern which gets
478special treatment because there is a ligature available.
479
480\startbuffer
481synopsize versus synthesize versus
482synthase versus sympathy versus synonym
483\stopbuffer
484
485\typebuffer \start \Zapfing \getbuffer \stop
486
487Here are some examples that use the \type {g}, \type {d} and \type {f} in
488several places.
489
490\startbuffer
491eggen groet ogen hagen \par
492dieren druiven onder aard  donder modder \par
493fiets effe flater triest troef \par
494\stopbuffer
495
496\typebuffer \start \Zapfing \getbuffer \stop
497
498Let's see how well Hermann has taken care of the \type {h}'s
499representations. There are quite a few variants of the lowercase one:
500
501\starttabulate
502\NC \type {h}      \NC \SampleChar{h}      \NC \NR
503\NC \type {h.2}    \NC \SampleChar{h.2}    \NC \NR
504\NC \type {h.3}    \NC \SampleChar{h.3}    \NC \NR
505\NC \type {h.4}    \NC \SampleChar{h.4}    \NC \NR
506\NC \type {h.5}    \NC \SampleChar{h.5}    \NC \NR
507\NC \type {h.init} \NC \SampleChar{h.init} \NC \NR
508\NC \type {h.sups} \NC \SampleChar{h.sups} \NC \NR
509\NC \type {h.sc}   \NC \SampleChar{h.sc}   \NC \NR
510\NC \type {orn.73} \NC \SampleChar{orn.73} \NC \NR
511\stoptabulate
512
513How about the uppercase variant, as used in his name:
514
515\startbuffer
516M Mr Mr. H He Her Herm Herma Herman Hermann Z Za Zap Zapf \par
517Mr. Hermann Zapf
518\stopbuffer
519
520\typebuffer \start \Zapfing \getbuffer \stop
521
522Of course we have to test another famous name:
523
524\startbuffer
525D Do Don Dona Donal Donald K Kn Knu Knut Knuth \par
526Don Knuth Donald Knuth Donald E. Knuth DEK \par
527Prof. Dr. Donald E. Knuth \par
528\stopbuffer
529
530\typebuffer \start \Zapfing \getbuffer \stop
531
532Unfortunately the \LUA\ and \TEX\ logos don't come out that well:
533
534\startbuffer
535L Lu Lua l lu lua t te tex TeX luatex luaTeX LuaTeX
536\stopbuffer
537
538\typebuffer \start \Zapfing \getbuffer \stop
539
540This font has quite a few ornaments and there is an \type {ornm} feature
541that can be applied. We're still not sure about its usage, but when one
542keys in text in lowercase, \type {hermann} comes out as follows:
543
544\definefontfeature
545  [gebarentaal]
546  [language=dflt,
547   script=latn,
548   mode=node,
549   ornm=yes,
550   liga=no]
551
552{\font\Sample=ZapfinoExtraLTPro*gebarentaal at 24pt \Sample hermann}
553
554As said in the beginning, dirty implementation details will be kept away from
555the reader. Also, you should not be surprised if the current code has some
556bugs or does some things wrong. Also, if spacing looks a bit weird to you,
557keep in mind that we're still in the middle of sorting things out.
558
559\start \Zapfing Taco Hoekwater \& Hans Hagen \stop
560
561\stopcomponent
562