% language=us

% still-backend.tex, last modification: 2023-12-21 09:43

\environment still-environment

\starttext

\startchapter[title=The \LUATEX\ \PDF\ backend]

\startsection[title=Introduction]

The original design of \TEX\ has a clear separation between the frontend and
backend code. In principle, shipping out a page boils down to traversing the
to|-|be|-|shipped|-|out box and translating the glyph, rule, glue, kern and list
nodes into positioning just glyphs and rules on a canvas. The \DVI\ backend is
therefore relatively simple: the \DVI\ output format only describes the pages
and leaves details such as font inclusion to the programs that convert it into
the final format.

Because we eventually want color and images as well, there is a mechanism to pass
additional information to post|-|processing programs. One can insert \type
{\special}s with directives like \type {insert image named foo.jpg}. The frontend
as well as the backend are not concerned with what goes into a special; the \DVI\
post|-|processor of course is.

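As an illustration only, the following sketch shows what such specials can look
like. The syntax inside the braces is not defined by \TEX\ but by the
post|-|processor; the lines below follow the conventions of \type {dvips}, and
the file name \type {foo.eps} is made up for this example.

\starttyping
% TeX passes the text on untouched; only the DVI driver interprets it
\special{psfile=foo.eps}          % dvips: include an EPS file
\special{color push rgb 1 0 0}    % dvips: switch to red
red text
\special{color pop}               % dvips: back to the previous color
\stoptyping
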
The \PDF\ backend, on the other hand, is more complex as it immediately produces
the final typeset result. As such, it offers possibilities to insert verbatim
code (\type {\pdfliteral}), images (\type {\pdfximage} cum suis), annotations,
destinations, threads and all kinds of objects, and to reuse typeset content
(\type {\pdfxform} cum suis); in the end, there are all kinds of \type {\pdf...}
commands. The way these were implemented in \LUATEX\ prior to 0.82 violates the
separation between frontend and backend, an inheritance from \PDFTEX. Additional
features such as protrusion and expansion add to that entanglement. However,
because \PDF\ is an evolving standard, occasionally we need to adapt the related
code. A separation of code makes sure that the frontend can become stable (and
hopefully frozen) at some point. \footnote {In practice nowadays, the backend
code changes little, because the \PDF\ produced by \LUATEX\ is rather simple and
is easily adapted to the changing standard.}

In \LUATEX\ we had already started making this separation of specialized code,
such as a cleaner implementation of font expansion, but all these \type {\pdf...}
commands were still pervasive, leading to fuzzy dependencies, checks for backend
modes, etc.\ so a logical step was to straighten all this out. That way we give
\LUATEX\ a cleaner core constructed from traditional \TEX, extended with \ETEX,
\ALEPH|/|\OMEGA, and \LUATEX\ functionality.

\stopsection

\startsection[title=Extensions]

A first step, then, was to transform generic (i.e.\ independent of the backend)
functionality which was still (sort of) bound to \ALEPH\ and \PDFTEX, into core
functionality. A second step was to reorganize the backend specific \PDF\ code,
i.e.\ move it out of the core and into the group of extension commands. This
extension group is somewhat special and originates in traditional \TEX; it is the
way to add your own functionality to \TEX, the program.

As an example for future programmers, Don Knuth added four (connected) primitives
as extensions: \type {\openout}, \type {\closeout}, \type {\write} and \type
{\special}. The \ALEPH\ and \PDFTEX\ engines, on the other hand, put some
functionality in extensions and some in the core. This arose from the fact that
dealing with variables in extensions is often inconvenient, as they are then seen
as (unexpandable) commands instead of integers, token lists, etc. That the
write|-|related commands are there is almost entirely due to their being a
demonstration of the mechanism; everything related to {\em reading} files is in
the core. There is one property that perhaps forces us to keep the writers there,
and that's the \type {\immediate} prefix. \footnote {Unfortunately we're stuck
with \type {\immediate} in the backend; a \type {deferred} keyword would have
been handier, especially since other backend|-|related commands can also be
immediate.}

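A small sketch may make the difference between immediate and deferred writers
concrete. It assumes the plain \TEX\ macro \type {\newwrite} for allocating a
stream; the stream name \type {\notes} and the file name are invented for this
example.

\starttyping
\newwrite\notes

% executed right away, while the input is being read
\immediate\openout\notes=\jobname.notes
\immediate\write\notes{written immediately}

% stored as whatsits in the current list and executed, in this order,
% only when the page that contains them is shipped out
% (\count0 is the page number in plain TeX)
\write\notes{written at shipout, page \the\count0}
\closeout\notes
\stoptyping
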
In the process of separating, we reshuffled the code base a bit; the current use
of the extensions mechanism still suits as an example and also gives us backward
compatibility. However, new backend primitives will not be added there but rather
in specific plugins (if needed at all).

\stopsection

\startsection[title=From whatsits to nodes]

The \PDF\ backend introduced two new concepts into the core: (reusable) images
and (reusable) content (wrapped in boxes). In keeping with good \TEX\ practice,
these were implemented as whatsits (a node type for extensions); but this
created, as a side effect, an anomaly in the handling of such nodes. Consider
looping over a node list where we need to check dimensions of nodes; in \LUA, we
can write something like this:

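\starttyping
-- node ids, looked up once; n is assumed to be the head of a node list
local glyph   = node.id("glyph")
local rule    = node.id("rule")
local kern    = node.id("kern")
local glue    = node.id("glue")
local whatsit = node.id("whatsit")
-- pdfxform and pdfximage stand for the old whatsit subtype numbers

while n do
    if n.id == glyph then
        -- wd ht dp
    elseif n.id == rule then
        -- wd ht dp
    elseif n.id == kern then
        -- wd
    elseif n.id == glue then
        -- size stretch shrink
    elseif n.id == whatsit then
        if n.subtype == pdfxform then
            -- wd ht dp
        elseif n.subtype == pdfximage then
            -- wd ht dp
        end
    end
    n = n.next
end
\stoptyping
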
So for each node in the list, we need to check these two whatsit subtypes. But as
these two concepts are rather generic, there is no evident need to implement them
this way. Of course the backend has to provide the inclusion and reuse, but the
frontend can be agnostic about this. That is, at the input end, in specifying
these two injects, we only have to make sure we pass the right information (so
the scanner might differentiate between backends).

Thus, in \LUATEX\ these two concepts have been promoted to core features:

\starttabulate[|l|l|]
\NC \type {\pdfxform}           \NC \type {\saveboxresource}             \NC \NR
\NC \type {\pdfximage}          \NC \type {\saveimageresource}           \NC \NR
\NC \type {\pdfrefxform}        \NC \type {\useboxresource}              \NC \NR
\NC \type {\pdfrefximage}       \NC \type {\useimageresource}            \NC \NR
\NC \type {\pdflastxform}       \NC \type {\lastsavedboxresourceindex}   \NC \NR
\NC \type {\pdflastximage}      \NC \type {\lastsavedimageresourceindex} \NC \NR
\NC \type {\pdflastximagepages} \NC \type {\lastsavedimageresourcepages} \NC \NR
\stoptabulate

The index should be considered an arbitrary number set to whatever the backend
plugin decides to use as an identifier. These are no longer whatsits, but a
special type of rule; after all, \TEX\ is only interested in dimensions. Given
this change, the previous code can be simplified to:

\starttyping
-- glyph, rule, kern and glue hold node ids, e.g. glyph = node.id("glyph");
-- n is assumed to be the head of a node list
while n do
    if n.id == glyph then
        -- wd ht dp
    elseif n.id == rule then
        -- wd ht dp (also covers box and image resources)
    elseif n.id == kern then
        -- wd
    elseif n.id == glue then
        -- size stretch shrink
    end
    n = n.next
end
\stoptyping

The only consequence for the previously existing rule type (which, in fact, is
also something that needs to be dealt with in the backend, depending on the
target format) is that a normal rule now has subtype~0 while the box resource has
subtype~1 and the image subtype~2.

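The following sketch, which assumes a recent \LUATEX\ and uses the made|-|up
macro name \type {\MyLogo}, shows how a box might be turned into a resource once
and then be used several times. As far as the frontend is concerned, each use is
just a rule with the dimensions of the original box.

\starttyping
\setbox0\hbox{Some reusable content}

% hand the box over to the backend and remember the identifier
\immediate\saveboxresource 0
\edef\MyLogo{\the\lastsavedboxresourceindex}

% each use ends up in the node list as a rule of subtype 1
\leavevmode
\useboxresource \MyLogo\relax
\useboxresource \MyLogo\relax
\stoptyping
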
If a package writer wants to retain the \PDFTEX\ names, the previous table can be
used; just prefix \type{\let}. For example, the first line would be (spaces
optional, of course):

\starttyping
\let\pdfxform\saveboxresource
\stoptyping

\stopsection

\startsection[title=Direction nodes]

A similar change has been made for ``direction'' nodes, which were also
previously whatsits. These are now normal nodes so again, instead of consulting
whatsit subtypes, we can now just check the id of a node.

It should be apparent that all of these changes from whatsits to normal nodes
already greatly simplify the code base.

\stopsection

\startsection[title=Promoted commands]

Many more commands have been promoted to the core. Here is an additional list of
original \PDFTEX\ commands and their new counterparts (this time with the \type
{\let} included):

\starttyping
\let\pdfpagewidth      \pagewidth
\let\pdfpageheight     \pageheight

\let\pdfadjustspacing  \adjustspacing
\let\pdfprotrudechars  \protrudechars
\let\pdfnoligatures    \ignoreligaturesinfont
\let\pdffontexpand     \expandglyphsinfont
\let\pdfcopyfont       \copyfont

\let\pdfnormaldeviate  \normaldeviate
\let\pdfuniformdeviate \uniformdeviate
\let\pdfsetrandomseed  \setrandomseed
\let\pdfrandomseed     \randomseed

\let\ifpdfabsnum       \ifabsnum
\let\ifpdfabsdim       \ifabsdim
\let\ifpdfprimitive    \ifprimitive

\let\pdfprimitive      \primitive

\let\pdfsavepos        \savepos
\let\pdflastxpos       \lastxpos
\let\pdflastypos       \lastypos

\let\pdftexversion     \luatexversion
\let\pdftexrevision    \luatexrevision
\let\pdftexbanner      \luatexbanner

\let\pdfoutput         \outputmode
\let\pdfdraftmode      \draftmode

\let\pdfpxdimen        \pxdimen

\let\pdfinsertht       \insertht
\stoptyping

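A macro package that has to run on several engines might want to guard such
definitions, so that an existing primitive is never overwritten. The sketch below
assumes the \ETEX\ conditional \type {\ifdefined}, which \LUATEX\ provides; which
names to guard, and the page size used, are only examples.

\starttyping
% only provide the old name when the engine does not have it already
\ifdefined\pdfpagewidth  \else \let\pdfpagewidth \pagewidth  \fi
\ifdefined\pdfpageheight \else \let\pdfpageheight\pageheight \fi

% the old names can then be used as before
\pdfpagewidth  210mm
\pdfpageheight 297mm
\stoptyping
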
\stopsection

\startsection[title=Backend commands]

There are many commands that start with \type {\pdf} and, over the history of
development of \PDFTEX\ and \LUATEX, some have been added, some have been
renamed, others removed. Instead of the many, we now have just one: \type
{\pdfextension}. A couple of usage examples:

\starttyping
\pdfextension literal {1 0 0 2 0 0 cm}
\pdfextension obj     {/foo (bar)}
\stoptyping

Here, we pass a keyword that tells the engine what to scan for and what to do
with it. A backward|-|compatible interface is easy to write. Although it
delegates a bit more management of these \type {\pdf} commands to the macro
package, the responsibility for dealing with such low|-|level, error|-|prone
calls is there anyway. The full list of \type {\pdfextension}s is given here. The
scanning after the keyword is the same as for \PDFTEX.

\starttyping
\protected\def\pdfliteral       {\pdfextension literal }
\protected\def\pdfcolorstack    {\pdfextension colorstack }
\protected\def\pdfsetmatrix     {\pdfextension setmatrix }
\protected\def\pdfsave          {\pdfextension save\relax}
\protected\def\pdfrestore       {\pdfextension restore\relax}
\protected\def\pdfobj           {\pdfextension obj }
\protected\def\pdfrefobj        {\pdfextension refobj }
\protected\def\pdfannot         {\pdfextension annot }
\protected\def\pdfstartlink     {\pdfextension startlink }
\protected\def\pdfendlink       {\pdfextension endlink\relax}
\protected\def\pdfoutline       {\pdfextension outline }
\protected\def\pdfdest          {\pdfextension dest }
\protected\def\pdfthread        {\pdfextension thread }
\protected\def\pdfstartthread   {\pdfextension startthread }
\protected\def\pdfendthread     {\pdfextension endthread\relax}
\protected\def\pdfinfo          {\pdfextension info }
\protected\def\pdfcatalog       {\pdfextension catalog }
\protected\def\pdfnames         {\pdfextension names }
\protected\def\pdfincludechars  {\pdfextension includechars }
\protected\def\pdffontattr      {\pdfextension fontattr }
\protected\def\pdfmapfile       {\pdfextension mapfile }
\protected\def\pdfmapline       {\pdfextension mapline }
\protected\def\pdftrailer       {\pdfextension trailer }
\protected\def\pdfglyphtounicode{\pdfextension glyphtounicode }
\stoptyping

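With such definitions in place, existing \PDFTEX\ idioms keep working. As a
minimal sketch, assuming the three definitions repeated below and the plain
\TEX\ macro \type {\rlap}, this rotates a word by 90 degrees; the transformation
itself is chosen arbitrarily.

\starttyping
\protected\def\pdfsave     {\pdfextension save\relax}
\protected\def\pdfrestore  {\pdfextension restore\relax}
\protected\def\pdfsetmatrix{\pdfextension setmatrix }

% \rlap has no width, so save and restore end up at the same position,
% which is what the backend expects for a matching pair
\leavevmode
\pdfsave
\pdfsetmatrix{0 1 -1 0}%
\rlap{rotated}%
\pdfrestore
\stoptyping
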
\stopsection

\startsection[title=Backend variables]

As with commands, there are many variables that can influence the \PDF\ backend.
The most important one was, of course, the one that set the output mode
(\type{\pdfoutput}). Well, that one is gone and has been replaced by \type
{\outputmode}. A value of~1 means that we produce \PDF.

One complication of variables is that (if we want to be compatible) we need to
have them as real \TEX\ registers. However, as most of them are optional, an easy
way out is simply not to define them in the engine. In order to be able to still
deal with them as registers (which is backward compatible), we define them as
follows:

\starttyping
\edef\pdfminorversion        {\pdfvariable minorversion}
\edef\pdfcompresslevel       {\pdfvariable compresslevel}
\edef\pdfobjcompresslevel    {\pdfvariable objcompresslevel}
\edef\pdfdecimaldigits       {\pdfvariable decimaldigits}

\edef\pdfhorigin             {\pdfvariable horigin}
\edef\pdfvorigin             {\pdfvariable vorigin}

\edef\pdfgamma               {\pdfvariable gamma}
\edef\pdfimageresolution     {\pdfvariable imageresolution}
\edef\pdfimageapplygamma     {\pdfvariable imageapplygamma}
\edef\pdfimagegamma          {\pdfvariable imagegamma}
\edef\pdfimagehicolor        {\pdfvariable imagehicolor}
\edef\pdfimageaddfilename    {\pdfvariable imageaddfilename}
\edef\pdfignoreunknownimages {\pdfvariable ignoreunknownimages}

\edef\pdfinclusioncopyfonts  {\pdfvariable inclusioncopyfonts}
\edef\pdfinclusionerrorlevel {\pdfvariable inclusionerrorlevel}
\edef\pdfpkmode              {\pdfvariable pkmode}
\edef\pdfpkresolution        {\pdfvariable pkresolution}
\edef\pdfgentounicode        {\pdfvariable gentounicode}

\edef\pdflinkmargin          {\pdfvariable linkmargin}
\edef\pdfdestmargin          {\pdfvariable destmargin}
\edef\pdfthreadmargin        {\pdfvariable threadmargin}
\edef\pdfformmargin          {\pdfvariable formmargin}

\edef\pdfuniqueresname       {\pdfvariable uniqueresname}
\edef\pdfpagebox             {\pdfvariable pagebox}
\edef\pdfpagesattr           {\pdfvariable pagesattr}
\edef\pdfpageattr            {\pdfvariable pageattr}
\edef\pdfpageresources       {\pdfvariable pageresources}
\edef\pdfxformattr           {\pdfvariable xformattr}
\edef\pdfxformresources      {\pdfvariable xformresources}
\stoptyping

You can set them as follows (the values shown here are the initial values):

\starttyping
\pdfcompresslevel         9
\pdfobjcompresslevel      1
\pdfdecimaldigits         3
\pdfgamma              1000
\pdfimageresolution      71
\pdfimageapplygamma       0
\pdfimagegamma         2200
\pdfimagehicolor          1
\pdfimageaddfilename      1
\pdfpkresolution         72
\pdfinclusioncopyfonts    0
\pdfinclusionerrorlevel   0
\pdfignoreunknownimages   0
\pdfreplacefont           0
\pdfgentounicode          0
\pdfpagebox               0
\pdfminorversion          4
\pdfuniqueresname         0

\pdfhorigin             1in
\pdfvorigin             1in
\pdflinkmargin          0pt
\pdfdestmargin          0pt
\pdfthreadmargin        0pt
\stoptyping

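Because each of these names is a handle on an internal register, the usual
register syntax applies. The sketch below, with values chosen only for
illustration, switches off compression, which can be handy when inspecting the
resulting \PDF\ file. It assumes that the \type {\edef}s shown earlier are in
effect.

\starttyping
\pdfcompresslevel    0         % plain assignment, as with a count register
\pdfobjcompresslevel 0
\pdfminorversion = 7           % the optional equals sign works too

% the same mechanism without a compatibility name
\pdfvariable decimaldigits 4

\message{compression level: \the\pdfcompresslevel}
\stoptyping
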
Their removal from the frontend has helped again to clean up the code and, by
making them registers, their use is still compatible. A call to \type
{\pdfvariable} defines an internal register that keeps the value (of course this
value can also be influenced by the backend itself). Although they are real
registers, they live in a protected namespace:

\startbuffer
\meaning\pdfcompresslevel
\stopbuffer

\typebuffer

which gives:

{\tt\getbuffer}

It's perhaps unfortunate that we have to remain compatible because a setter and
getter would be much nicer. I am still considering writing the extension
primitive in \LUA\ using the token scanner, but it might not be possible to
remain compatible then. This is not so much an issue for \CONTEXT, which has
always had backend drivers, but, rather, for other macro packages that have users
expecting the primitives (or counterparts) to be available.

\stopsection

\startsection[title=Backend feedback]

The backend can report on some properties that were also accessible via \type
{\pdf...} primitives. Because these are read|-|only variables, another primitive
now handles them: \type {\pdffeedback}. This primitive can be used to define
compatible alternatives:

\starttyping
\def\pdflastlink       {\numexpr\pdffeedback lastlink\relax}
\def\pdfretval         {\numexpr\pdffeedback retval\relax}
\def\pdflastobj        {\numexpr\pdffeedback lastobj\relax}
\def\pdflastannot      {\numexpr\pdffeedback lastannot\relax}
\def\pdfxformname      {\numexpr\pdffeedback xformname\relax}
\def\pdfcreationdate           {\pdffeedback creationdate}
\def\pdffontname       {\numexpr\pdffeedback fontname\relax}
\def\pdffontobjnum     {\numexpr\pdffeedback fontobjnum\relax}
\def\pdffontsize       {\dimexpr\pdffeedback fontsize\relax}
\def\pdfpageref        {\numexpr\pdffeedback pageref\relax}
\def\pdfcolorstackinit         {\pdffeedback colorstackinit}
\stoptyping

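As a hedged illustration of how the feedback primitive combines with \type
{\pdfextension}, the next sketch creates an object and remembers its number. The
object content and the macro name \type {\MyObject} are invented for this
example.

\starttyping
\immediate\pdfextension obj {<< /Type /Example /Note (just a test) >>}

% the feedback is expandable, so the number can be frozen with \edef
\edef\MyObject{\the\numexpr\pdffeedback lastobj\relax}

\message{the object got number \MyObject}

% inside other PDF code a reference would read: \MyObject\space 0 R
\stoptyping
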
The variables are internal, so they are anonymous. When we ask for the meaning of
some that were previously defined:

\starttyping
\meaning\pdfhorigin
\meaning\pdfcompresslevel
\meaning\pdfpageattr
\stoptyping

we will get, similar to the above:

\starttyping
macro:->[internal backend dimension]
macro:->[internal backend integer]
macro:->[internal backend tokenlist]
\stoptyping

\stopsection

\startsection[title=Removed primitives]

Finally, here is the list of primitives that have been removed, with no
\TEX|-|level equivalent available. Many were experimental, and they can easily
be provided to \TEX\ using \LUA.

\startcolumns[n=2]
\starttyping
\knaccode
\knbccode
\knbscode
\pdfadjustinterwordglue
\pdfappendkern
\pdfeachlinedepth
\pdfeachlineheight
\pdfelapsedtime
\pdfescapehex
\pdfescapename
\pdfescapestring
\pdffiledump
\pdffilemoddate
\pdffilesize
\pdffirstlineheight
\pdfforcepagebox
\pdfignoreddimen
\pdflastlinedepth
\pdflastmatch
\pdflastximagecolordepth
\pdfmatch
\pdfmdfivesum
\pdfmovechars
\pdfoptionalwaysusepdfpagebox
\pdfoptionpdfinclusionerrorlevel
\pdfprependkern
\pdfresettimer
\pdfshellescape
\pdfsnaprefpoint
\pdfsnapy
\pdfsnapycomp
\pdfstrcmp
\pdfunescapehex
\pdfximagebbox
\shbscode
\stbscode
\stoptyping
\stopcolumns

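As a sketch of what providing such primitives using \LUA\ can look like, here are
two simplified stand|-|ins. They assume the \type {status} and \type {md5}
libraries that come with \LUATEX; the real \PDFTEX\ primitives have more options
(\type {\pdfmdfivesum}, for instance, also accepts a \type {file} keyword), which
are not covered here.

\starttyping
% 0: disabled, 1: enabled, 2: restricted
\def\pdfshellescape{\numexpr\directlua{
    tex.write(tostring(status.shell_escape))
}\relax}

% the MD5 checksum of the (expanded) argument in uppercase hexadecimal
\def\pdfmdfivesum#1{\directlua{
    tex.write(string.upper(md5.sumhexa("\luaescapestring{#1}")))
}}
\stoptyping
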
\stopsection

\startsection[title=Conclusion]

The advantage of a clean backend separation, supported by just the three
primitives \type {\pdfextension}, \type {\pdfvariable} and \type {\pdffeedback},
as well as a collection of registers, is that we can now further clean the code
base, which remains a curious mix of combined engine code, some of it converted
to C from \PASCAL\ and some not. A clean separation also means that if
someone wants to tune the backend for a special purpose, the frontend can be left
untouched. We will get there eventually.

All the definitions shown here are available in the file \type {luatex-pdf.tex},
which is part of the \CONTEXT\ distribution.

\stopsection

\stopchapter

\stoptext