luatex-modifications.tex /size: 48 Kb    last modification: 2023-12-21 09:43
1% language=us engine=luatex runpath=texruns:manuals/luatex
2
3\environment luatex-style
4
5\startcomponent luatex-modifications
6
7\startchapter[reference=modifications,title={Modifications}]
8
9\startsection[title=The merged engines]
10
11\startsubsection[title=The need for change]
12
13\topicindex {engines}
14\topicindex {history}
15
16The first version of \LUATEX\ only had a few extra primitives and it was largely
17the same as \PDFTEX. Then we merged substantial parts of \ALEPH\ into the code
18and got more primitives. Once things became more stable the decision was made to clean
19up the rather hybrid nature of the program. This means that some primitives have
20been promoted to core primitives, often with a different name, and that others
21were removed. This made it possible to start cleaning up the code base. In \in
22{chapter} [enhancements] we discussed some new primitives; here we will cover
23most of the adapted ones.
24
25Besides the expected changes caused by new functionality, there are a number of
26not|-|so|-|expected changes. These are sometimes a side|-|effect of a new
27(conflicting) feature, or, more often than not, a change necessary to clean up
28the internal interfaces. These will also be mentioned.
29
30\stopsubsection
31
32\startsubsection[title=Changes from \TEX\ 3.1415926]
33
34\topicindex {\TEX}
35
36Of course it all starts with traditional \TEX. Even if we started with \PDFTEX,
37most still comes from the original. But we deviate a bit.
38
39\startitemize
40
41\startitem
42    The current code base is written in \CCODE, not \PASCAL. We use \CWEB\ when
43    possible. As a consequence instead of one large file plus change files, we
44    now have multiple files organized in categories like \type {tex}, \type
45    {pdf}, \type {lang}, \type {font}, \type {lua}, etc. There are some artifacts
46    of the conversion to \CCODE, but in due time we will clean up the source code
47    and make sure that the documentation is done right. Many files are in the
48    \CWEB\ format, but others, like those interfacing to \LUA, are \CCODE\ files.
49    Of course we want to stay as close as possible to the original so that the
50    documentation of the fundamentals behind \TEX\ by Don Knuth still applies.
51\stopitem
52
53\startitem
54    See \in {chapter} [languages] for many small changes related to paragraph
55    building, language handling and hyphenation. The most important change is
56    that adding a brace group in the middle of a word (like in \type {of{}fice})
57    does not prevent ligature creation.
58\stopitem
59
60\startitem
61    There is no pool file, all strings are embedded during compilation.
62\stopitem
63
64\startitem
65    The specifier \type {plus 1 fillll} does not generate an error. The extra
66    \quote{l} is simply typeset.
67\stopitem
68
69\startitem
70    The upper limit to \prm {endlinechar} and \prm {newlinechar} is 127.
71\stopitem
72
73\startitem
74    Magnification (\prm {mag}) is only supported in \DVI\ output mode. You can
75    set this parameter and it even works with \type {true} units till you switch
76    to \PDF\ output mode. When you use \PDF\ output it is best not to touch the
77    \prm {mag} variable. This fuzzy behaviour is not much different from using
78    \PDF\ backend related functionality while eventually \DVI\ output is
79    required.
80
81    After the output mode has been frozen (normally that happens when the first
82    page is shipped out) or when \PDF\ output is enabled, the \type {true}
83    specification is ignored. When you preload a plain format adapted to
84    \LUATEX\ the \prm {mag} parameter may already have been set.
85\stopitem
86
87\stopitemize
88
89\stopsubsection
90
91\startsubsection[title=Changes from \ETEX\ 2.2]
92
93\topicindex {\ETEX}
94
95Being the de facto standard extension, we of course provide the \ETEX\
96functionality, but with a few small adaptations.
97
98\startitemize
99
100\startitem
101    The \ETEX\ functionality is always present and enabled so the prepended
102    asterisk or \type {-etex} switch for \INITEX\ is not needed.
103\stopitem
104
105\startitem
106    The \TEXXET\ extension is not present, so the primitives \type
107    {\TeXXeTstate}, \type {\beginR}, \type {\beginL}, \type {\endR} and \type
108    {\endL} are missing. Instead we used the \OMEGA/\ALEPH\ approach to
109    directionality as starting point.
110\stopitem
111
112\startitem
113    Some of the tracing information that is output by \ETEX's \prm
114    {tracingassigns} and \prm {tracingrestores} is not there.
115\stopitem
116
117\startitem
118    Register management in \LUATEX\ uses the \OMEGA/\ALEPH\ model, so the maximum
119    value is 65535 and the implementation uses a flat array instead of the mixed
120    flat & sparse model from \ETEX.
121\stopitem
122
123\startitem
124    When kpathsea is used to find files, \LUATEX\ uses the \type {ofm} file
125    format to search for font metrics. In turn, this means that \LUATEX\ looks at
126    the \type {OFMFONTS} configuration variable (like \OMEGA\ and \ALEPH) instead
127    of \type {TFMFONTS} (like \TEX\ and \PDFTEX). Likewise for virtual fonts
128    (\LUATEX\ uses the variable \type {OVFFONTS} instead of \type {VFFONTS}).
129\stopitem
130
131\startitem
132    The primitives that report a stretch or shrink order report a value in a
133    convenient range of zero up to four. Because some macro packages can break on
134    that we also provide \type {\eTeXgluestretchorder} and \type
135    {\eTeXglueshrinkorder} which report values compatible with \ETEX. The (new)
136    \type {fi} value is reported as \type {-1} (so when used in an \type
137    {\ifcase} test that value makes one end up in the \type {\else}).
138\stopitem
139
140\stopitemize
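
As a minimal sketch of the last item (the register number and the glue value are
arbitrary choices for illustration), both variants can be queried for the same
glue. Only the \type {-1} for \type {fi} is taken from the text above; no other
return values are claimed here:

\starttyping
\skip0 = 0pt plus 1fi

\the\gluestretchorder    \skip0 % native value in the range 0 .. 4
\the\eTeXgluestretchorder\skip0 % compatible value, -1 for fi
\stoptyping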
141
142\stopsubsection
143
144\startsubsection[title=Changes from \PDFTEX\ 1.40]
145
146\topicindex {\PDFTEX}
147
148Because we want to produce \PDF\ the most natural starting point was the popular
149\PDFTEX\ program. We inherited the stable features, dropped most of the
150experimental code and promoted some functionality to core \LUATEX\ functionality
151which in turn triggered renaming primitives.
152
153For compatibility reasons we still refer to \type {\pdf...} commands but \LUATEX\
154has a different backend interface. Instead of these primitives there are three
155interfacing primitives: \lpr {pdfextension}, \lpr {pdfvariable} and \lpr
156{pdffeedback} that take keywords and optional further arguments (below we will
157still use the \tex {pdf} prefix names as reference). This way we can extend the
158features when needed but don't need to adapt the core engine. The front- and
159backend are decoupled as much as possible.
160
161\startitemize
162
163\startitem
164    The (experimental) support for snap nodes has been removed, because it is
165    much more natural to build this functionality on top of node processing and
166    attributes. The associated primitives that are gone are: \orm
167    {pdfsnaprefpoint}, \orm {pdfsnapy}, and \orm {pdfsnapycomp}.
168\stopitem
169
170\startitem
171    The (experimental) support for specialized spacing around nodes has also been
172    removed. The associated primitives that are gone are: \orm
173    {pdfadjustinterwordglue}, \orm {pdfprependkern}, and \orm {pdfappendkern}, as
174    well as the five supporting primitives \orm {knbscode}, \orm {stbscode}, \orm
175    {shbscode}, \orm {knbccode}, and \orm {knaccode}.
176\stopitem
177
178\startitem
179    A number of \quote {\PDFTEX\ primitives} have been removed as they can be
180    implemented using \LUA: \orm {pdfelapsedtime}, \orm {pdfescapehex}, \orm
181    {pdfescapename}, \orm {pdfescapestring}, \orm {pdffiledump}, \orm
182    {pdffilemoddate}, \orm {pdffilesize}, \orm {pdfforcepagebox}, \orm
183    {pdflastmatch}, \orm {pdfmatch}, \orm {pdfmdfivesum}, \orm {pdfmovechars},
184    \orm {pdfoptionalwaysusepdfpagebox}, \orm {pdfoptionpdfinclusionerrorlevel},
185    \orm {pdfresettimer}, \orm {pdfshellescape}, \orm {pdfstrcmp} and \orm
186    {pdfunescapehex}.
187\stopitem
188
189\startitem
190    The version related primitives \orm {pdftexbanner}, \orm {pdftexversion}
191    and \orm {pdftexrevision} are no longer present as there is no longer a
192    relationship with \PDFTEX\ development.
193\stopitem
194
195\startitem
196    The experimental snapper mechanism has been removed and therefore also the
197    primitives \orm {pdfignoreddimen}, \orm {pdffirstlineheight}, \orm
198    {pdfeachlineheight}, \orm {pdfeachlinedepth} and \orm {pdflastlinedepth}.
199\stopitem
200
201\startitem
202    The experimental primitives \lpr {primitive}, \lpr {ifprimitive}, \lpr
203    {ifabsnum} and \lpr {ifabsdim} are promoted to core primitives. The \type
204    {\pdf*} prefixed originals are not available.
205\stopitem
206
207\startitem
208    Because \LUATEX\ has a different subsystem for managing images, it has
209    diverged further from its ancestor in the meantime. We don't adapt to
210    changes in \PDFTEX.
211\stopitem
212
213\startitem
214    Two extra token lists are provided, \orm {pdfxformresources} and \orm
215    {pdfxformattr}, as an alternative to \orm {pdfxform} keywords.
216\stopitem
217
218\startitem
219    Image specifications also support \type {visiblefilename}, \type
220    {userpassword} and \type {ownerpassword}. The password options are only
221    relevant for encrypted \PDF\ files.
222\stopitem
223
224\startitem
225    The current version of \LUATEX\ no longer replaces and|/|or merges fonts in
226    embedded \PDF\ files with fonts of the enveloping \PDF\ document. This
227    regression may be temporary, depending on what the rewritten font backend will
228    look like.
229\stopitem
230
231\startitem
232    The primitives \orm {pdfpagewidth} and \orm {pdfpageheight} have been removed
233    because \lpr {pagewidth} and \lpr {pageheight} have that purpose.
234\stopitem
235
236\startitem
237    The primitives \orm {pdfnormaldeviate}, \orm {pdfuniformdeviate}, \orm
238    {pdfsetrandomseed} and \orm {pdfrandomseed} have been promoted to core
239    primitives without \type {pdf} prefix so the original commands are no longer
240    recognized.
241\stopitem
242
243\startitem
244    The primitives \lpr {ifincsname}, \lpr {expanded} and \lpr {quitvmode}
245    are now core primitives.
246\stopitem
247
248\startitem
249    As the hz and protrusion mechanisms are part of the core the related
250    primitives \lpr {lpcode}, \lpr {rpcode}, \lpr {efcode}, \lpr
251    {leftmarginkern}, \lpr {rightmarginkern} are promoted to core primitives. The
252    two commands \lpr {protrudechars} and \lpr {adjustspacing} replace their
253    \type {\pdf} prefixed originals.
254\stopitem
255
256\startitem
257    The hz optimization code has been partially redone so that we no longer need
258    to create extra font instances. The front- and backend have been decoupled
259    and more efficient (\PDF) code is generated.
260\stopitem
261
262\startitem
263    When \lpr {adjustspacing} has value~2, hz optimization will be applied to
264    glyphs and kerns. When the value is~3, only glyphs will be treated. A value
265    smaller than~2 disables this feature. With a value of~1, font expansion is
266    applied after \TEX's normal paragraph breaking routines have broken the
267    paragraph into lines. In this case, line breaks are identical to standard
268    \TEX\ behavior (as with \PDFTEX).
269\stopitem
270
271\startitem
272    The \lpr {tagcode} primitive is promoted to core primitive.
273\stopitem
274
275\startitem
276    The \lpr {letterspacefont} feature is now part of the core but will not be
277    changed (improved). We just provide it for legacy use.
278\stopitem
279
280\startitem
281    The \orm {pdfnoligatures} primitive is now \lpr {ignoreligaturesinfont}.
282\stopitem
283
284\startitem
285    The \orm {pdfcopyfont} primitive is now \lpr {copyfont}.
286\stopitem
287
288\startitem
289    The \orm {pdffontexpand} primitive is now \lpr {expandglyphsinfont}.
290\stopitem
291
292\startitem
293    Because position tracking is also available in \DVI\ mode the \lpr {savepos},
294    \lpr {lastxpos} and \lpr {lastypos} commands now replace their \type {pdf}
295    prefixed originals.
296\stopitem
297
298\startitem
299    The introspective primitives \type {\pdflastximagecolordepth} and \type
300    {\pdfximagebbox} have been removed. One can use external applications to
301    determine these properties or use the built|-|in \type {img} library.
302\stopitem
303
304\startitem
305    The initializer \orm {pdfoutput} has been replaced by \lpr {outputmode} and
306    \orm {pdfdraftmode} is now \lpr {draftmode}.
307\stopitem
308
309\startitem
310    The pixel multiplier dimension \orm {pdfpxdimen} lost its prefix and is now
311    called \lpr {pxdimen}.
312\stopitem
313
314\startitem
315    An extra \orm {pdfimageaddfilename} option has been added that can be used to
316    block writing the filename to the \PDF\ file.
317\stopitem
318
319\startitem
320    The primitive \orm {pdftracingfonts} is now \lpr {tracingfonts} as it
321    doesn't relate to the backend.
322\stopitem
323
324\startitem
325    The experimental primitive \orm {pdfinsertht} is kept as \lpr {insertht}.
326\stopitem
327
328\startitem
329    There is some more control over what metadata goes into the \PDF\ file.
330\stopitem
331
332\startitem
333    The promotion of primitives to core primitives as well as the separation of
334    front- and backend means that the initialization namespace \type {pdftex} is
335    gone.
336\stopitem
337
338\stopitemize
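
Some of the renames listed above could be used as follows. This is only a sketch:
the values are arbitrary, and it assumes a format in which these \LUATEX\
primitives have been enabled under their unprefixed names.

\starttyping
\outputmode    1     % was: \pdfoutput 1
\pagewidth     210mm % was: \pdfpagewidth
\pageheight    297mm % was: \pdfpageheight
\protrudechars 2     % was: \pdfprotrudechars
\adjustspacing 2     % was: \pdfadjustspacing

\savepos             % was: \pdfsavepos
\stoptyping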
339
340One change involves the so called xforms and ximages. In \PDFTEX\ these are
341implemented as so called whatsits. But contrary to other whatsits they have
342dimensions that need to be taken into account when for instance calculating
343optimal line breaks. In \LUATEX\ these are now promoted to a special type of rule
344nodes, which simplifies code that needs those dimensions.
345
346Another reason for promotion is that these are useful concepts. Backends can
347provide the ability to use content that has been rendered in several places, and
348images are also common. As already mentioned in \in {section}
349[sec:imagedandforms], we now have:
350
351\starttabulate[|l|l|]
352\DB \LUATEX \BC \PDFTEX \NC \NR
353\TB
354\NC \lpr {saveboxresource}             \NC \orm {pdfxform}           \NC \NR
355\NC \lpr {saveimageresource}           \NC \orm {pdfximage}          \NC \NR
356\NC \lpr {useboxresource}              \NC \orm {pdfrefxform}        \NC \NR
357\NC \lpr {useimageresource}            \NC \orm {pdfrefximage}       \NC \NR
358\NC \lpr {lastsavedboxresourceindex}   \NC \orm {pdflastxform}       \NC \NR
359\NC \lpr {lastsavedimageresourceindex} \NC \orm {pdflastximage}      \NC \NR
360\NC \lpr {lastsavedimageresourcepages} \NC \orm {pdflastximagepages} \NC \NR
361\LL
362\stoptabulate
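
A box resource can then be saved and reused along these lines (a sketch; the box
number and the macro name \type {\MyResource} are arbitrary and only serve as
illustration):

\starttyping
\setbox0\hbox{Reusable content}

\saveboxresource 0
\edef\MyResource{\the\lastsavedboxresourceindex}

\useboxresource \MyResource
\stoptyping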
363
364There are a few \lpr {pdffeedback} features that relate to this but these are
365typical backend specific ones. The index that gets returned is to be considered
366as \quote {just a number} and although it still has the same meaning (object
367related) as before, you should not depend on that.
368
369The protrusion detection mechanism is enhanced a bit to handle somewhat more complex
370situations. When protrusion characters are identified some nodes are skipped:
371
372\startitemize[packed,columns,two]
373\startitem zero glue \stopitem
374\startitem penalties \stopitem
375\startitem empty discretionaries \stopitem
376\startitem normal zero kerns \stopitem
377\startitem rules with zero dimensions \stopitem
378\startitem math nodes with a surround of zero \stopitem
379\startitem dir nodes \stopitem
380\startitem empty horizontal lists \stopitem
381\startitem local par nodes \stopitem
382\startitem inserts, marks and adjusts \stopitem
383\startitem boundaries \stopitem
384\startitem whatsits \stopitem
385\stopitemize
386
387Because this may not be enough, you can also use a protrusion boundary node to
388have an adjacent node ignored. When the value is~1 or~3, the next node will be
389ignored in the test when locating a left boundary condition. When the value is~2
390or~3, the previous node will be ignored when locating a right boundary condition
391(the search goes from right to left). This permits protrusion combined with for
392instance content moved into the margin:
393
394\starttyping
395\protrusionboundary1\llap{!\quad}«Who needs protrusion?»
396\stoptyping
397
398\stopsubsection
399
400\startsubsection[title=Changes from \ALEPH\ RC4]
401
402\topicindex {\ALEPH}
403
404Because we wanted proper directional typesetting the \ALEPH\ mechanisms looked
405most attractive. These are rather close to the ones provided by \OMEGA, so what
406we say next applies to both these programs.
407
408\startitemize
409
410\startitem
411    The extended 16-bit math primitives (\orm {omathcode} etc.) have been
412    removed.
413\stopitem
414
415\startitem
416    The \OCP\ processing has been removed completely and as a consequence, the
417    following primitives have been removed: \orm {ocp}, \orm {externalocp}, \orm
418    {ocplist}, \orm {pushocplist}, \orm {popocplist}, \orm {clearocplists}, \orm
419    {addbeforeocplist}, \orm {addafterocplist}, \orm {removebeforeocplist}, \orm
420    {removeafterocplist} and \orm {ocptracelevel}.
421\stopitem
422
423\startitem
424    \LUATEX\ only understands 4~of the 16~direction specifiers of \ALEPH: \type
425    {TLT} (latin), \type {TRT} (arabic), \type {RTT} (cjk), \type {LTL} (mongolian).
426    All other direction specifiers generate an error. In addition to a keyword
427    driven model we also provide an integer driven one.
428\stopitem
429
430\startitem
431    The input translations from \ALEPH\ are not implemented, the related
432    primitives are not available: \orm {DefaultInputMode}, \orm
433    {noDefaultInputMode}, \orm {noInputMode}, \orm {InputMode}, \orm
434    {DefaultOutputMode}, \orm {noDefaultOutputMode}, \orm {noOutputMode}, \orm
435    {OutputMode}, \orm {DefaultInputTranslation}, \orm
436    {noDefaultInputTranslation}, \orm {noInputTranslation}, \orm
437    {InputTranslation}, \orm {DefaultOutputTranslation}, \orm
438    {noDefaultOutputTranslation}, \orm {noOutputTranslation} and \orm
439    {OutputTranslation}.
440\stopitem
441
442\startitem
443    Several bugs have been fixed and confusing implementation details have been
444    sorted out.
445\stopitem
446
447\startitem
448    The scanner for direction specifications now allows an optional space after
449    the direction is completely parsed.
450\stopitem
451
452\startitem
453    The \type {^^} notation has been extended: after \type {^^^^} four
454    hexadecimal characters are expected and after \type {^^^^^^} six hexadecimal
455    characters have to be given. The original \TEX\ interpretation is still valid
456    for the \type {^^} case but the four and six variants do no backtracking,
457    i.e.\ when they are not followed by the right number of hexadecimal digits
458    they issue an error message. Because \type{^^^} is a normal \TEX\ case, we
459    don't support the odd number of \type {^^^^^} either.
460\stopitem
461
462\startitem
463    Glues {\it immediately after} direction change commands are not legal
464    breakpoints.
465\stopitem
466
467\startitem
468    Several mechanisms that need to be right|-|to|-|left aware have been
469    improved. For instance placement of formula numbers.
470\stopitem
471
472\startitem
473    The page dimension related primitives \lpr {pagewidth} and \lpr {pageheight}
474    have been promoted to core primitives. The \prm {hoffset} and \prm {voffset}
475    primitives have been fixed.
476\stopitem
477
478\startitem
479    The primitives \type {\charwd}, \type {\charht}, \type {\chardp} and \type
480    {\charit} have been removed as we have the \ETEX\ variants \type
481    {\fontchar*}.
482\stopitem
483
484\startitem
485    The two dimension registers \lpr {pagerightoffset} and \lpr
486    {pagebottomoffset} are now core primitives.
487\stopitem
488
489\startitem
490    The direction related primitives \lpr {pagedir}, \lpr {bodydir}, \lpr
491    {pardir}, \lpr {textdir}, \lpr {mathdir} and \lpr {boxdir} are now core
492    primitives.
493\stopitem
494
495\startitem
496    The promotion of primitives to core primitives as well as the removal of all
497    others means that the initialization namespace \type {aleph} that early
498    versions of \LUATEX\ provided is gone.
499\stopitem
500
501\stopitemize
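
The extended \type {^^} notation described above can be illustrated as follows
(a sketch; the code points are arbitrary, and the hexadecimal digits are given in
lowercase as in traditional \TEX):

\starttyping
^^41         % traditional notation: character "41
^^^^20ac     % four hexadecimal digits: U+20AC
^^^^^^01d11e % six hexadecimal digits: U+1D11E
\stoptyping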
502
503The above can be summarized as: we took the 32 bit aspects and much of the
504directional mechanisms and merged them into the \PDFTEX\ code base as starting
505point for further development. Then we simplified directionality, fixed it and
506opened it up.
507
508\stopsubsection
509
510\startsubsection[title=Changes from anywhere]
511
512The \type {\partokenname} and \type {\partokencontext} primitives are taken from
513the \PDFTEX\ change file posted on the implementers list. They are explained in
514the \PDFTEX\ manual and are classified as \ETEX\ extensions.
515
516\stopsubsection
517
518\startsubsection[title=Changes from standard \WEBC]
519
520\topicindex {\WEBC}
521
522The compilation framework is \WEBC\ and we keep using that but without the
523\PASCAL\ to \CCODE\ step. This framework also provides some common features that
524deal with reading bytes from files and locating files in \TDS. This is what we do
525differently:
526
527\startitemize
528
529\startitem
530    There is no mltex support.
531\stopitem
532
533\startitem
534    There is no enctex support.
535\stopitem
536
537\startitem
538    The following encoding related command line switches are silently ignored,
539    even in non|-|\LUA\ mode: \type {-8bit}, \type {-translate-file}, \type
540    {-mltex}, \type {-enc} and \type {-etex}.
541\stopitem
542
543\startitem
544    The \prm {openout} whatsits are not written to the log file.
545\stopitem
546
547\startitem
548    Some of the so|-|called \WEBC\ extensions are hard to set up in non|-|\KPSE\
549    mode because \type {texmf.cnf} is not read: \type {shell-escape} is off (but
550    that is not a problem because of \LUA's \type {os.execute}), and the paranoia
551    checks on \type {openin} and \type {openout} do not happen. However, it is
552    easy for a \LUA\ script to do this itself by overloading \type {io.open} and
553    the like.
554\stopitem
555
556\startitem
557    The \quote{E} option does not do anything useful.
558\stopitem
559
560\stopitemize
561
562\stopsubsection
563
564\stopsection
565
566\startsection[reference=backendprimitives,title=The backend primitives]
567
568\startsubsection[title={Fewer primitives}]
569
570\topicindex {backend}
571\topicindex {\PDF+backend}
572
573In a previous section we mentioned that some \PDFTEX\ primitives were removed and
574others promoted to core \LUATEX\ primitives. That is only part of the story. In
575order to separate the backend specific primitives in the code these commands are
576now replaced by only a few. In traditional \TEX\ we only had the \DVI\ backend
577but now we have two: \DVI\ and \PDF. Additional functionality is implemented as
578\quote {extensions} in \TEX\ speak. By separating more strictly we are able to
579keep the core (frontend) clean and stable and isolate these extensions. If for
580some reason an extra backend option is needed, it can be implemented without
581touching the core. The three \PDF\ backend related primitives are:
582
583\starttyping
584\pdfextension command [specification]
585\pdfvariable  name
586\pdffeedback  name
587\stoptyping
588
589An extension triggers further parsing, depending on the command given. A variable is
590a (kind of) register and can be read and written, while a feedback is reporting
591something (as it comes from the backend it's normally a sequence of tokens).
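
A short sketch of how the three primitives might be used (the values are
arbitrary; the keywords \type {literal}, \type {compresslevel} and \type
{lastobj} are among those that show up further on):

\starttyping
\pdfextension literal {1 0 0 rg}       % an extension: parses further arguments
\pdfvariable  compresslevel 9          % a variable: assigned like a register
\the\numexpr\pdffeedback lastobj\relax % a feedback: expands to tokens
\stoptyping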
592
593\stopsubsection
594
595\startsubsection[title={\lpr{pdfextension}, \lpr {pdfvariable} and \lpr {pdffeedback}},reference=sec:pdfextensions]
596
597In order for \LUATEX\ to be more than just \TEX\ you need to enable primitives. That
598has already been the case right from the start. If you want the traditional \PDFTEX\
599primitives (as far as their functionality is still around) you can now do this:
600
601\starttyping
602\protected\def\pdfliteral             {\pdfextension literal}
603\protected\def\pdflateliteral         {\pdfextension lateliteral}
604\protected\def\pdfcolorstack          {\pdfextension colorstack}
605\protected\def\pdfsetmatrix           {\pdfextension setmatrix}
606\protected\def\pdfsave                {\pdfextension save\relax}
607\protected\def\pdfrestore             {\pdfextension restore\relax}
608\protected\def\pdfobj                 {\pdfextension obj }
609\protected\def\pdfrefobj              {\pdfextension refobj }
610\protected\def\pdfannot               {\pdfextension annot }
611\protected\def\pdfstartlink           {\pdfextension startlink }
612\protected\def\pdfendlink             {\pdfextension endlink\relax}
613\protected\def\pdfoutline             {\pdfextension outline }
614\protected\def\pdfdest                {\pdfextension dest }
615\protected\def\pdfthread              {\pdfextension thread }
616\protected\def\pdfstartthread         {\pdfextension startthread }
617\protected\def\pdfendthread           {\pdfextension endthread\relax}
618\protected\def\pdfinfo                {\pdfextension info }
619\protected\def\pdfcatalog             {\pdfextension catalog }
620\protected\def\pdfnames               {\pdfextension names }
621\protected\def\pdfincludechars        {\pdfextension includechars }
622\protected\def\pdffontattr            {\pdfextension fontattr }
623\protected\def\pdfmapfile             {\pdfextension mapfile }
624\protected\def\pdfmapline             {\pdfextension mapline }
625\protected\def\pdftrailer             {\pdfextension trailer }
626\protected\def\pdfglyphtounicode      {\pdfextension glyphtounicode }
627\protected\def\pdfrunninglinkoff      {\pdfextension linkstate 1 }
628\protected\def\pdfrunninglinkon       {\pdfextension linkstate 0 }
629\stoptyping
630
631The introspective primitives can be defined as:
632
633\starttyping
634\def\pdftexversion    {\numexpr\pdffeedback version\relax}
635\def\pdftexrevision           {\pdffeedback revision}
636\def\pdflastlink      {\numexpr\pdffeedback lastlink\relax}
637\def\pdfretval        {\numexpr\pdffeedback retval\relax}
638\def\pdflastobj       {\numexpr\pdffeedback lastobj\relax}
639\def\pdflastannot     {\numexpr\pdffeedback lastannot\relax}
640\def\pdfxformname     {\numexpr\pdffeedback xformname\relax}
641\def\pdfcreationdate          {\pdffeedback creationdate}
642\def\pdffontname      {\numexpr\pdffeedback fontname\relax}
643\def\pdffontobjnum    {\numexpr\pdffeedback fontobjnum\relax}
644\def\pdffontsize      {\dimexpr\pdffeedback fontsize\relax}
645\def\pdfpageref       {\numexpr\pdffeedback pageref\relax}
646\def\pdfcolorstackinit        {\pdffeedback colorstackinit}
647\stoptyping
648
649The configuration related registers have become:
650
651\starttyping
652\edef\pdfcompresslevel            {\pdfvariable compresslevel}
653\edef\pdfobjcompresslevel         {\pdfvariable objcompresslevel}
654\edef\pdfrecompress               {\pdfvariable recompress}
655\edef\pdfdecimaldigits            {\pdfvariable decimaldigits}
656\edef\pdfgamma                    {\pdfvariable gamma}
657\edef\pdfimageresolution          {\pdfvariable imageresolution}
658\edef\pdfimageapplygamma          {\pdfvariable imageapplygamma}
659\edef\pdfimagegamma               {\pdfvariable imagegamma}
660\edef\pdfimagehicolor             {\pdfvariable imagehicolor}
661\edef\pdfimageaddfilename         {\pdfvariable imageaddfilename}
662\edef\pdfpkresolution             {\pdfvariable pkresolution}
663\edef\pdfpkfixeddpi               {\pdfvariable pkfixeddpi}
664\edef\pdfinclusioncopyfonts       {\pdfvariable inclusioncopyfonts}
665\edef\pdfinclusionerrorlevel      {\pdfvariable inclusionerrorlevel}
666\edef\pdfignoreunknownimages      {\pdfvariable ignoreunknownimages}
667\edef\pdfgentounicode             {\pdfvariable gentounicode}
668\edef\pdfomitcidset               {\pdfvariable omitcidset}
669\edef\pdfomitcharset              {\pdfvariable omitcharset}
670\edef\pdfomitinfodict             {\pdfvariable omitinfodict}
671\edef\pdfomitmediabox             {\pdfvariable omitmediabox}
672\edef\pdfpagebox                  {\pdfvariable pagebox}
673\edef\pdfminorversion             {\pdfvariable minorversion}
674\edef\pdfuniqueresname            {\pdfvariable uniqueresname}
675
676\edef\pdfhorigin                  {\pdfvariable horigin}
677\edef\pdfvorigin                  {\pdfvariable vorigin}
678\edef\pdflinkmargin               {\pdfvariable linkmargin}
679\edef\pdfdestmargin               {\pdfvariable destmargin}
680\edef\pdfthreadmargin             {\pdfvariable threadmargin}
681\edef\pdfxformmargin              {\pdfvariable xformmargin}
682
683\edef\pdfpagesattr                {\pdfvariable pagesattr}
684\edef\pdfpageattr                 {\pdfvariable pageattr}
685\edef\pdfpageresources            {\pdfvariable pageresources}
686\edef\pdfxformattr                {\pdfvariable xformattr}
687\edef\pdfxformresources           {\pdfvariable xformresources}
688\edef\pdfpkmode                   {\pdfvariable pkmode}
689
690\edef\pdfsuppressoptionalinfo     {\pdfvariable suppressoptionalinfo }
691\edef\pdftrailerid                {\pdfvariable trailerid }
692\stoptyping
693
694The variables are internal ones, so they are anonymous. When you ask for the
695meaning of a few previously defined ones:
696
697\starttyping
698\meaning\pdfhorigin
699\meaning\pdfcompresslevel
700\meaning\pdfpageattr
701\stoptyping
702
703you will get:
704
705\starttyping
706macro:->[internal backend dimension]
707macro:->[internal backend integer]
708macro:->[internal backend tokenlist]
709\stoptyping
710
711The \prm {edef} can also be a \prm {def} but it's a bit more efficient to expand
712the lookup related register beforehand.
713
714The backend is derived from \PDFTEX\ so the same syntax applies. However, the
715\type {outline} command accepts an \type {objnum} followed by a number. No
716checking takes place so when this is used it had better be a valid (flushed)
717object.
718
719In order to be (more or less) compatible with \PDFTEX\ we also support the option
720to suppress some info but we do so via a bitset:
721
722\starttyping
723\pdfvariable suppressoptionalinfo \numexpr
724        0
725    +   1   % PTEX.FullBanner
726    +   2   % PTEX.FileName
727    +   4   % PTEX.PageNumber
728    +   8   % PTEX.InfoDict
729    +  16   % Creator
730    +  32   % CreationDate
731    +  64   % ModDate
732    + 128   % Producer
733    + 256   % Trapped
734    + 512   % ID
735\relax
736\stoptyping
737
738In addition you can overload the trailer id, but we don't do any checking on
739validity, so you have to pass a valid array. The following is like the ones
740normally generated by the engine. You even need to include the brackets here!
741
742\starttyping
743\pdfvariable trailerid {[
744    <FA052949448907805BA83C1E78896398>
745    <FA052949448907805BA83C1E78896398>
746]}
747\stoptyping
748
749Although we started from a merge of \PDFTEX\ and \ALEPH, by now the code base as
750well as functionality has diverged from those parents. Here we show the options
751that can be passed to the extensions. The \type {shipout} option is a compatibility
752feature. Instead one can use the \type {deferred} prefix.
753
754\starttexsyntax
755\pdfextension literal
756    [shipout] [ direct | page | raw ] { tokens }
757\stoptexsyntax
758
759\starttexsyntax
760\pdfextension dest
761    num <integer> | name { tokens }!crlf
762    [ fitbh | fitbv | fitb | fith | fitv | fit |
763      fitr <rule spec> | xyz [ zoom <integer> ] ]
764\stoptexsyntax
765
766\starttexsyntax
767\pdfextension annot
768    reserveobjnum | useobjnum <integer>
769    { tokens }
770\stoptexsyntax
771
772\starttexsyntax
773\pdfextension save
774\stoptexsyntax
775
776\starttexsyntax
777\pdfextension restore
778\stoptexsyntax
779
780\starttexsyntax
781\pdfextension setmatrix
782    { tokens }
783\stoptexsyntax
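
The three commands above normally go together. The following is a minimal
sketch, assuming the \PDFTEX\ conventions still hold: \type {setmatrix} takes the
four numbers of a transformation matrix (no translation) and \type {save} and
\type {restore} have to end up at the same position, which is why the content is
put in a zero width box.

\starttyping
\hbox {%
    \pdfextension save
    \pdfextension setmatrix {0 1 -1 0}% rotate 90 degrees counterclockwise
    \hbox to 0pt {rotated\hss}%
    \pdfextension restore
}
\stoptyping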
784
785\starttexsyntax
786[ \immediate ] \pdfextension obj
787    reserveobjnum
788\stoptexsyntax
789
790\starttexsyntax
791[ \immediate ] \pdfextension obj
792    [ useobjnum <integer> ]
793    [ uncompressed ]
794    [ stream  [ attr { tokens } ] ]
795    [ file ]
796    { tokens }
797\stoptexsyntax
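
A sketch of how the two forms can be combined: first reserve a number and later
fill in the object. The macro name \type {\MyObject} and the dictionary content
are made up for this example, and we assume that \type {\pdffeedback lastobj}
reports the number that was just reserved.

\starttyping
\pdfextension obj reserveobjnum
\edef\MyObject{\the\numexpr\pdffeedback lastobj\relax}
\immediate\pdfextension obj useobjnum \MyObject {<< /Type /Example >>}
\stoptyping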
798
799\starttexsyntax
800\pdfextension refobj
801    <integer>
802\stoptexsyntax
803
804\starttexsyntax
805\pdfextension colorstack
806    <integer>
807    set { tokens } | push { tokens } | pop | current
808\stoptexsyntax
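
For instance, assuming that stack~\type {0} is the page color stack that (as in
\PDFTEX) is available by default, a color can be changed locally by pushing
\PDF\ operators and popping them afterwards:

\starttyping
\pdfextension colorstack 0 push {1 0 0 rg 1 0 0 RG}% red fill and stroke
this text is red
\pdfextension colorstack 0 pop
\stoptyping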
809
810\starttexsyntax
811\pdfextension startlink
812    [ attr { tokens } ]
813    user { tokens } | goto | thread
814    [ file { tokens } ]
    [ page <integer> { tokens } | name { tokens } | num <integer> ]
816    [ newwindow  | nonewwindow ]
817\stoptexsyntax
818
819\starttexsyntax
820\pdfextension endlink
821\stoptexsyntax
822
823\starttexsyntax
824\pdfextension startthread
825    num <integer> | name { tokens }
826\stoptexsyntax
827
828\starttexsyntax
829\pdfextension endthread
830\stoptexsyntax
831
832\starttexsyntax
833\pdfextension thread
834    num <integer> | name { tokens }
835\stoptexsyntax
836
837\starttexsyntax
838\pdfextension outline
839    [ attr { tokens } ]
840    [ useobjnum <integer> ]
841    [ count <integer> ]
842    { tokens }
843\stoptexsyntax
844
845\starttexsyntax
846\pdfextension glyphtounicode
847    { tokens }
848    { tokens }
849\stoptexsyntax
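
As with the \PDFTEX\ command that this extension derives from, we assume here
that the first argument is a glyph name and the second one a list of hexadecimal
\UNICODE\ values. A hedged example that maps the \type {ff} ligature onto two
characters:

\starttyping
\pdfextension glyphtounicode {ff} {0066 0066}
\stoptyping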
850
851\starttexsyntax
852\pdfextension catalog
853    { tokens }
854    [ openaction
855      user { tokens } | goto | thread
856      [ file { tokens } ]
857      [ page <integer> { tokens } | name { tokens } | num <integer> ]
858      [ newwindow  | nonewwindow ] ]
859\stoptexsyntax
860
861\starttexsyntax
862\pdfextension fontattr
863    <integer>
864    {tokens}
865\stoptexsyntax
866
867\starttexsyntax
868\pdfextension mapfile
869    {tokens}
870\stoptexsyntax
871
872\starttexsyntax
873\pdfextension mapline
874    {tokens}
875\stoptexsyntax
876
877\starttexsyntax
878\pdfextension includechars
879    {tokens}
880\stoptexsyntax
881
882\starttexsyntax
883\pdfextension info
884    {tokens}
885\stoptexsyntax
886
887\starttexsyntax
888\pdfextension names
889    {tokens}
890\stoptexsyntax
891
892\starttexsyntax
893\pdfextension trailer
894    {tokens}
895\stoptexsyntax
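
The last few extensions pass their argument on as raw \PDF\ code, so what you
provide has to be valid \PDF. In the following sketch the keys are standard
\PDF\ keys but the values are just placeholders:

\starttyping
\pdfextension info    {/Title (An example) /Author (Someone)}
\pdfextension catalog {/PageMode /UseOutlines}
\stoptyping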
896
897\stopsubsection
898
899\startsubsection[title={Defaults}]
900
901The engine sets the following defaults.
902
903\starttyping
904\pdfcompresslevel         9
905\pdfobjcompresslevel      1 % used: (0,9)
906\pdfrecompress            0 % mostly for debugging
907\pdfdecimaldigits         4 % used: (3,6)
908\pdfgamma              1000
909\pdfimageresolution      71
910\pdfimageapplygamma       0
911\pdfimagegamma         2200
912\pdfimagehicolor          1
913\pdfimageaddfilename      1
914\pdfpkresolution         72
915\pdfpkfixeddpi            0
916\pdfinclusioncopyfonts    0
917\pdfinclusionerrorlevel   0
918\pdfignoreunknownimages   0
919\pdfgentounicode          0
920\pdfomitcidset            0
921\pdfomitcharset           0
922\pdfomitinfodict          0
923\pdfomitmediabox          0
924\pdfpagebox               0
925\pdfminorversion          4
926\pdfuniqueresname         0
927
928\pdfhorigin             1in
929\pdfvorigin             1in
930\pdflinkmargin          0pt
931\pdfdestmargin          0pt
932\pdfthreadmargin        0pt
933\pdfxformmargin         0pt
934\stoptyping
935
936\stopsubsection
937
938\startsubsection[title={Backward compatibility}]
939
940If you also want some backward compatibility, you can add:
941
942\starttyping
943\let\pdfpagewidth      \pagewidth
944\let\pdfpageheight     \pageheight
945
946\let\pdfadjustspacing  \adjustspacing
947\let\pdfprotrudechars  \protrudechars
948\let\pdfnoligatures    \ignoreligaturesinfont
949\let\pdffontexpand     \expandglyphsinfont
950\let\pdfcopyfont       \copyfont
951
952\let\pdfxform          \saveboxresource
953\let\pdflastxform      \lastsavedboxresourceindex
954\let\pdfrefxform       \useboxresource
955
956\let\pdfximage         \saveimageresource
957\let\pdflastximage     \lastsavedimageresourceindex
958\let\pdflastximagepages\lastsavedimageresourcepages
959\let\pdfrefximage      \useimageresource
960
961\let\pdfsavepos        \savepos
962\let\pdflastxpos       \lastxpos
963\let\pdflastypos       \lastypos
964
965\let\pdfoutput         \outputmode
966\let\pdfdraftmode      \draftmode
967
968\let\pdfpxdimen        \pxdimen
969
970\let\pdfinsertht       \insertht
971
972\let\pdfnormaldeviate  \normaldeviate
973\let\pdfuniformdeviate \uniformdeviate
974\let\pdfsetrandomseed  \setrandomseed
975\let\pdfrandomseed     \randomseed
976
977\let\pdfprimitive      \primitive
978\let\ifpdfprimitive    \ifprimitive
979
980\let\ifpdfabsnum       \ifabsnum
981\let\ifpdfabsdim       \ifabsdim
982\stoptyping
983
984And even:
985
986\starttyping
987\newdimen\pdfeachlineheight
988\newdimen\pdfeachlinedepth
989\newdimen\pdflastlinedepth
990\newdimen\pdffirstlineheight
991\newdimen\pdfignoreddimen
992\stoptyping
993
994\stopsubsection
995
996\stopsection
997
998\startsection[title=Directions]
999
1000\topicindex {\OMEGA}
1001\topicindex {\ALEPH}
1002\topicindex {directions}
1003
1004\startsubsection[title={Four directions}]
1005
The directional model in \LUATEX\ is inherited from \OMEGA|/|\ALEPH\ but we tried
to improve it a bit. At some point we played with recovery of modes but that was
disabled later on when we found that it interfered with nested directions. As a
side effect the node list was no longer balanced with respect to directional
nodes, which in turn can cause problems when a series of direction changes
happens without grouping.
1012
When extending the \PDF\ backend to support directions some inconsistencies were
found and as a result we decided to support only the four models that make sense:
\type {TLT} (latin), \type {TRT} (arabic), \type {RTT} (cjk) and \type {LTL}
(mongolian).
1017
1018\stopsubsection
1019
1020\startsubsection[title={How it works}]
1021
1022The approach is that we again make the list balanced but try to avoid some side
1023effects. What happens is quite intuitive if we forget about spaces (turned into
1024glue) but even there what happens makes sense if you look at it in detail.
1025However that logic makes in|-|group switching kind of useless when no proper
1026nested grouping is used: switching from right to left several times nested,
1027results in spacing ending up after each other due to nested mirroring. Of course
1028a sane macro package will manage this for the user but here we are discussing the
1029low level dir injection.
1030
1031This is what happens:
1032
1033\starttyping
1034\textdir TRT nur {\textdir TLT run \textdir TRT NUR} nur
1035\stoptyping
1036
1037This becomes stepwise:
1038
1039\startnarrower
1040\starttyping
1041injected: [+TRT]nur {[+TLT]run [+TRT]NUR} nur
1042balanced: [+TRT]nur {[+TLT]run [-TLT][+TRT]NUR[-TRT]} nur[-TRT]
1043result  : run {RUNrun } run
1044\stoptyping
1045\stopnarrower
1046
1047And this:
1048
1049\starttyping
1050\textdir TRT nur {nur \textdir TLT run \textdir TRT NUR} nur
1051\stoptyping
1052
1053becomes:
1054
1055\startnarrower
1056\starttyping
1057injected: [+TRT]nur {nur [+TLT]run [+TRT]NUR} nur
1058balanced: [+TRT]nur {nur [+TLT]run [-TLT][+TRT]NUR[-TRT]} nur[-TRT]
1059result  : run {run RUNrun } run
1060\stoptyping
1061\stopnarrower
1062
1063Now, in the following examples watch where we put the braces:
1064
1065\startbuffer
1066\textdir TRT nur {{\textdir TLT run} {\textdir TRT NUR}} nur
1067\stopbuffer
1068
1069\typebuffer
1070
1071This becomes:
1072
1073\startnarrower
1074\getbuffer
1075\stopnarrower
1076
1077Compare this to:
1078
1079\startbuffer
1080\textdir TRT nur {{\textdir TLT run }{\textdir TRT NUR}} nur
1081\stopbuffer
1082
1083\typebuffer
1084
1085Which renders as:
1086
1087\startnarrower
1088\getbuffer
1089\stopnarrower
1090
1091So how do we deal with the next?
1092
1093\startbuffer
1094\def\ltr{\textdir TLT\relax}
1095\def\rtl{\textdir TRT\relax}
1096
1097run {\rtl nur {\ltr run \rtl NUR \ltr run \rtl NUR} nur}
1098run {\ltr run {\rtl nur \ltr RUN \rtl nur \ltr RUN} run}
1099\stopbuffer
1100
1101\typebuffer
1102
1103It gets typeset as:
1104
1105\startnarrower
1106\startlines
1107\getbuffer
1108\stoplines
1109\stopnarrower
1110
We could define the two helpers to look back, pick up a skip, remove it and
inject it after the dir node. But that way we lose the subtype information that
for some applications can be handy to be kept as|-|is. This is why we now have a
variant of \lpr {textdir} which injects the balanced node before the skip.
Instead of the previous definition we can use:
1116
1117\startbuffer[def]
1118\def\ltr{\linedir TLT\relax}
1119\def\rtl{\linedir TRT\relax}
1120\stopbuffer
1121
1122\typebuffer[def]
1123
1124and this time:
1125
1126\startbuffer[txt]
1127run {\rtl nur {\ltr run \rtl NUR \ltr run \rtl NUR} nur}
1128run {\ltr run {\rtl nur \ltr RUN \rtl nur \ltr RUN} run}
1129\stopbuffer
1130
1131\typebuffer[txt]
1132
1133comes out as a properly spaced:
1134
1135\startnarrower
1136\startlines
1137\getbuffer[def,txt]
1138\stoplines
1139\stopnarrower
1140
Anything more complex than this, like a combination of skips and penalties, or
kerns, should be handled in the input or macro package because there is no way we
can predict the expected behaviour. In fact, the \lpr {linedir} is just a
convenience extra which could also have been implemented using node list parsing.
1145
Directions are complicated by the fact that they often need to work over groups
so a separate grouping related stack is used. A side effect is that there can be
paragraphs with only a local par node followed by direction synchronization
nodes. Paragraphs like that are seen as empty paragraphs and therefore ignored.
Because \type {\noindent} doesn't inject anything but an \type {\indent} injects
a box, paragraphs with only an indent and directions are handled as paragraphs
with content.
1153
1154\stopsubsection
1155
1156\startsubsection[title={Controlling glue with \lpr {breakafterdirmode}}]
1157
1158Glue after a dir node is ignored in the linebreak decision but you can bypass that
1159by setting \lpr {breakafterdirmode} to~\type {1}. The following table shows the
1160difference. Watch your spaces.
1161
1162\def\ShowSome#1{%
1163    \BC \type{#1}
1164    \NC \breakafterdirmode\zerocount\hsize\zeropoint#1
1165    \NC
1166    \NC \breakafterdirmode\plusone\hsize\zeropoint#1
1167    \NC
1168    \NC \NR
1169}
1170
1171\starttabulate[|l|Tp(1pt)|w(5em)|Tp(1pt)|w(5em)|]
1172    \DB
1173    \BC \type{0}
1174    \NC
1175    \BC \type{1}
1176    \NC
1177    \NC \NR
1178    \TB
1179    \ShowSome{pre {\textdir TLT xxx} post}
1180    \ShowSome{pre {\textdir TLT xxx }post}
1181    \ShowSome{pre{ \textdir TLT xxx} post}
1182    \ShowSome{pre{ \textdir TLT xxx }post}
1183    \ShowSome{pre { \textdir TLT xxx } post}
1184    \ShowSome{pre {\textdir TLT\relax\space xxx} post}
1185    \LL
1186\stoptabulate
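
Outside a table a rough sketch of the same experiment looks as follows. The
space before \type {post} comes after the dir node that closes the group, so
that is the glue affected by the setting:

\starttyping
\breakafterdirmode 1 % the glue after a dir node can now be a breakpoint
pre {\textdir TLT xxx} post
\stoptyping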
1187
1188\stopsubsection
1189
\startsubsection[title={Controlling parshapes with \lpr {shapemode}}]
1191
1192Another adaptation to the \ALEPH\ directional model is control over shapes driven
1193by \prm {hangindent} and \prm {parshape}. This is controlled by a new parameter
1194\lpr {shapemode}:
1195
1196\starttabulate[|c|l|l|]
1197\DB value    \BC \prm {hangindent} \BC \prm {parshape} \NC \NR
1198\TB
1199\BC \type{0} \NC  normal             \NC normal            \NC \NR
1200\BC \type{1} \NC  mirrored           \NC normal            \NC \NR
1201\BC \type{2} \NC  normal             \NC mirrored          \NC \NR
1202\BC \type{3} \NC  mirrored           \NC mirrored          \NC \NR
1203\LL
1204\stoptabulate
1205
1206The value is reset to zero (like \prm {hangindent} and \prm {parshape})
1207after the paragraph is done with. You can use negative values to prevent
1208this. In \in {figure} [fig:shapemode] a few examples are given.
1209
1210\startplacefigure[reference=fig:shapemode,title={The effect of \type {shapemode}.}]
1211    \startcombination[2*3]
1212        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1213            \hsize .45\textwidth \switchtobodyfont[6pt]
1214                \pardir TLT \textdir TLT
1215                \hangindent 40pt \hangafter -3
1216                \leftskip10pt \input tufte \par
1217         \egroup} {TLT: hangindent}
1218        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1219            \hsize .45\textwidth \switchtobodyfont[6pt]
1220            \pardir TLT \textdir TLT
1221            \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1222            \input tufte \par
1223         \egroup} {TLT: parshape}
1224        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1225            \hsize .45\textwidth \switchtobodyfont[6pt]
1226            \pardir TRT \textdir TRT
1227            \hangindent 40pt \hangafter -3
1228            \leftskip10pt \input tufte \par
1229         \egroup} {TRT: hangindent mode 0}
1230        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1231            \hsize .45\textwidth \switchtobodyfont[6pt]
1232            \pardir TRT \textdir TRT
1233            \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1234            \input tufte \par
1235         \egroup} {TRT: parshape mode 0}
1236        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1237            \hsize .45\textwidth \switchtobodyfont[6pt]
1238            \shapemode=3
1239            \pardir TRT \textdir TRT
1240            \hangindent 40pt \hangafter -3
1241            \leftskip10pt \input tufte \par
1242         \egroup} {TRT: hangindent mode 1 & 3}
1243        {\ruledvbox \bgroup \setuptolerance[verytolerant]
1244            \hsize .45\textwidth \switchtobodyfont[6pt]
1245            \shapemode=3
1246            \pardir TRT \textdir TRT
1247            \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1248            \input tufte \par
1249         \egroup} {TRT: parshape mode 2 & 3}
1250    \stopcombination
1251\stopplacefigure
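
Stripped from the framing used in the figure, a right to left paragraph with a
mirrored hanging indentation can be set up as in the following sketch (the
sample file \type {tufte} is the one used in the figure, any text will do):

\starttyping
\shapemode 1 % mirror \hangindent, leave \parshape alone
\pardir TRT \textdir TRT
\hangindent 40pt \hangafter -3
\input tufte \par
% here \shapemode is zero again
\stoptyping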
1252
1253\stopsubsection
1254
1255\startsubsection[title={Symbols or numbers}]
1256
1257Internally the implementation is different from \ALEPH. First of all we use no
1258whatsits but dedicated nodes, but also we have only 4 directions that are mapped
1259onto 4 numbers. A text direction node can mark the start or end of a sequence of
1260nodes, and therefore has two states. At the \TEX\ end we don't see these states
1261because \TEX\ itself will add proper end state nodes if needed.
1262
1263The symbolic names \type {TLT}, \type {TRT}, etc.\ originate in \OMEGA. In
1264\LUATEX\ we also have a number based model which sometimes makes more sense.
1265
1266\starttabulate[|c|l|l|]
1267\DB value     \BC equivalent \NC \NR
1268\TB
1269\BC \type {0} \NC TLT \NC \NR
1270\BC \type {1} \NC TRT \NC \NR
1271\BC \type {2} \NC LTL \NC \NR
1272\BC \type {3} \NC RTT \NC \NR
1273\LL
1274\stoptabulate
1275
We support the \OMEGA\ primitives \orm {textdir}, \orm {pardir}, \orm {pagedir},
\orm {bodydir} and \orm {mathdir}. These accept three character keywords. The
primitives that set the direction by number are: \lpr {textdirection}, \lpr
{pardirection}, \lpr {pagedirection}, \lpr {bodydirection} and \lpr
{mathdirection}. When specifying a direction for a box you can use \type {bdir}
instead of \type {dir}.
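
Using the mapping from the table above, the following pairs are meant to be
equivalent. This is a sketch and the box variant assumes that \type {bdir}
takes the number where \type {dir} takes the keyword.

\starttyping
\textdir TRT     % keyword based
\textdirection 1 % number based, same direction

\pardir TLT
\pardirection 0

\hbox dir TRT {text}
\hbox bdir 1 {text} % assumption: number instead of keyword
\stoptyping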
1282
1283\stopsubsection
1284
1285\stopsection
1286
1287\startsection[title=Implementation notes]
1288
1289\startsubsection[title=Memory allocation]
1290
1291\topicindex {memory}
1292
1293The single internal memory heap that traditional \TEX\ used for tokens and nodes
1294is split into two separate arrays. Each of these will grow dynamically when
1295needed.
1296
1297The \type {texmf.cnf} settings related to main memory are no longer used (these
1298are: \type {main_memory}, \type {mem_bot}, \type {extra_mem_top} and \type
1299{extra_mem_bot}). \quote {Out of main memory} errors can still occur, but the
1300limiting factor is now the amount of RAM in your system, not a predefined limit.
1301
1302Also, the memory (de)allocation routines for nodes are completely rewritten. The
1303relevant code now lives in the C file \type {texnode.c}, and basically uses a
1304dozen or so \quote {avail} lists instead of a doubly|-|linked model. An extra
1305function layer is added so that the code can ask for nodes by type instead of
1306directly requisitioning a certain amount of memory words.
1307
1308Because of the split into two arrays and the resulting differences in the data
1309structures, some of the macros have been duplicated. For instance, there are now
1310\type {vlink} and \type {vinfo} as well as \type {token_link} and \type
1311{token_info}. All access to the variable memory array is now hidden behind a
1312macro called \type {vmem}. We mention this because using the \TEX book as
1313reference is still quite valid but not for memory related details. Another
1314significant detail is that we have double linked node lists and that most nodes
1315carry more data.
1316
1317The input line buffer and pool size are now also reallocated when needed, and the
1318\type {texmf.cnf} settings \type {buf_size} and \type {pool_size} are silently
1319ignored.
1320
1321\stopsubsection
1322
1323\startsubsection[title=Sparse arrays]
1324
1325The \prm {mathcode}, \prm {delcode}, \prm {catcode}, \prm {sfcode}, \prm {lccode}
1326and \prm {uccode} (and the new \lpr {hjcode}) tables are now sparse arrays that
1327are implemented in~\CCODE. They are no longer part of the \TEX\ \quote
1328{equivalence table} and because each had 1.1 million entries with a few memory
1329words each, this makes a major difference in memory usage. Performance is not
1330really hurt by this.
1331
1332The \prm {catcode}, \prm {sfcode}, \prm {lccode}, \prm {uccode} and \lpr {hjcode}
1333assignments don't show up when using the \ETEX\ tracing routines \prm
1334{tracingassigns} and \prm {tracingrestores} but we don't see that as a real
1335limitation.
1336
A side|-|effect of the current implementation is that \prm {global} assignments
are now more expensive in terms of processing than non|-|global ones, but not
many users will notice that.
1340
1341The glyph ids within a font are also managed by means of a sparse array as glyph
1342ids can go up to index $2^{21}-1$ but these are never accessed directly so again
1343users will not notice this.
1344
1345\stopsubsection
1346
1347\startsubsection[title=Simple single|-|character csnames]
1348
1349\topicindex {csnames}
1350
1351Single|-|character commands are no longer treated specially in the internals,
1352they are stored in the hash just like the multiletter csnames.
1353
1354The code that displays control sequences explicitly checks if the length is one
1355when it has to decide whether or not to add a trailing space.
1356
1357Active characters are internally implemented as a special type of multi|-|letter
1358control sequences that uses a prefix that is otherwise impossible to obtain.
1359
1360\stopsubsection
1361
1362\startsubsection[title=The compressed format file]
1363
1364\topicindex {format}
1365
1366The format is passed through \type {zlib}, allowing it to shrink to roughly half
1367of the size it would have had in uncompressed form. This takes a bit more \CPU\
1368cycles but much less disk \IO, so it should still be faster. We use a level~3
1369compression which we found to be the optimal trade|-|off between filesize and
1370decompression speed.
1371
1372\stopsubsection
1373
1374\startsubsection[title=Binary file reading]
1375
1376\topicindex {files+binary}
1377
1378All of the internal code is changed in such a way that if one of the \type
1379{read_xxx_file} callbacks is not set, then the file is read by a \CCODE\ function
1380using basically the same convention as the callback: a single read into a buffer
1381big enough to hold the entire file contents. While this uses more memory than the
1382previous code (that mostly used \type {getc} calls), it can be quite a bit faster
1383(depending on your \IO\ subsystem).
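
To make that convention concrete, here is a sketch of a callback that does what
the built|-|in reader does: fetch the whole file in one go and return a success
flag, the data and its size. It assumes a plain \LUATEX\ run (a macro package
may manage callbacks itself) and picks \type {read_map_file} as just one of the
\type {read_xxx_file} callbacks. Because \type {\directlua} sees its argument as
one line, comments are \TEX\ comments on lines of their own.

\starttyping
\directlua {
    % hypothetical replacement for the built in map file reader
    callback.register("read_map_file", function(name)
        local f = io.open(name, "rb")
        if not f then
            return false
        end
        % a single read that fetches the entire file
        local data = f:read("*a")
        f:close()
        return true, data, string.len(data)
    end)
}
\stoptyping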
1384
1385\stopsubsection
1386
1387\startsubsection[title=Tabs and spaces]
1388
1389\topicindex {space}
1390\topicindex {newline}
1391
We conform to the way other \TEX\ engines handle trailing tabs and spaces. For
decades trailing tabs and spaces (before a newline) were removed from the input
but this behaviour was changed in September 2017 to only handle spaces. We are
aware that this can introduce compatibility issues in existing workflows but
because we don't want too many differences with upstream \TEXLIVE\ we just follow
up on that patch (which is a functional one and not really a fix). It is up to
macro package maintainers to deal with possible compatibility issues and in
\LUATEX\ they can do so via the callbacks that deal with reading from files.
1400
The previous behaviour was a known side effect and (as that kind of input
normally comes from generated sources) it was normally dealt with by adding a
comment token to the line in case the spaces and|/|or tabs were intentional and
to be kept. We are aware of the fact that this contradicts some of our other
choices but consistency with other engines and the fact that in \KPSE\ mode a
common file \IO\ layer is used can have a side effect of breaking compatibility.
We still stick to our view that at the log level we can be (and might be) more
incompatible. We already expose some more details.
1409
1410\stopsubsection
1411
1412\startsubsection[title=Hyperlinks]
1413
1414\topicindex {hyperlinks}
1415
There is an experimental feature that makes multi|-|line hyper links behave a
little better, fixing some side effects that showed up in r2l typesetting but
also can surface in l2r. Because this went unnoticed until 2023, and because it
depends a bit on how macro packages deal with hyper links, the fix is currently
under parameter control:
1421
1422\starttyping
1423\pdfvariable linking = 1
1424\stoptyping
1425
1426That way (we hope) legacy documents come out as expected, whatever those
1427expectations are. One of the aspects dealt with concerns (unusual) left and right
1428skips.
1429
1430\stopsubsection
1431
1432\stopsection
1433
1434\stopchapter
1435
1436\stopcomponent
1437