\environment luatexstyle

\startcomponent luatexmodifications

\startchapter[reference=modifications,title={Modifications}]

\startsection[title=The merged engines]

\startsubsection[title=The need for change]

\topicindex {engines}
\topicindex {history}

The first version of \LUATEX\ only had a few extra primitives and was largely
the same as \PDFTEX. Then we merged substantial parts of \ALEPH\ into the code
and got more primitives. Once the program became more stable, we decided to
clean up its rather hybrid nature. This means that some primitives have
been promoted to core primitives, often with a different name, and that others
were removed. This made it possible to start cleaning up the code base. In \in
{chapter} [enhancements] we discussed some new primitives; here we will cover
most of the adapted ones.

Besides the expected changes caused by new functionality, there are a number of
not-so-expected changes. These are sometimes a side effect of a new
(conflicting) feature, or, more often than not, a change necessary to clean up
the internal interfaces. These will also be mentioned.

\stopsubsection

\startsubsection[title=Changes from \TEX\ 3.1415926]

\topicindex {\TEX}

Of course it all starts with traditional \TEX. Even though we started from
\PDFTEX, most of the code still comes from the original. But we deviate a bit.

\startitemize

\startitem
    The current code base is written in \CCODE, not \PASCAL. We use \CWEB\ when
    possible. As a consequence, instead of one large file plus change files, we
    now have multiple files organized in categories like \type {tex}, \type
    {pdf}, \type {lang}, \type {font}, \type {lua}, etc. There are some artifacts
    of the conversion to \CCODE, but in due time we will clean up the source code
    and make sure that the documentation is done right. Many files are in the
    \CWEB\ format, but others, like those interfacing to \LUA, are \CCODE\ files.
    Of course we want to stay as close as possible to the original so that the
    documentation of the fundamentals behind \TEX\ by Don Knuth still applies.
\stopitem

\startitem
    See \in {chapter} [languages] for many small changes related to paragraph
    building, language handling and hyphenation. The most important change is
    that adding a brace group in the middle of a word (like in \type {of{}fice})
    does not prevent ligature creation.
\stopitem

\startitem
    There is no pool file; all strings are embedded during compilation.
\stopitem

\startitem
    The specifier \type {plus 1 fillll} does not generate an error. The extra
    \quote{l} is simply typeset.
\stopitem

\startitem
    The upper limit to \prm {endlinechar} and \prm {newlinechar} is 127.
\stopitem

\startitem
    Magnification (\prm {mag}) is only supported in \DVI\ output mode. You can
    set this parameter and it even works with \type {true} units until you switch
    to \PDF\ output mode. When you use \PDF\ output it is best not to touch the
    \prm {mag} variable. This fuzzy behaviour is not much different from using
    \PDF\ backend related functionality when \DVI\ output is eventually
    required.

    After the output mode has been frozen (normally that happens when the first
    page is shipped out) or when \PDF\ output is enabled, the \type {true}
    specification is ignored. When you preload a plain format adapted to
    \LUATEX\ it can be that the \prm {mag} parameter has already been set.
\stopitem

\startitem
    When \type {\globaldefs} is positive while a local assignment is asked for,
    \type {{\global enforced}} is shown in the log when \type {\tracingcommands}
    is larger than one. When \type {\globaldefs} is negative and a global
    assignment is requested by \type {\global}, \type {\gdef} etc.\ the log will
    mention \type {{\global canceled}}.
\stopitem

\stopitemize
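
The tracing of \type {\globaldefs} described in the last item can be tried with a
small test. This is only a sketch: the macro name \type {\foo} is made up for the
occasion, and the exact lines that end up in the log may differ between versions.

\starttyping
\tracingcommands 2

\begingroup
    \globaldefs 1
    \def\foo{enforced}  % a local definition is requested but made global
\endgroup

\begingroup
    \globaldefs -1
    \gdef\foo{canceled} % a global definition is requested but kept local
\endgroup
\stoptyping

After the second group \type {\foo} still has the meaning it got in the first
one, because the \type {\gdef} was demoted to a local assignment.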

\stopsubsection

\startsubsection[title=Changes from \ETEX\ 2.2]

\topicindex {\ETEX}

Because \ETEX\ is the de facto standard extension, we of course provide its
functionality, but with a few small adaptations.

\startitemize

\startitem
    The \ETEX\ functionality is always present and enabled so the prepended
    asterisk or \type {-etex} switch for \INITEX\ is not needed.
\stopitem

\startitem
    The \TEXXET\ extension is not present, so the primitives \type
    {\TeXXeTstate}, \type {\beginR}, \type {\beginL}, \type {\endR} and \type
    {\endL} are missing. Instead we used the \OMEGA/\ALEPH\ approach to
    directionality as starting point.
\stopitem

\startitem
    Some of the tracing information that is output by \ETEX's \prm
    {tracingassigns} and \prm {tracingrestores} is not there.
\stopitem

\startitem
    Register management in \LUATEX\ uses the \OMEGA/\ALEPH\ model, so the maximum
    value is 65535 and the implementation uses a flat array instead of the mixed
    flat|/|sparse model from \ETEX.
\stopitem

\startitem
    When kpathsea is used to find files, \LUATEX\ uses the \type {ofm} file
    format to search for font metrics. In turn, this means that \LUATEX\ looks at
    the \type {OFMFONTS} configuration variable (like \OMEGA\ and \ALEPH) instead
    of \type {TFMFONTS} (like \TEX\ and \PDFTEX). Likewise for virtual fonts
    (\LUATEX\ uses the variable \type {OVFFONTS} instead of \type {VFFONTS}).
\stopitem

\startitem
    The primitives that report a stretch or shrink order report a value in a
    convenient range from zero up to four. Because some macro packages can break
    on that, we also provide \type {\eTeXgluestretchorder} and \type
    {\eTeXglueshrinkorder} which report values compatible with \ETEX. The (new)
    \type {fi} value is reported as \type {-1} (so when used in an \type
    {\ifcase} test that value makes one end up in the \type {\else}).
\stopitem

\stopitemize
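
The difference between the two sets of order primitives mentioned in the last
item can be inspected as follows. This is a minimal sketch that assumes that the
\type {\eTeX...} prefixed primitives are enabled under these names in the format
being used; the choice of \type {\skip0} and of the glue specification is
arbitrary.

\starttyping
\skip0 = 0pt plus 1fil minus 1fill

\message{\the\gluestretchorder    \skip0} % native numbering (zero up to four)
\message{\the\eTeXgluestretchorder\skip0} % numbering compatible with eTeX

\message{\the\glueshrinkorder     \skip0}
\message{\the\eTeXglueshrinkorder \skip0}
\stoptyping

A macro package that tests these values with \type {\ifcase} is probably best
served by the \type {\eTeX...} variants.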

\stopsubsection

\startsubsection[title=Changes from \PDFTEX\ 1.40]

\topicindex {\PDFTEX}

Because we want to produce \PDF, the most natural starting point was the popular
\PDFTEX\ program. We inherited the stable features, dropped most of the
experimental code and promoted some functionality to core \LUATEX\ functionality,
which in turn triggered the renaming of primitives.

For compatibility reasons we still refer to \type {\pdf...} commands but \LUATEX\
has a different backend interface. Instead of these primitives there are three
interfacing primitives: \lpr {pdfextension}, \lpr {pdfvariable} and \lpr
{pdffeedback}, which take keywords and optional further arguments (below we will
still use the \tex {pdf} prefixed names as reference). This way we can extend the
features when needed but don't need to adapt the core engine. The front- and
backend are decoupled as much as possible.

\startitemize

\startitem
    The (experimental) support for snap nodes has been removed, because it is
    much more natural to build this functionality on top of node processing and
    attributes. The associated primitives that are gone are: \orm
    {pdfsnaprefpoint}, \orm {pdfsnapy}, and \orm {pdfsnapycomp}.
\stopitem

\startitem
    The (experimental) support for specialized spacing around nodes has also been
    removed. The associated primitives that are gone are: \orm
    {pdfadjustinterwordglue}, \orm {pdfprependkern}, and \orm {pdfappendkern}, as
    well as the five supporting primitives \orm {knbscode}, \orm {stbscode}, \orm
    {shbscode}, \orm {knbccode}, and \orm {knaccode}.
\stopitem

\startitem
    A number of \quote {\PDFTEX\ primitives} have been removed as they can be
    implemented using \LUA: \orm {pdfelapsedtime}, \orm {pdfescapehex}, \orm
    {pdfescapename}, \orm {pdfescapestring}, \orm {pdffiledump}, \orm
    {pdffilemoddate}, \orm {pdffilesize}, \orm {pdfforcepagebox}, \orm
    {pdflastmatch}, \orm {pdfmatch}, \orm {pdfmdfivesum}, \orm {pdfmovechars},
    \orm {pdfoptionalwaysusepdfpagebox}, \orm {pdfoptionpdfinclusionerrorlevel},
    \orm {pdfresettimer}, \orm {pdfshellescape}, \orm {pdfstrcmp} and \orm
    {pdfunescapehex}.
\stopitem

\startitem
    The version related primitives \orm {pdftexbanner}, \orm {pdftexversion}
    and \orm {pdftexrevision} are no longer present as there is no longer a
    relationship with \PDFTEX\ development.
\stopitem

\startitem
    The experimental snapper mechanism has been removed and therefore also the
    primitives \orm {pdfignoreddimen}, \orm {pdffirstlineheight}, \orm
    {pdfeachlineheight}, \orm {pdfeachlinedepth} and \orm {pdflastlinedepth}.
\stopitem

\startitem
    The experimental primitives \lpr {primitive}, \lpr {ifprimitive}, \lpr
    {ifabsnum} and \lpr {ifabsdim} are promoted to core primitives. The \type
    {\pdf*} prefixed originals are not available.
\stopitem

\startitem
    Because \LUATEX\ has a different subsystem for managing images, it has
    diverged further from its ancestor in the meantime. We don't adapt to
    changes in \PDFTEX.
\stopitem

\startitem
    Two extra token lists are provided, \orm {pdfxformresources} and \orm
    {pdfxformattr}, as an alternative to \orm {pdfxform} keywords.
\stopitem

\startitem
    Image specifications also support \type {visiblefilename}, \type
    {userpassword} and \type {ownerpassword}. The password options are only
    relevant for encrypted \PDF\ files.
\stopitem

\startitem
    The current version of \LUATEX\ no longer replaces and/or merges fonts in
    embedded \PDF\ files with fonts of the enveloping \PDF\ document. This
    regression may be temporary, depending on how the rewritten font backend will
    look.
\stopitem

\startitem
    The primitives \orm {pdfpagewidth} and \orm {pdfpageheight} have been removed
    because \lpr {pagewidth} and \lpr {pageheight} have that purpose.
\stopitem

\startitem
    The primitives \orm {pdfnormaldeviate}, \orm {pdfuniformdeviate}, \orm
    {pdfsetrandomseed} and \orm {pdfrandomseed} have been promoted to core
    primitives without \type {pdf} prefix so the original commands are no longer
    recognized.
\stopitem

\startitem
    The primitives \lpr {ifincsname}, \lpr {expanded} and \lpr {quitvmode}
    are now core primitives.
\stopitem

\startitem
    As the hz and protrusion mechanisms are part of the core, the related
    primitives \lpr {lpcode}, \lpr {rpcode}, \lpr {efcode}, \lpr
    {leftmarginkern} and \lpr {rightmarginkern} are promoted to core primitives.
    The two commands \lpr {protrudechars} and \lpr {adjustspacing} replace their
    \type {\pdf} prefixed originals.
\stopitem

\startitem
    The hz optimization code has been partially redone so that we no longer need
    to create extra font instances. The front- and backend have been decoupled
    and more efficient (\PDF) code is generated.
\stopitem

\startitem
    When \lpr {adjustspacing} has value~2, hz optimization will be applied to
    glyphs and kerns. When the value is~3, only glyphs will be treated. A value
    smaller than~2 disables this feature. With a value of~1, font expansion is
    applied after \TEX's normal paragraph breaking routines have broken the
    paragraph into lines. In this case, line breaks are identical to standard
    \TEX\ behavior (as with \PDFTEX).
\stopitem

\startitem
    The \lpr {tagcode} primitive is promoted to core primitive.
\stopitem

\startitem
    The \lpr {letterspacefont} feature is now part of the core but will not be
    changed (improved). We just provide it for legacy use.
\stopitem

\startitem
    The \orm {pdfnoligatures} primitive is now \lpr {ignoreligaturesinfont}.
\stopitem

\startitem
    The \orm {pdfcopyfont} primitive is now \lpr {copyfont}.
\stopitem

\startitem
    The \orm {pdffontexpand} primitive is now \lpr {expandglyphsinfont}.
\stopitem

\startitem
    Because position tracking is also available in \DVI\ mode the \lpr {savepos},
    \lpr {lastxpos} and \lpr {lastypos} commands now replace their \type {pdf}
    prefixed originals.
\stopitem

\startitem
    The introspective primitives \type {\pdflastximagecolordepth} and \type
    {\pdfximagebbox} have been removed. One can use external applications to
    determine these properties or use the built-in \type {img} library.
\stopitem

\startitem
    The initializer \orm {pdfoutput} has been replaced by \lpr {outputmode} and
    \orm {pdfdraftmode} is now \lpr {draftmode}.
\stopitem

\startitem
    The pixel multiplier dimension \orm {pdfpxdimen} lost its prefix and is now
    called \lpr {pxdimen}.
\stopitem

\startitem
    An extra \orm {pdfimageaddfilename} option has been added that can be used to
    block writing the filename to the \PDF\ file.
\stopitem

\startitem
    The primitive \orm {pdftracingfonts} is now \lpr {tracingfonts} as it
    doesn't relate to the backend.
\stopitem

\startitem
    The experimental primitive \orm {pdfinsertht} is kept as \lpr {insertht}.
\stopitem

\startitem
    There is some more control over what metadata goes into the \PDF\ file.
\stopitem

\startitem
    The promotion of primitives to core primitives as well as the separation of
    font and backend means that the initialization namespace \type {pdftex} is
    gone.
\stopitem

\stopitemize
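
The renamed expansion and protrusion controls from the list above might be
combined as in the following sketch. The font (\type {cmr10}), the name \type
{\demofont} and all numeric values are assumptions chosen for illustration, and we
assume that \lpr {expandglyphsinfont} takes the same stretch, shrink and step
arguments as its \PDFTEX\ predecessor.

\starttyping
\font\demofont = cmr10 at 10pt

\expandglyphsinfont \demofont 20 20 5 % stretch, shrink and step
\rpcode \demofont `\. = 700           % let a period protrude to the right

\adjustspacing 2 % hz optimization for glyphs and kerns
\protrudechars 2 % enable protrusion

\demofont Some text that is long enough to be broken into a few lines.
\stoptyping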

One change involves the so-called xforms and ximages. In \PDFTEX\ these are
implemented as so-called whatsits. But contrary to other whatsits they have
dimensions that need to be taken into account when, for instance, calculating
optimal line breaks. In \LUATEX\ these are now promoted to a special type of rule
nodes, which simplifies code that needs those dimensions.

Another reason for promotion is that these are useful concepts. Backends can
provide the ability to use content that has been rendered in several places, and
images are also common. As already mentioned in \in {section}
[sec:imagedandforms], we now have:

\starttabulate[|l|l|]
\DB \LUATEX                          \BC \PDFTEX                   \NC \NR
\TB
\NC \lpr {saveboxresource}             \NC \orm {pdfxform}           \NC \NR
\NC \lpr {saveimageresource}           \NC \orm {pdfximage}          \NC \NR
\NC \lpr {useboxresource}              \NC \orm {pdfrefxform}        \NC \NR
\NC \lpr {useimageresource}            \NC \orm {pdfrefximage}       \NC \NR
\NC \lpr {lastsavedboxresourceindex}   \NC \orm {pdflastxform}       \NC \NR
\NC \lpr {lastsavedimageresourceindex} \NC \orm {pdflastximage}      \NC \NR
\NC \lpr {lastsavedimageresourcepages} \NC \orm {pdflastximagepages} \NC \NR
\LL
\stoptabulate

There are a few \lpr {pdffeedback} features that relate to this but these are
typical backend specific ones. The index that gets returned is to be considered
as \quote {just a number} and although it still has the same meaning (object
related) as before, you should not depend on that.
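
As an illustration of the box resource primitives from the table, the following
sketch saves a box once and uses it twice. The macro name \type {\MyResource} and
the box content are made up; the index is stored and passed on without looking at
its value.

\starttyping
\setbox0\hbox{Reusable content}

\saveboxresource 0
\edef\MyResource{\the\lastsavedboxresourceindex}

\hbox{\useboxresource \MyResource\relax}
\hbox{\useboxresource \MyResource\relax}
\stoptyping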

The protrusion detection mechanism has been enhanced somewhat to handle slightly
more complex situations. When protrusion characters are identified some nodes are
skipped:

\startitemize[packed,columns,two]
\startitem zero glue \stopitem
\startitem penalties \stopitem
\startitem empty discretionaries \stopitem
\startitem normal zero kerns \stopitem
\startitem rules with zero dimensions \stopitem
\startitem math nodes with a surround of zero \stopitem
\startitem dir nodes \stopitem
\startitem empty horizontal lists \stopitem
\startitem local par nodes \stopitem
\startitem inserts, marks and adjusts \stopitem
\startitem boundaries \stopitem
\startitem whatsits \stopitem
\stopitemize

Because this may not be enough, you can also use a protrusion boundary node to
have a neighbouring node ignored. When the value is~1 or~3, the next node will be
ignored in the test when locating a left boundary condition. When the value is~2
or~3, the previous node will be ignored when locating a right boundary condition
(the search goes from right to left). This permits protrusion combined with, for
instance, content moved into the margin:

\starttyping
\protrusionboundary1\llap{!\quad}«Who needs protrusion?»
\stoptyping

\stopsubsection

\startsubsection[title=Changes from \ALEPH\ RC4]

\topicindex {\ALEPH}

Because we wanted proper directional typesetting the \ALEPH\ mechanisms looked
most attractive. These are rather close to the ones provided by \OMEGA, so what
we say next applies to both these programs.

\startitemize

\startitem
    The extended 16~bit math primitives (\orm {omathcode} etc.) have been
    removed.
\stopitem

\startitem
    The \OCP\ processing has been removed completely and as a consequence, the
    following primitives have been removed: \orm {ocp}, \orm {externalocp}, \orm
    {ocplist}, \orm {pushocplist}, \orm {popocplist}, \orm {clearocplists}, \orm
    {addbeforeocplist}, \orm {addafterocplist}, \orm {removebeforeocplist}, \orm
    {removeafterocplist} and \orm {ocptracelevel}.
\stopitem

\startitem
    \LUATEX\ only understands 4~of the 16~direction specifiers of \ALEPH: \type
    {TLT} (latin), \type {TRT} (arabic), \type {RTT} (cjk), \type {LTL} (mongolian).
    All other direction specifiers generate an error. In addition to a keyword
    driven model we also provide an integer driven one.
\stopitem

\startitem
    The input translations from \ALEPH\ are not implemented, so the related
    primitives are not available: \orm {DefaultInputMode}, \orm
    {noDefaultInputMode}, \orm {noInputMode}, \orm {InputMode}, \orm
    {DefaultOutputMode}, \orm {noDefaultOutputMode}, \orm {noOutputMode}, \orm
    {OutputMode}, \orm {DefaultInputTranslation}, \orm
    {noDefaultInputTranslation}, \orm {noInputTranslation}, \orm
    {InputTranslation}, \orm {DefaultOutputTranslation}, \orm
    {noDefaultOutputTranslation}, \orm {noOutputTranslation} and \orm
    {OutputTranslation}.
\stopitem

\startitem
    Several bugs have been fixed and confusing implementation details have been
    sorted out.
\stopitem

\startitem
    The scanner for direction specifications now allows an optional space after
    the direction is completely parsed.
\stopitem

\startitem
    The \type {^^} notation has been extended: after \type {^^^^} four
    hexadecimal characters are expected and after \type {^^^^^^} six hexadecimal
    characters have to be given. The original \TEX\ interpretation is still valid
    for the \type {^^} case but the four and six variants do no backtracking,
    i.e.\ when they are not followed by the right number of hexadecimal digits
    they issue an error message. Because \type{^^^} is a normal \TEX\ case, we
    don't support the odd number of \type {^^^^^} either.
\stopitem

\startitem
    Glues {\it immediately after} direction change commands are not legal
    breakpoints.
\stopitem

\startitem
    Several mechanisms that need to be right-to-left aware have been
    improved, for instance the placement of formula numbers.
\stopitem

\startitem
    The page dimension related primitives \lpr {pagewidth} and \lpr {pageheight}
    have been promoted to core primitives. The \prm {hoffset} and \prm {voffset}
    primitives have been fixed.
\stopitem

\startitem
    The primitives \type {\charwd}, \type {\charht}, \type {\chardp} and \type
    {\charit} have been removed as we have the \ETEX\ variants \type
    {\fontchar*}.
\stopitem

\startitem
    The two dimension registers \lpr {pagerightoffset} and \lpr
    {pagebottomoffset} are now core primitives.
\stopitem

\startitem
    The direction related primitives \lpr {pagedir}, \lpr {bodydir}, \lpr
    {pardir}, \lpr {textdir}, \lpr {mathdir} and \lpr {boxdir} are now core
    primitives.
\stopitem

\startitem
    The promotion of primitives to core primitives as well as the removal of all
    others means that the initialization namespace \type {aleph} that early
    versions of \LUATEX\ provided is gone.
\stopitem

\stopitemize
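
The extended caret notation mentioned in the list above can be summarized with
three input fragments. This is a sketch: the code points are arbitrary examples,
and we assume that, as in traditional \TEX, the hexadecimal digits have to be in
lowercase.

\starttyping
^^41         % two hexadecimal digits: the traditional notation
^^^^0041     % four hexadecimal digits: the same character
^^^^^^01d49c % six hexadecimal digits: a character outside the 16 bit range
\stoptyping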

The above can be summarized as: we took the 32 bit aspects and much of the
directional mechanisms and merged them into the \PDFTEX\ code base as a starting
point for further development. Then we simplified directionality, fixed it and
opened it up.
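
As a small illustration of the simplified directional model, the following sketch
uses the keyword driven interface with two of the four supported specifiers. The
text itself is just a placeholder.

\starttyping
\pardir TRT \textdir TRT % a right-to-left paragraph
some right-to-left text {\textdir TLT with a left-to-right fragment} and more

\pardir TLT \textdir TLT % back to left-to-right
some left-to-right text
\stoptyping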

\stopsubsection

\startsubsection[title=Changes from anywhere]

The \type {\partokenname} and \type {\partokencontext} primitives are taken from
the \PDFTEX\ change file posted on the implementers list. They are explained in
the \PDFTEX\ manual and are classified as \ETEX\ extensions.

\stopsubsection

\startsubsection[title=Changes from standard \WEBC]

\topicindex {\WEBC}

The compilation framework is \WEBC\ and we keep using that but without the
\PASCAL\ to \CCODE\ step. This framework also provides some common features that
deal with reading bytes from files and locating files in \TDS. This is what we do
differently:

\startitemize

\startitem
    There is no mltex support.
\stopitem

\startitem
    There is no enctex support.
\stopitem

\startitem
    The following encoding related command line switches are silently ignored,
    even in non-\LUA\ mode: \type {-8bit}, \type {-translate-file}, \type
    {-mltex}, \type {-enc} and \type {-etex}.
\stopitem

\startitem
    The \prm {openout} whatsits are not written to the log file.
\stopitem

\startitem
    Some of the so-called \WEBC\ extensions are hard to set up in non-\KPSE\
    mode because \type {texmf.cnf} is not read: \type {shellescape} is off (but
    that is not a problem because of \LUA's \type {os.execute}), and the paranoia
    checks on \type {openin} and \type {openout} do not happen. However, it is
    easy for a \LUA\ script to do this itself by overloading \type {io.open} and
    the like.
\stopitem

\startitem
    The \quote{E} option does not do anything useful.
\stopitem

\stopitemize

\stopsubsection

\stopsection

\startsection[reference=backendprimitives,title=The backend primitives]

\startsubsection[title={Fewer primitives}]

\topicindex {backend}
\topicindex {\PDF+backend}

In a previous section we mentioned that some \PDFTEX\ primitives were removed and
others promoted to core \LUATEX\ primitives. That is only part of the story. In
order to separate the backend specific primitives in the code these commands are
now replaced by only a few. In traditional \TEX\ we only had the \DVI\ backend
but now we have two: \DVI\ and \PDF. Additional functionality is implemented as
\quote {extensions} in \TEX\ speak. By separating more strictly we are able to
keep the core (frontend) clean and stable and isolate these extensions. If for
some reason an extra backend option is needed, it can be implemented without
touching the core. The three \PDF\ backend related primitives are:

\starttyping
\pdfextension command [specification]
\pdfvariable  name
\pdffeedback  name
\stoptyping

An extension triggers further parsing, depending on the command given. A variable
is a (kind of) register and can be read and written, while a feedback reports
something (as it comes from the backend it's normally a sequence of tokens).
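
To make the distinction between the three primitives concrete, here is a minimal
sketch that uses each of them once. The keywords are taken from the lists in the
next subsection; the values and the literal \PDF\ code are arbitrary and only serve
as illustration.

\starttyping
\pdfvariable decimaldigits 3             % a variable: assign like a register
\message{\the\pdfvariable decimaldigits} % and read it back

\pdfextension literal {q 0.5 g Q}        % an extension: parsed by the backend

\message{\pdffeedback creationdate}      % a feedback: expands to tokens
\stoptyping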

\stopsubsection

\startsubsection[title={\lpr{pdfextension}, \lpr {pdfvariable} and \lpr {pdffeedback}},reference=sec:pdfextensions]

In order for \LUATEX\ to be more than just \TEX\ you need to enable primitives. That
has been the case right from the start. If you want the traditional \PDFTEX\
primitives (as far as their functionality is still around) you can now do this:

\starttyping
\protected\def\pdfliteral        {\pdfextension literal}
\protected\def\pdflateliteral    {\pdfextension lateliteral}
\protected\def\pdfcolorstack     {\pdfextension colorstack}
\protected\def\pdfsetmatrix      {\pdfextension setmatrix}
\protected\def\pdfsave           {\pdfextension save\relax}
\protected\def\pdfrestore        {\pdfextension restore\relax}
\protected\def\pdfobj            {\pdfextension obj }
\protected\def\pdfrefobj         {\pdfextension refobj }
\protected\def\pdfannot          {\pdfextension annot }
\protected\def\pdfstartlink      {\pdfextension startlink }
\protected\def\pdfendlink        {\pdfextension endlink\relax}
\protected\def\pdfoutline        {\pdfextension outline }
\protected\def\pdfdest           {\pdfextension dest }
\protected\def\pdfthread         {\pdfextension thread }
\protected\def\pdfstartthread    {\pdfextension startthread }
\protected\def\pdfendthread      {\pdfextension endthread\relax}
\protected\def\pdfinfo           {\pdfextension info }
\protected\def\pdfcatalog        {\pdfextension catalog }
\protected\def\pdfnames          {\pdfextension names }
\protected\def\pdfincludechars   {\pdfextension includechars }
\protected\def\pdffontattr       {\pdfextension fontattr }
\protected\def\pdfmapfile        {\pdfextension mapfile }
\protected\def\pdfmapline        {\pdfextension mapline }
\protected\def\pdftrailer        {\pdfextension trailer }
\protected\def\pdfglyphtounicode {\pdfextension glyphtounicode }
\protected\def\pdfrunninglinkoff {\pdfextension linkstate 1 }
\protected\def\pdfrunninglinkon  {\pdfextension linkstate 0 }
\stoptyping

The introspective primitives can be defined as:

\starttyping
\def\pdftexversion     {\numexpr\pdffeedback version\relax}
\def\pdftexrevision    {\pdffeedback revision}
\def\pdflastlink       {\numexpr\pdffeedback lastlink\relax}
\def\pdfretval         {\numexpr\pdffeedback retval\relax}
\def\pdflastobj        {\numexpr\pdffeedback lastobj\relax}
\def\pdflastannot      {\numexpr\pdffeedback lastannot\relax}
\def\pdfxformname      {\numexpr\pdffeedback xformname\relax}
\def\pdfcreationdate   {\pdffeedback creationdate}
\def\pdffontname       {\numexpr\pdffeedback fontname\relax}
\def\pdffontobjnum     {\numexpr\pdffeedback fontobjnum\relax}
\def\pdffontsize       {\dimexpr\pdffeedback fontsize\relax}
\def\pdfpageref        {\numexpr\pdffeedback pageref\relax}
\def\pdfcolorstackinit {\pdffeedback colorstackinit}
\stoptyping

The configuration related registers have become:

\starttyping
\edef\pdfcompresslevel        {\pdfvariable compresslevel}
\edef\pdfobjcompresslevel     {\pdfvariable objcompresslevel}
\edef\pdfrecompress           {\pdfvariable recompress}
\edef\pdfdecimaldigits        {\pdfvariable decimaldigits}
\edef\pdfgamma                {\pdfvariable gamma}
\edef\pdfimageresolution      {\pdfvariable imageresolution}
\edef\pdfimageapplygamma      {\pdfvariable imageapplygamma}
\edef\pdfimagegamma           {\pdfvariable imagegamma}
\edef\pdfimagehicolor         {\pdfvariable imagehicolor}
\edef\pdfimageaddfilename     {\pdfvariable imageaddfilename}
\edef\pdfpkresolution         {\pdfvariable pkresolution}
\edef\pdfpkfixeddpi           {\pdfvariable pkfixeddpi}
\edef\pdfinclusioncopyfonts   {\pdfvariable inclusioncopyfonts}
\edef\pdfinclusionerrorlevel  {\pdfvariable inclusionerrorlevel}
\edef\pdfignoreunknownimages  {\pdfvariable ignoreunknownimages}
\edef\pdfgentounicode         {\pdfvariable gentounicode}
\edef\pdfomitcidset           {\pdfvariable omitcidset}
\edef\pdfomitcharset          {\pdfvariable omitcharset}
\edef\pdfomitinfodict         {\pdfvariable omitinfodict}
\edef\pdfomitmediabox         {\pdfvariable omitmediabox}
\edef\pdfomitprocset          {\pdfvariable omitprocset}
\edef\pdfptexprefix           {\pdfvariable ptexprefix}
\edef\pdfpagebox              {\pdfvariable pagebox}
\edef\pdfminorversion         {\pdfvariable minorversion}
\edef\pdfuniqueresname        {\pdfvariable uniqueresname}

\edef\pdfhorigin              {\pdfvariable horigin}
\edef\pdfvorigin              {\pdfvariable vorigin}
\edef\pdflinkmargin           {\pdfvariable linkmargin}
\edef\pdfdestmargin           {\pdfvariable destmargin}
\edef\pdfthreadmargin         {\pdfvariable threadmargin}
\edef\pdfxformmargin          {\pdfvariable xformmargin}

\edef\pdfpagesattr            {\pdfvariable pagesattr}
\edef\pdfpageattr             {\pdfvariable pageattr}
\edef\pdfpageresources        {\pdfvariable pageresources}
\edef\pdfxformattr            {\pdfvariable xformattr}
\edef\pdfxformresources       {\pdfvariable xformresources}
\edef\pdfpkmode               {\pdfvariable pkmode}

\edef\pdfsuppressoptionalinfo {\pdfvariable suppressoptionalinfo }
\edef\pdftrailerid            {\pdfvariable trailerid }
\stoptyping

The variables are internal ones, so they are anonymous. When you ask for the
meaning of a few previously defined ones:

\starttyping
\meaning\pdfhorigin
\meaning\pdfcompresslevel
\meaning\pdfpageattr
\stoptyping

you will get:

\starttyping
macro:->[internal backend dimension]
macro:->[internal backend integer]
macro:->[internal backend tokenlist]
\stoptyping

The \prm {edef} can also be a \prm {def} but it's a bit more efficient to expand
the lookup related register beforehand.

The backend is derived from \PDFTEX\ so the same syntax applies. However, the
\type {outline} command accepts an \type {objnum} followed by a number. No
checking takes place so when this is used it had better be a valid (flushed)
object.

In order to be (more or less) compatible with \PDFTEX\ we also support the option
to suppress some info but we do so via a bitset:

\starttyping
\pdfvariable suppressoptionalinfo \numexpr
        0
    +   1 % PTEX.FullBanner
    +   2 % PTEX.FileName
    +   4 % PTEX.PageNumber
    +   8 % PTEX.InfoDict
    +  16 % Creator
    +  32 % CreationDate
    +  64 % ModDate
    + 128 % Producer
    + 256 % Trapped
    + 512 % ID
\relax
\stoptyping

In addition you can overload the trailer id, but we don't do any checking on
validity, so you have to pass a valid array. The following is like the ones
normally generated by the engine. You even need to include the brackets here!

\starttyping
\pdfvariable trailerid {[
    <FA052949448907805BA83C1E78896398>
    <FA052949448907805BA83C1E78896398>
]}
\stoptyping

Although we started from a merge of \PDFTEX\ and \ALEPH, by now the code base as
well as the functionality has diverged from those parents. Here we show the options
that can be passed to the extensions. The \type {shipout} option is a compatibility
feature. Instead one can use the \type {deferred} prefix.

\starttexsyntax
\pdfextension literal
    [shipout] [ direct | page | raw ] { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension dest
    num <integer> | name { tokens }!crlf
    [ fitbh | fitbv | fitb | fith | fitv | fit |
      fitr <rule spec> | xyz [ zoom <integer> ] ]
\stoptexsyntax

\starttexsyntax
\pdfextension annot
    reserveobjnum | useobjnum <integer>
    { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension save
\stoptexsyntax

\starttexsyntax
\pdfextension restore
\stoptexsyntax

\starttexsyntax
\pdfextension setmatrix
    { tokens }
\stoptexsyntax

\starttexsyntax
[ \immediate ] \pdfextension obj
    reserveobjnum
\stoptexsyntax

\starttexsyntax
[ \immediate ] \pdfextension obj
    [ useobjnum <integer> ]
    [ uncompressed ]
    [ stream [ attr { tokens } ] ]
    [ file ]
    { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension refobj
    <integer>
\stoptexsyntax

\starttexsyntax
\pdfextension colorstack
    <integer>
    set { tokens } | push { tokens } | pop | current
\stoptexsyntax

\starttexsyntax
\pdfextension startlink
    [ attr { tokens } ]
    user { tokens } | goto | thread
    [ file { tokens } ]
    [ page <integer> { tokens } | name { tokens } | num <integer> ]
    [ newwindow | nonewwindow ]
\stoptexsyntax

\starttexsyntax
\pdfextension endlink
\stoptexsyntax

\starttexsyntax
\pdfextension startthread
    num <integer> | name { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension endthread
\stoptexsyntax

\starttexsyntax
\pdfextension thread
    num <integer> | name { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension outline
    [ attr { tokens } ]
    [ useobjnum <integer> ]
    [ count <integer> ]
    { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension glyphtounicode
    { tokens }
    { tokens }
\stoptexsyntax

\starttexsyntax
\pdfextension catalog
    { tokens }
    [ openaction
        user { tokens } | goto | thread
        [ file { tokens } ]
        [ page <integer> { tokens } | name { tokens } | num <integer> ]
        [ newwindow | nonewwindow ] ]
\stoptexsyntax
871
872\starttexsyntax
873\pdfextension fontattr
874 <integer>
875 {tokens}
876\stoptexsyntax
877
878\starttexsyntax
879\pdfextension mapfile
880 {tokens}
881\stoptexsyntax
882
883\starttexsyntax
884\pdfextension mapline
885 {tokens}
886\stoptexsyntax
887
888\starttexsyntax
889\pdfextension includechars
890 {tokens}
891\stoptexsyntax
892
893\starttexsyntax
894\pdfextension info
895 {tokens}
896\stoptexsyntax
897
898\starttexsyntax
899\pdfextension names
900 {tokens}
901\stoptexsyntax
902
903\starttexsyntax
904\pdfextension trailer
905 {tokens}
906\stoptexsyntax
907
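As an illustration of how these extensions are typically invoked, the following
sketch combines a few of them. The object contents and the macro names \type
{\LastObject} and \type {\Reserved} are made up for this example, and we assume
that \type {\pdffeedback lastobj} reports the number of the most recently created
or reserved object, as \type {\pdflastobj} did in \PDFTEX.

\starttyping
% raw operators in page space: save state, set a red stroke color, restore
\pdfextension literal page {q 1 0 0 RG Q}

% write an object right away and remember its number
\immediate\pdfextension obj {<< /Type /Example >>}
\edef\LastObject{\the\numexpr\pdffeedback lastobj\relax}

% reserve a number first, fill in the object later
\pdfextension obj reserveobjnum
\edef\Reserved{\the\numexpr\pdffeedback lastobj\relax}
\immediate\pdfextension obj useobjnum \Reserved\space {<< /Type /Other >>}

% additions to the document level dictionaries
\pdfextension catalog {/PageMode /UseOutlines}
\pdfextension info {/Title (An example)}
\stoptyping

A macro package will normally wrap such calls, so the above is only meant to show
how the syntax diagrams translate into input.
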
908\stopsubsection
909
910\startsubsection[title={Defaults}]
911
912The engine sets the following defaults.
913
914\starttyping
915\pdfcompresslevel 9
916\pdfobjcompresslevel 1
917\pdfrecompress 0
918\pdfdecimaldigits 4
919\pdfgamma 1000
920\pdfimageresolution 71
921\pdfimageapplygamma 0
922\pdfimagegamma 2200
923\pdfimagehicolor 1
924\pdfimageaddfilename 1
925\pdfpkresolution 72
926\pdfpkfixeddpi 0
927\pdfinclusioncopyfonts 0
928\pdfinclusionerrorlevel 0
929\pdfignoreunknownimages 0
930\pdfgentounicode 0
931\pdfomitcidset 0
932\pdfomitcharset 0
933\pdfomitinfodict 0
934\pdfomitmediabox 0
935\pdfomitprocset 0
936\pdfptexprefix 0
937\pdfpagebox 0
938\pdfminorversion 4
939\pdfuniqueresname 0
940
941\pdfhorigin 1in
942\pdfvorigin 1in
943\pdflinkmargin 0pt
944\pdfdestmargin 0pt
945\pdfthreadmargin 0pt
946\pdfxformmargin 0pt
947\stoptyping
948
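The names in this list are the \PDFTEX\ compatible ones. A minimal sketch of
changing such defaults through \type {\pdfvariable}, assuming a plain \LUATEX\
run where no macro package manages these settings and where nothing has been
written to the file yet; the chosen values are just an example that makes the
\PDF\ file easier to inspect in an editor:

\starttyping
% turn off stream and object compression
\pdfvariable compresslevel    = 0
\pdfvariable objcompresslevel = 0
\stoptyping
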
949\stopsubsection
950
951\startsubsection[title={Backward compatibility}]
952
953If you also want some backward compatibility, you can add:
954
955\starttyping
956\let\pdfpagewidth \pagewidth
957\let\pdfpageheight \pageheight
958
959\let\pdfadjustspacing \adjustspacing
960\let\pdfprotrudechars \protrudechars
961\let\pdfnoligatures \ignoreligaturesinfont
962\let\pdffontexpand \expandglyphsinfont
963\let\pdfcopyfont \copyfont
964
965\let\pdfxform \saveboxresource
966\let\pdflastxform \lastsavedboxresourceindex
967\let\pdfrefxform \useboxresource
968
969\let\pdfximage \saveimageresource
970\let\pdflastximage \lastsavedimageresourceindex
971\let\pdflastximagepages\lastsavedimageresourcepages
972\let\pdfrefximage \useimageresource
973
974\let\pdfsavepos \savepos
975\let\pdflastxpos \lastxpos
976\let\pdflastypos \lastypos
977
978\let\pdfoutput \outputmode
979\let\pdfdraftmode \draftmode
980
981\let\pdfpxdimen \pxdimen
982
983\let\pdfinsertht \insertht
984
985\let\pdfnormaldeviate \normaldeviate
986\let\pdfuniformdeviate \uniformdeviate
987\let\pdfsetrandomseed \setrandomseed
988\let\pdfrandomseed \randomseed
989
990\let\pdfprimitive \primitive
991\let\ifpdfprimitive \ifprimitive
992
993\let\ifpdfabsnum \ifabsnum
994\let\ifpdfabsdim \ifabsdim
995\stoptyping
996
997And even:
998
999\starttyping
1000\newdimen\pdfeachlineheight
1001\newdimen\pdfeachlinedepth
1002\newdimen\pdflastlinedepth
1003\newdimen\pdffirstlineheight
1004\newdimen\pdfignoreddimen
1005\stoptyping
1006
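When such aliases end up in a file that is also loaded by other engines or by
formats that already provide the old names, one can guard them. This is only a
sketch of that idea, shown for two of the names; it relies on nothing more than
the \ETEX\ primitive \type {\ifdefined}:

\starttyping
% only create the alias when the old name is not yet known
\ifdefined\pdfoutput \else
    \let\pdfoutput\outputmode
\fi

\ifdefined\pdfsavepos \else
    \let\pdfsavepos\savepos
\fi
\stoptyping
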
1007\stopsubsection
1008
1009\stopsection
1010
1011\startsection[title=Directions]
1012
1013\topicindex {\OMEGA}
1014\topicindex {\ALEPH}
1015\topicindex {directions}
1016
1017\startsubsection[title={Four directions}]
1018
The directional model in \LUATEX\ is inherited from \OMEGA|/|\ALEPH\ but we tried
to improve it a bit. At some point we played with recovery of modes but that was
disabled later on when we found that it interfered with nested directions. That
in itself had the side effect that the node list was no longer balanced with
respect to directional nodes, which in turn can give side effects when a series
of dir changes happens without grouping.
1025
When extending the \PDF\ backend to support directions some inconsistencies were
found and as a result we decided to support only the four models that make sense:
\type {TLT} (latin), \type {TRT} (arabic), \type {RTT} (cjk) and \type {LTL}
(mongolian).
1030
1031\stopsubsection
1032
1033\startsubsection[title={How it works}]
1034
The approach is that we again make the list balanced but try to avoid some side
effects. What happens is quite intuitive if we forget about spaces (turned into
glue) but even there what happens makes sense if you look at it in detail.
However, that logic makes in|-|group switching kind of useless when no proper
nested grouping is used: switching from right to left several times nested
results in spacing ending up after each other due to nested mirroring. Of course
a sane macro package will manage this for the user but here we are discussing the
low level dir injection.
1043
1044This is what happens:
1045
1046\starttyping
1047\textdir TRT nur {\textdir TLT run \textdir TRT NUR} nur
1048\stoptyping
1049
1050This becomes stepwise:
1051
1052\startnarrower
1053\starttyping
1054injected: [TRT]nur {[TLT]run [TRT]NUR} nur
1055balanced: [TRT]nur {[TLT]run [TLT][TRT]NUR[TRT]} nur[TRT]
1056result : run {RUNrun } run
1057\stoptyping
1058\stopnarrower
1059
1060And this:
1061
1062\starttyping
1063\textdir TRT nur {nur \textdir TLT run \textdir TRT NUR} nur
1064\stoptyping
1065
1066becomes:
1067
1068\startnarrower
1069\starttyping
1070injected: [TRT]nur {nur [TLT]run [TRT]NUR} nur
1071balanced: [TRT]nur {nur [TLT]run [TLT][TRT]NUR[TRT]} nur[TRT]
1072result : run {run RUNrun } run
1073\stoptyping
1074\stopnarrower
1075
1076Now, in the following examples watch where we put the braces:
1077
1078\startbuffer
1079\textdir TRT nur {{\textdir TLT run} {\textdir TRT NUR}} nur
1080\stopbuffer
1081
1082\typebuffer
1083
1084This becomes:
1085
1086\startnarrower
1087\getbuffer
1088\stopnarrower
1089
1090Compare this to:
1091
1092\startbuffer
1093\textdir TRT nur {{\textdir TLT run }{\textdir TRT NUR}} nur
1094\stopbuffer
1095
1096\typebuffer
1097
1098Which renders as:
1099
1100\startnarrower
1101\getbuffer
1102\stopnarrower
1103
1104So how do we deal with the next?
1105
1106\startbuffer
1107\def\ltr{\textdir TLT\relax}
1108\def\rtl{\textdir TRT\relax}
1109
1110run {\rtl nur {\ltr run \rtl NUR \ltr run \rtl NUR} nur}
1111run {\ltr run {\rtl nur \ltr RUN \rtl nur \ltr RUN} run}
1112\stopbuffer
1113
1114\typebuffer
1115
1116It gets typeset as:
1117
1118\startnarrower
1119\startlines
1120\getbuffer
1121\stoplines
1122\stopnarrower
1123
We could define the two helpers to look back, pick up a skip, remove it and
inject it after the dir node. But that way we lose the subtype information that
for some applications can be handy to keep as|-|is. This is why we now have a
variant of \lpr {textdir} which injects the balanced node before the skip.
Instead of the previous definition we can use:
1129
1130\startbuffer[def]
1131\def\ltr{\linedir TLT\relax}
1132\def\rtl{\linedir TRT\relax}
1133\stopbuffer
1134
1135\typebuffer[def]
1136
1137and this time:
1138
1139\startbuffer[txt]
1140run {\rtl nur {\ltr run \rtl NUR \ltr run \rtl NUR} nur}
1141run {\ltr run {\rtl nur \ltr RUN \rtl nur \ltr RUN} run}
1142\stopbuffer
1143
1144\typebuffer[txt]
1145
1146comes out as a properly spaced:
1147
1148\startnarrower
1149\startlines
1150\getbuffer[def,txt]
1151\stoplines
1152\stopnarrower
1153
Anything more complex than this, like combinations of skips and penalties, or
kerns, should be handled in the input or macro package because there is no way we
can predict the expected behaviour. In fact, \lpr {linedir} is just a
convenience extra which could also have been implemented using node list parsing.
1158
Directions are complicated by the fact that they often need to work over groups
so a separate grouping related stack is used. A side effect is that there can be
paragraphs with only a local par node followed by direction synchronization
nodes. Paragraphs like that are seen as empty paragraphs and therefore ignored.
Because \type {\noindent} doesn't inject anything but \type {\indent} injects
a box, paragraphs with only an indent and directions are handled as paragraphs
with content.
1166
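The difference between the two cases can be sketched as follows. This is our
reading of the rule just described, not a guaranteed outcome for every macro
package, since a format may inject additional nodes at the start of a paragraph:

\starttyping
% only a local par node plus direction nodes: seen as empty and ignored
\noindent \textdir TRT\relax \par

% the indent box counts as content, so this paragraph is kept
\indent \textdir TRT\relax \par
\stoptyping
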
1167\stopsubsection
1168
1169\startsubsection[title={Controlling glue with \lpr {breakafterdirmode}}]
1170
Glue after a dir node is ignored in the linebreak decision but you can bypass that
by setting \lpr {breakafterdirmode} to~\type {1}. The following table shows the
difference. Watch your spaces.
1174
1175\def\ShowSome#1{
1176 \BC \type{#1}
1177 \NC \breakafterdirmode\zerocount\hsize\zeropoint#1
1178 \NC
1179 \NC \breakafterdirmode\plusone\hsize\zeropoint#1
1180 \NC
1181 \NC \NR
1182}
1183
\starttabulate[|l|Tp(1pt)|w(5em)|Tp(1pt)|w(5em)|]
1185 \DB
1186 \BC \type{0}
1187 \NC
1188 \BC \type{1}
1189 \NC
1190 \NC \NR
1191 \TB
1192 \ShowSome{pre {\textdir TLT xxx} post}
1193 \ShowSome{pre {\textdir TLT xxx }post}
1194 \ShowSome{pre{ \textdir TLT xxx} post}
1195 \ShowSome{pre{ \textdir TLT xxx }post}
1196 \ShowSome{pre { \textdir TLT xxx } post}
1197 \ShowSome{pre {\textdir TLT\relax\space xxx} post}
1198 \LL
1199\stoptabulate
1200
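In a document one would normally keep such a setting local. A minimal sketch,
assuming that the parameter has to be in effect at the moment the paragraph is
broken into lines, which is why the \type {\par} sits inside the group:

\starttyping
\begingroup
    % allow a break at the glue that follows the dir node
    \breakafterdirmode 1
    pre {\textdir TLT\relax\space xxx} post \par
\endgroup
\stoptyping
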
1201\stopsubsection
1202
\startsubsection[title={Controlling parshapes with \lpr {shapemode}}]
1204
1205Another adaptation to the \ALEPH\ directional model is control over shapes driven
1206by \prm {hangindent} and \prm {parshape}. This is controlled by a new parameter
1207\lpr {shapemode}:
1208
\starttabulate[|c|l|l|]
1210\DB value \BC \prm {hangindent} \BC \prm {parshape} \NC \NR
1211\TB
1212\BC \type{0} \NC normal \NC normal \NC \NR
1213\BC \type{1} \NC mirrored \NC normal \NC \NR
1214\BC \type{2} \NC normal \NC mirrored \NC \NR
1215\BC \type{3} \NC mirrored \NC mirrored \NC \NR
1216\LL
1217\stoptabulate
1218
1219The value is reset to zero (like \prm {hangindent} and \prm {parshape})
1220after the paragraph is done with. You can use negative values to prevent
1221this. In \in {figure} [fig:shapemode] a few examples are given.
1222
1223\startplacefigure[reference=fig:shapemode,title={The effect of \type {shapemode}.}]
1224 \startcombination[2*3]
1225 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1226 \hsize .45\textwidth \switchtobodyfont[6pt]
1227 \pardir TLT \textdir TLT
1228 \hangindent 40pt \hangafter 3
1229 \leftskip10pt \input tufte \par
1230 \egroup} {TLT: hangindent}
1231 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1232 \hsize .45\textwidth \switchtobodyfont[6pt]
1233 \pardir TLT \textdir TLT
1234 \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1235 \input tufte \par
1236 \egroup} {TLT: parshape}
1237 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1238 \hsize .45\textwidth \switchtobodyfont[6pt]
1239 \pardir TRT \textdir TRT
1240 \hangindent 40pt \hangafter 3
1241 \leftskip10pt \input tufte \par
1242 \egroup} {TRT: hangindent mode 0}
1243 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1244 \hsize .45\textwidth \switchtobodyfont[6pt]
1245 \pardir TRT \textdir TRT
1246 \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1247 \input tufte \par
1248 \egroup} {TRT: parshape mode 0}
1249 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1250 \hsize .45\textwidth \switchtobodyfont[6pt]
1251 \shapemode=3
1252 \pardir TRT \textdir TRT
1253 \hangindent 40pt \hangafter 3
1254 \leftskip10pt \input tufte \par
        \egroup} {TRT: hangindent mode 1 and 3}
1256 {\ruledvbox \bgroup \setuptolerance[verytolerant]
1257 \hsize .45\textwidth \switchtobodyfont[6pt]
1258 \shapemode=3
1259 \pardir TRT \textdir TRT
1260 \parshape 4 0pt .8\hsize 10pt .8\hsize 20pt .8\hsize 0pt \hsize
1261 \input tufte \par
        \egroup} {TRT: parshape mode 2 and 3}
1263 \stopcombination
1264\stopplacefigure
1265
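Outside a figure the usage boils down to setting the parameter before the
paragraph ends. The sketch below uses the \CONTEXT\ sample file \type {tufte} as
filler; the last line rests on our assumption that a negative value selects the
same mirroring as its absolute value but survives the end of the paragraph:

\starttyping
% mirror only the hanging indentation in a right to left paragraph
\shapemode 1
\pardir TRT \textdir TRT
\hangindent 40pt \hangafter 3 \relax
\input tufte \par

% assumption: like 3 (mirror both), but not reset after the paragraph
\shapemode -3
\stoptyping
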
1266\stopsubsection
1267
1268\startsubsection[title={Symbols or numbers}]
1269
Internally the implementation is different from \ALEPH. First of all we use no
whatsits but dedicated nodes, but also we have only 4 directions that are mapped
onto 4 numbers. A text direction node can mark the start or end of a sequence of
nodes, and therefore has two states. At the \TEX\ end we don't see these states
because \TEX\ itself will add proper end state nodes if needed.
1275
1276The symbolic names \type {TLT}, \type {TRT}, etc.\ originate in \OMEGA. In
1277\LUATEX\ we also have a number based model which sometimes makes more sense.
1278
\starttabulate[|c|l|]
1280\DB value \BC equivalent \NC \NR
1281\TB
1282\BC \type {0} \NC TLT \NC \NR
1283\BC \type {1} \NC TRT \NC \NR
1284\BC \type {2} \NC LTL \NC \NR
1285\BC \type {3} \NC RTT \NC \NR
1286\LL
1287\stoptabulate
1288
We support the \OMEGA\ primitives \orm {textdir}, \orm {pardir}, \orm {pagedir},
\orm {bodydir} and \orm {mathdir}. These accept three character keywords. The
primitives that set the direction by number are: \lpr {textdirection}, \lpr
{pardirection}, \lpr {pagedirection}, \lpr {bodydirection} and \lpr
{mathdirection}. When specifying a direction for a box you can use \type {bdir}
instead of \type {dir}.
1295
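Using the mapping from the table, the two interfaces can be put side by side.
This is a sketch that assumes \type {dir} takes a keyword and \type {bdir} a
number, in line with the primitives mentioned above:

\starttyping
% keyword form and number form of the same setting (1 corresponds to TRT)
\textdir TRT
\textdirection 1

\pardir TLT
\pardirection 0

% boxes: dir with a keyword, bdir with a number
\hbox dir TRT {text}
\hbox bdir 1  {text}
\stoptyping
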
1296\stopsubsection
1297
1298\stopsection
1299
1300\startsection[title=Implementation notes]
1301
1302\startsubsection[title=Memory allocation]
1303
1304\topicindex {memory}
1305
1306The single internal memory heap that traditional \TEX\ used for tokens and nodes
1307is split into two separate arrays. Each of these will grow dynamically when
1308needed.
1309
The \type {texmf.cnf} settings related to main memory are no longer used (these
are: \type {main_memory}, \type {mem_bot}, \type {extra_mem_top} and \type
{extra_mem_bot}). \quote {Out of main memory} errors can still occur, but the
limiting factor is now the amount of RAM in your system, not a predefined limit.
1314
Also, the memory (de)allocation routines for nodes are completely rewritten. The
relevant code now lives in the C file \type {texnode.c}, and basically uses a
dozen or so \quote {avail} lists instead of a doubly|-|linked model. An extra
function layer is added so that the code can ask for nodes by type instead of
directly requisitioning a certain amount of memory words.
1320
Because of the split into two arrays and the resulting differences in the data
structures, some of the macros have been duplicated. For instance, there are now
\type {vlink} and \type {vinfo} as well as \type {token_link} and \type
{token_info}. All access to the variable memory array is now hidden behind a
macro called \type {vmem}. We mention this because using the \TEX book as
reference is still quite valid but not for memory related details. Another
significant detail is that we have doubly linked node lists and that most nodes
carry more data.
1329
The input line buffer and pool size are now also reallocated when needed, and the
\type {texmf.cnf} settings \type {buf_size} and \type {pool_size} are silently
ignored.
1333
1334\stopsubsection
1335
1336\startsubsection[title=Sparse arrays]
1337
The \prm {mathcode}, \prm {delcode}, \prm {catcode}, \prm {sfcode}, \prm {lccode}
and \prm {uccode} (and the new \lpr {hjcode}) tables are now sparse arrays that
are implemented in~\CCODE. They are no longer part of the \TEX\ \quote
{equivalence table} and because each had 1.1 million entries with a few memory
words each, this makes a major difference in memory usage. Performance is not
really hurt by this.
1344
The \prm {catcode}, \prm {sfcode}, \prm {lccode}, \prm {uccode} and \lpr {hjcode}
assignments don't show up when using the \ETEX\ tracing routines \prm
{tracingassigns} and \prm {tracingrestores} but we don't see that as a real
limitation.
1349
A side|-|effect of the current implementation is that \prm {global} assignments
are now more expensive in terms of processing than non|-|global ones, but not
many users will notice that.
1353
The glyph ids within a font are also managed by means of a sparse array as glyph
ids can go up to index $2^{21}-1$ but these are never accessed directly so again
users will not notice this.
1357
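To put the numbers in this section in perspective: the 1.1 million entries
presumably reflect the Unicode range of character codes, which runs from \type
{0} up to \type {0x10FFFF}. The arithmetic below is ours and only meant as a
sanity check of that figure and of the upper bound on glyph indices:

\startformula
17 \times 2^{16} = 1\,114\,112
\qquad
2^{21} - 1 = 2\,097\,151
\stopformula
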
1358\stopsubsection
1359
\startsubsection[title=Simple single|-|character csnames]
1361
1362\topicindex {csnames}
1363
Single|-|character commands are no longer treated specially in the internals;
they are stored in the hash just like the multi|-|letter csnames.
1366
1367The code that displays control sequences explicitly checks if the length is one
1368when it has to decide whether or not to add a trailing space.
1369
Active characters are internally implemented as a special type of multi|-|letter
control sequence that uses a prefix that is otherwise impossible to obtain.
1372
1373\stopsubsection
1374
1375\startsubsection[title=The compressed format file]
1376
1377\topicindex {format}
1378
The format is passed through \type {zlib}, allowing it to shrink to roughly half
of the size it would have had in uncompressed form. This takes a few more \CPU\
cycles but much less disk \IO, so it should still be faster. We use level~3
compression which we found to be the optimal trade|-|off between file size and
decompression speed.
1384
1385\stopsubsection
1386
1387\startsubsection[title=Binary file reading]
1388
\topicindex {files+binary}
1390
All of the internal code is changed in such a way that if one of the \type
{read_xxx_file} callbacks is not set, then the file is read by a \CCODE\ function
using basically the same convention as the callback: a single read into a buffer
big enough to hold the entire file contents. While this uses more memory than the
previous code (that mostly used \type {getc} calls), it can be quite a bit faster
(depending on your \IO\ subsystem).
1397
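The convention mentioned here can be mimicked from \LUA. The following is a
minimal sketch, assuming a plain \LUATEX\ run and using \type {read_map_file} as
a representative of the family; it does what the built|-|in reader does, namely
one read of the whole file, and returns a success flag, the data and its size:

\starttyping
\directlua {
  callback.register("read_map_file", function(name)
    % open in binary mode; report failure when the file is not there
    local f = io.open(name, "rb")
    if not f then
      return false, "", 0
    end
    % a single read of the complete file contents
    local data = f:read("*a")
    f:close()
    return true, data, string.len(data)
  end)
}
\stoptyping

The comments use the \TEX\ comment character on purpose: inside \type
{\directlua} the line endings are gone by the time \LUA\ sees the code, so a
\LUA\ line comment would swallow the rest of the chunk.
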
1398\stopsubsection
1399
1400\startsubsection[title=Tabs and spaces]
1401
1402\topicindex {space}
1403\topicindex {newline}
1404
We conform to the way other \TEX\ engines handle trailing tabs and spaces. For
decades trailing tabs and spaces (before a newline) were removed from the input
but this behaviour was changed in September 2017 to only handle spaces. We are
aware that this can introduce compatibility issues in existing workflows but
because we don't want too many differences with upstream \TEXLIVE\ we just follow
up on that patch (which is a functional one and not really a fix). It is up to
macro package maintainers to deal with possible compatibility issues and in
\LUATEX\ they can do so via the callbacks that deal with reading from files.
1413
The previous behaviour was a known side effect and (as that kind of input
normally comes from generated sources) it was normally dealt with by adding a
comment token to the line in case the spaces and|/|or tabs were intentional and
had to be kept. We are aware of the fact that this contradicts some of our other
choices, but consistency with other engines matters here, and the fact that in
\KPSE\ mode a common file \IO\ layer is used can have the side effect of breaking
compatibility. We still stick to our view that at the log level we can be (and
might be) more incompatible. We already expose some more details.
1422
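A macro package that wants the old behaviour back could, for instance, filter
each input line. This is a sketch under the assumption that the \type
{process_input_buffer} callback is free to use (a macro package may already
manage it) and that the line it receives still carries its trailing tabs:

\starttyping
\directlua {
  callback.register("process_input_buffer", function(line)
    % strip trailing tabs and spaces, as engines did before 2017
    return (string.gsub(line, "[ \string\t]+$", ""))
  end)
}
\stoptyping

The \type {\string} in front of \type {\t} keeps \TEX\ from expanding it, so
that \LUA\ receives the escape sequence for a tab, and the extra parentheses
drop the substitution count that \type {string.gsub} returns as a second value.
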
1423\stopsubsection
1424
1425\startsubsection[title=Hyperlinks]
1426
1427\topicindex {hyperlinks}
1428
There is an experimental feature that makes multi|-|line hyperlinks behave a
little better, fixing some side effects that showed up in r2l typesetting but
can also surface in l2r. Because this went unnoticed till 2023, and because it
depends a bit on how macro packages deal with hyperlinks, the fix is currently
under parameter control:
1434
1435\starttyping
1436\pdfvariable linking = 1
1437\stoptyping
1438
1439That way (we hope) legacy documents come out as expected, whatever those
1440expectations are. One of the aspects dealt with concerns (unusual) left and right
1441skips.
1442
1443\stopsubsection
1444
1445\stopsection
1446
1447\stopchapter
1448
1449\stopcomponent