ontarget-accuracy.tex /size: 10 Kb    last modification: 2024-01-16 10:21
% language=us runpath=texruns:manuals/ontarget

\startcomponent ontarget-accuracy

\environment ontarget-style

\startchapter[title={Accuracy}]

One of the virtues of \TEX\ is that it can produce the same output over a long
period. The original engine only uses integers, and although dimensions have
fractions, these are just a way to present them to the user because internally
they are scaled points.

\starttabulate[|l|l|]
\FL
\NC \type{\dimexpr .4999pt                     : 2 \relax} \NC
      \the\dimexpr .4999pt                     : 2 \relax
\NC \NR
\NC \type{\dimexpr .4999pt                     / 2 \relax} \NC
      \the\dimexpr .4999pt                     / 2 \relax
\NC \NR
\ML
\NC \type{\scratchdimen.4999pt \divide \scratchdimen 2 \the\scratchdimen} \NC
          \scratchdimen.4999pt \divide \scratchdimen 2 \the\scratchdimen
\NC \NR
\NC \type{\scratchdimen.4999pt \edivide\scratchdimen 2 \the\scratchdimen} \NC
          \scratchdimen.4999pt \edivide\scratchdimen 2 \the\scratchdimen
\NC \NR
\ML
\NC \type{\scratchdimen4999pt \divide\scratchdimen 2 \the\scratchdimen} \NC
          \scratchdimen4999pt \divide\scratchdimen 2 \the\scratchdimen
\NC \NR
\NC \type{\scratchdimen4999pt \edivide\scratchdimen 2 \the\scratchdimen} \NC
          \scratchdimen4999pt \edivide\scratchdimen 2 \the\scratchdimen
\NC \NR
\NC \type{\scratchdimen4999pt \rdivide\scratchdimen 2 \the\scratchdimen} \NC
          \scratchdimen4999pt \rdivide\scratchdimen 2 \the\scratchdimen
\NC \NR
\ML
\NC \type{\numexpr   1001                        : 2 \relax} \NC
      \the\numexpr   1001                        : 2 \relax
\NC \NR
\NC \type{\numexpr   1001                        / 2 \relax} \NC
      \the\numexpr   1001                        / 2 \relax
\NC \NR
\ML
\NC \type{\scratchcounter1001  \divide \scratchcounter 2 \the\scratchcounter} \NC
          \scratchcounter1001  \divide \scratchcounter 2 \the\scratchcounter
\NC \NR
\NC \type{\scratchcounter1001  \edivide\scratchcounter 2 \the\scratchcounter} \NC
          \scratchcounter1001  \edivide\scratchcounter 2 \the\scratchcounter
\NC \NR
\LL
\stoptabulate

The above table shows what happens when we divide an odd integer or, for that
matter, an odd fraction. Note the incompatibility between \type {\numexpr} and
\type {\dimexpr} on the one hand and \type {\divide} on the other: the
expressions round, while \type {\divide} truncates. This is why in \LUAMETATEX\
we have the \type {:} variant that does the same integer divide (no rounding) as
\type {\divide} does, and why we have \type {\edivide} that divides like an
expression using \type {/}. The \type {\rdivide} primitive only makes sense for
dimensions and rounds the result.

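
A minimal sketch of the same contrast on a small odd integer, assuming a
\LUAMETATEX\ based \CONTEXT\ where \type {:} and \type {\edivide} are available.
The value 7 is only an illustration; the comments on \type {:} and \type
{\edivide} follow the description given here rather than a separate
specification.

\starttyping
\scratchcounter 7

\the\numexpr \scratchcounter / 2 \relax   % expression divide, rounds: 4
\the\numexpr \scratchcounter : 2 \relax   % integer divide, no rounding: 3

\divide \scratchcounter 2                 % truncates like ":"
\the\scratchcounter                       % 3

\scratchcounter 7
\edivide \scratchcounter 2                % divides like "/" in an expression
\the\scratchcounter                       % 4
\stoptyping
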
As soon as one starts calculating with or comparing accumulated values, one can
run into values being a few scaled points off. This means that when one tests
against a criterion, a range comparison might be the better choice. The most
likely place for that to happen is the output routine, especially when special
constructs like floats, tables and images come into play. Just like not every
number can be represented in a float (double), we saw that dividing an odd
integer can give some unexpected rounding because part of the integer is
considered a fraction. So, in practice, even when the calculations are the same,
the outcome can look unpredictable from the user's perspective: \quotation {Why
does it fit here and not there?} Well, we can be a few scaled points off due to
some calculation that does not entirely round|-|trip.

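
Such a range comparison could look as follows. This is only a sketch: the three
register names are hypothetical and the slack of 10sp is an arbitrary choice,
not a value that \CONTEXT\ uses.

\starttyping
% \MyAvailable, \MyNeeded and \MySlack are hypothetical names
\newdimen\MyAvailable \MyAvailable 100pt
\newdimen\MyNeeded    \MyNeeded    \dimexpr 100pt + 3sp \relax
\newdimen\MySlack     \MySlack     10sp

% exact test: we are 3sp over, so this reports a misfit
\ifdim \MyNeeded > \MyAvailable misfit \else fits \fi

% range test: anything within \MySlack still counts as fitting
\ifdim \MyNeeded > \dimexpr \MyAvailable + \MySlack \relax misfit \else fits \fi
\stoptyping
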
When \TEX\ showed up it came with fonts, and in those times once a font was
released it was unlikely to change. But today fonts do change, and changes mean
that a document can render differently after an update. Of course this is an
argument for keeping a font in the \TEX\ tree, but even then updating is kind of
normal. Take math: because fonts often have issues we need to tweak them, and
some tweaks only get added when we run into a problem. If that problem has been
there for a while, fixing it makes us incompatible.

Hyphenation patterns are another source of breaking compatibility but normally
they change little. And here one can also assume that the user wants words to be
hyphenated properly. Even with such fundamental changes as a syllable being able
to move to the next line, it is often unlikely that a paragraph gets fewer or
more lines. I bet that users are more worried about the impact on vertical
rendering that has consequences for page breaks than about lines coming out
differently (hopefully better).

So, what are other potential areas in addition to slight differences due to
division, fonts and patterns? We now enter the world of \LUAMETATEX\ and
\CONTEXT. As soon as one starts to use \LUA\ code, doubles show up. It means that
we can do calculations with little loss because a double can safely hold the
maximum dimension (in scaled points). However, mixing 64 bit doubles at the \LUA\
end with 32 bit integers in the engine can have side effects. As soon as we set
some property at the \TEX\ end using \LUA, rounding takes place. Of course we can
do all calculations like \TEX\ does, but that would have too much of an impact on
performance.

So, going back and forth between \TEX\ and \LUA\ can introduce some
inaccuracies, but as long as it is consistent, there is no real issue. It mostly
involves fonts and especially the dimensions of characters: the width, height and
depth, but when one uses the xheight as a relative measure there is also some
influence on, for instance, interline spacing, offsets and such.

So how can fonts make a difference? In \CONTEXT\ there are two ways to use fonts:
normal mode and compact mode. In normal mode every size is an instance, where the
dimension properties of characters are scaled. In compact mode we use one size
and delegate scaling to the engine, which means that we end up with the (usual)
1000 being scale 1 kind of calculations. In the end a font with a design size of
10bp (most fonts) scaled to 12pt in normal mode does not behave the same as a
10pt setup where a 12pt size is scaled on demand. First there is the scaling from
the 10bp loaded font to the 10pt used font that gets passed to \TEX. Here we have
to deal with history: defining a font in pt (points) is quite normal. Then
applying a 1200 scale (later divided by 1000) in the engine again involves some
rounding to integers because that is what is used internally. I will come back to
this later.

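
To get a feeling for the one step versus two step rounding, here is a sketch
that only mimics it with \type {\numexpr}. It is not how the font loader or the
engine literally computes widths, and the glyph (507 units wide in a font with
1000 units per em) is a hypothetical one, picked because it shows the effect.

\starttyping
% normal mode: scale the width directly to 12pt (786432sp)
\the\numexpr 786432 * 507 / 1000 \relax                 % 398721

% compact mode: scale to 10pt (655360sp) first, apply 1200/1000 later
\scratchcounter \numexpr 655360 * 507 / 1000 \relax     % 332268
\the\numexpr \scratchcounter * 1200 / 1000 \relax       % 398722
\stoptyping
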
The main conclusion to draw is that normal mode and compact mode come close but
give different results. We can come closer when we use a more accurate normal
mode. In order to limit the number of font instances we normally limit the number
of digits (also in compact mode, but there accuracy comes at little cost). There
is a pitfall here: while \TEX\ can happily work with any resolution, the backend
has to make sure that embedded fonts get scaled right and that (in the case of
\PDF) we compensate for drift in the page stream, because there character widths
determine the advance and these are in (often rounded) bp (big postscript
points). Especially when we enable font expansion, drift prevention comes at a
price as there we are dealing with really small differences in dimensions.

As an experiment I played with clipping measures in the engine, which boils down
to rounding the last digit, but that didn't work out well. For simple text we can
get normal and compact mode identical, but kick in some math (many parameters
involved), font expansion and|/|or protrusion, additional inter|-|character
kerning and so on, and one never gets the same output. Keep in mind that we are
not talking about visual differences here, although there can be cases. Think
more of a slightly different vertical spacing triggering a different page break,
for instance when footnotes are involved. In \CONTEXT\ the line height (and
therefore derived parameters) is defined in terms of the xheight, so even being a
few scaled points off makes a difference.

At the user level, compact mode is currently enabled with:

\starttyping
\enableexperiments[fonts.compact]
\stoptyping

It has worked quite well for years already (writing end 2023) in most scenarios,
but there might be cases where existing code still needs to be adapted, which is
no big deal. The additional overhead is compensated by loading fewer font
instances and a smaller output file. In some cases documents actually process
faster, and it definitely pays off for large fonts (\CJK) and demanding mixed
size feature processing.

A more accurate normal mode is set by:

\starttyping
\enableexperiments[fonts.accurate]
\stoptyping

but it doesn't bring much. It was introduced in order for Mikael Sundqvist and me
to compare and check math tweaks, especially those that depend on precise
combinations of glyphs. We temporarily had some additional control in the engine,
but after experiments and comparing variants the decision was made to remove that
feature.

We ran experiments with large documents where different versions were overlaid,
and depending on the scenario there can indeed be differences, but when chapters
start on new pages and vertical spacing has stretch, there are not that many
differences. When you compare the so called \type {tuc} files you might notice
small differences in position tracking, but these values are seldom used in a way
that influences the rendering of text, line and page breaks.

\def\RatioBpPt{( \number \dimexpr 10bp \relax / \number \dimexpr 10pt \relax ) }

To come back to the bp vs pt issue. Among the options considered is moving the
character and font properties from integers to doubles, but that would impact the
memory footprint quite a bit. Another idea is that compact mode goes 10bp instead
of 10pt, but that would not help. Ten bp is \number \dimexpr 10bp \relax \space
scaled points and ten pt is \number \dimexpr 10pt \relax \space sp. The ratio
between them is \cldcontext { \RatioBpPt }, so a \TEX\ scale 1200 effectively
becomes \formatted { "\letterpercent .3N", 1200 * \RatioBpPt }, and assuming
rounding to an integer we then get \cldcontext { math.round ( 1200 * \RatioBpPt
) }. So in the end we get a less fortunate number instead of 1200 and it's not
even accurate. Therefore this option was also rejected. For the record: an
intermediate approach would have been to cheat: use an internal multiplier (the
shown ratio), and although it is not hard to support, it also means that at the
\LUA\ end we always need to take this into account, so again a no|-|go.

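
As a cross check on paper, assuming the classic definition where one inch is
72bp as well as 72.27pt, the ratio and the resulting scale work out as shown
here. The engine calculates in integer scaled points, so the numbers that it
reports can differ in the last digits, but either way we don't end up with a
clean 1200.

\startformula
    \frac{72.27}{72} = 1.00375
    \qquad
    1200 \times 1.00375 = 1204.5
\stopformula
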
In the end the only outcome of this bit of \quote {research} has been that we can
have accurate normal font handling (which is not that useful) and have two
additional divide related primitives that might be useful and add some
consistency (and these might actually get used).

\stopchapter

\stopcomponent