SourceBrowser

lowlevel-accuracy.tex /size: 14 Kb last modification: 2025-02-21 11:03
1% language=us runpath=texruns:manuals/lowlevel
2
3\environment lowlevel-style
4
5\startdocument
6  [title=accuracy,
7   color=darkgray]
8
9\startsectionlevel[title=Introduction]
10
11{\em This is work in progress, uncorrected.}
12
13When you look at \TEX\ and \METAPOST\ output the accuracy of the rendering stands
14out, unless of course you do a sloppy job on design and interfere badly with the
15system. Much has to do with the fact that calculations are very precise,
16especially given the time when \TEX\ was written. Because \TEX\ doesn't rely on
17(at that time non|-|portable) floating point calculations, it does all with 32
18bit integers, except in the backend where glue calculations are used for
19finalizing the glue values. It all changed a bit when we added \LUA\ because
20there we mix integers and doubles but in practice it works out okay.
21
22When looking at floating point (and posits) one can end up in discussions about
23which one is better, what the flaws fo each are, etc. Here we're only interested
24in the fact that posits are more accurate in the ranges where \TEX\ and
25\METAPOST\ operate, as well as the fact that we only have 32 bits for floats in
26\TEX, unless we patch more heavily. So, it is also very much about storage.
27
28When you work with dimensions like points, they get converted to an integer
29number (the \type {sp} unit) and from that it's just integer calculations. The
30maximum dimension is \the\maxdimen, which already shows a rounding issue. Of
31course when one goes precise for sure there is some loss, but on the average
32we're okay. So, in the next example the two last rows are equivalent:
33
34\starttabulate[|Tr|r|r|]
35\NC .1pt        \NC \the\dimexpr.1pt     \relax  \NC \number\dimexpr.1pt     \relax sp \NC \NR
36\NC .2pt        \NC \the\dimexpr.2pt     \relax  \NC \number\dimexpr.2pt     \relax sp \NC \NR
37\NC .3pt        \NC \the\dimexpr.3pt     \relax  \NC \number\dimexpr.3pt     \relax sp \NC \NR
38\NC .1pt + .2pt \NC \the\dimexpr.1pt+.2pt\relax  \NC \number\dimexpr.1pt+.2pt\relax sp \NC \NR
39\stoptabulate
40
41When we're at the \LUA\ end things are different, there numbers are mapped onto
4264 bit floating point variables (doubles) and not all numbers map well. This is
43what we get when we work with doubles in \LUA:
44
45\starttabulate[|Tr|r|]
46\NC .1      \NC \luaexpr{.1   }  \NC \NR
47\NC .2      \NC \luaexpr{.2   }  \NC \NR
48\NC .3      \NC \luaexpr{.3   }  \NC \NR
49\NC .1 + .2 \NC \luaexpr{.1+.2}  \NC \NR
50\stoptabulate
51
52The serialization looks as if all is okay but when we test for equality there
53is a problem:
54
55\starttabulate[|Tr|l|]
56\NC      .3 == .3 \NC \luaexpr{tostring(     .3 == .3)} \NC \NR
57\NC .1 + .2 == .3 \NC \luaexpr{tostring(.1 + .2 == .3)} \NC \NR
58\stoptabulate
59
60This means that a test like this can give false positives or negatives unless one
61tests the difference against the accuracy (in \METAPOST\ we have the {eps}
62variable for that). In \TEX\ clipping of the decimal fraction influences equality.
63
64\starttabulate[|Tr|l|]
65\NC \type{\iflua {      .3 == .3 } Y\else N\fi} \NC \iflua{     .3 == .3} equal\else different\fi \NC \NR
66\NC \type{\iflua { .1 + .2 == .3 } Y\else N\fi} \NC \iflua{.1 + .2 == .3} equal\else different\fi \NC \NR
67\stoptabulate
68
69The serialization above misguides us because the number of digits displayed is
70limited. Actually, when we would compare serialized strings the equality holds,
71definitely within the accuracy of \TEX. But here is reality:
72
73\startluacode
74    local a = 0.1
75    local b = 0.2
76    local c = 0.3
77    local d = a + b
78    local NC, NR = context.NC, context.NR
79    local function show(f)
80        context.NC() context(context.escaped(f))
81        context.NC() context(f,c)
82        context.NC() context(f,d)
83        context.NC() context.NR()
84    end
85    context.starttabulate { "|T|l|l|" }
86        context.NC()
87        context.NC() context(".3")
88        context.NC() context(".1 + .2")
89        context.NC() context.NR()
90     -- show("%0.05g",c)
91        show("%0.10g",c)
92        show("%0.17g",c)
93        show("%0.20g",c)
94        show("%0.25g",c)
95    context.stoptabulate()
96\stopluacode
97
98The above examples use $0.1$, $0.2$ and $0.3$ and on a 32 bit float that actually
99works out okay, but \LUAMETATEX\ is 64 bit. Is this really important in practice?
100There are indeed cases where we are bitten by this. At the \LUA\ end we seldom
101test for equality on calculated values but it might impact check for less or
102greater then. At the \TEX\ end there are a few cases where we have issues but
103these also relate to the limited precision. It is not uncommon to loose a few
104scaled points so that has to be taken into account then. So how can we deal with
105this? In the next section(s) an alternative approach is discussed. It is not so
106much the solution for all problems but who knows.
107
108\stopsectionlevel
109
110\startsectionlevel[title=Posits]
111
112% TODO: don't check for sign (1+2)
113
114The next table shows the same as what we started with but with a different
115serialization.
116
117\starttabulate[|Tr|r|]
118\NC .1      \NC \positunum{.1     }  \NC \NR
119\NC .2      \NC \positunum{.2     }  \NC \NR
120\NC .3      \NC \positunum{.3     }  \NC \NR
121\NC .1 + .2 \NC \positunum{.1 + .2}  \NC \NR
122\stoptabulate
123
124And here we get equality in both cases:
125
126\starttabulate[|Tr|l|]
127\NC      .3 == .3 \NC \positunum{     .3 == .3} \NC \NR
128\NC .1 + .2 == .3 \NC \positunum{.1 + .2 == .3} \NC \NR
129\stoptabulate
130
131The next table shows what we actually are dealing with. The \type {\if}|-|test is
132not a primitive but provided by \CONTEXT.
133
134\starttabulate[|Tr|l|]
135\NC \type{\ifpositunum {      .3 == .3 } Y\else N\fi} \NC \ifpositunum{     .3 == .3} equal\else different\fi \NC \NR
136\NC \type{\ifpositunum { .1 + .2 == .3 } Y\else N\fi} \NC \ifpositunum{.1 + .2 == .3} equal\else different\fi \NC \NR
137\stoptabulate
138
139And what happens when we do more complex calculations:
140
141\starttabulate[|Tr|l|]
142\NC \type {math .sin(0.1 + 0.2) == math .sin(0.3)} \NC \luaexpr{tostring(math.sin(0.1 + 0.2) == math.sin(0.3))} \NC \NR
143\NC \type {posit.sin(0.1 + 0.2) == posit.sin(0.3)} \NC             \positunum{sin(0.1 + 0.2) == sin(0.3)}       \NC \NR
144\stoptabulate
145
146Of course other numbers might work out differently! I just took the simple tests
147that came to mind.
148
149So what are these posits? Here it's enough to know that they are a different way
150to store numbers with fractions. They still can loose precision but a bit less on
151smaller values and often we have relative small values in \TEX. Here are some links:
152
153\starttyping
154https://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf
155https://posithub.org/conga/2019/docs/14/1130-FlorentDeDinechin.pdf
156\stoptyping
157
158There are better explanations out there than I can provide (if at all). When I
159first read about these unums (a review of the 2015 book \quotation {The End of
160Error Unum Computing}) I was intrigued and when in 2023 I read something about it
161in relation to RISCV I decided to just add this playground for the users. After
162all we also have decimal support. And interval based solutions might actually be
163good for \METAPOST, so that is why we have it as extra number model. There we
164need to keep in mind that \METAPOST\ in non scaled models also apply some of the
165range checking and clipping that happens in scaled (these magick 4096 tricks).
166
167For now it is enough to know that it's an alternative for floats that {\em could}
168work better in some cases but not all. The presentation mentioned above gives
169some examples of physics constants where 32 posits are not good enough for
170encoding the extremely large or small constants, but for $\pi$ it's all fine.
171\footnote {Are 64 bit posits actually being worked on in softposit? There are
172some commented sections. We also need to patch some unions to make it compile as
173C.} In double mode we actually have quite good precision compared to 32 bit
174posits but with 32 bit floats we gain some. Very small numbers and very large
175numbers are less precise, but around 1 we gain: the next value after 1 is
1761.0000001 for a float and 1.000000008 for a posit (both 32 bit). So, currently
177for \METAPOST\ there is no real gain but if we'd add posits to \TEX\ we could
178gain some because there a halfword (used for storing data) is 32 bit.
179
180But how about \TEX ? Per April 2023 the \LUAMETATEX\ engine has native support
181for floats (this in addition to \LUA\ based floats that we already had in
182\CONTEXT). How that works can be demonstrated with some examples. The float
183related commands are similar to those for numbers and dimensions: \typ
184{\floatdef}, \typ {\float}, \typ {\floatexpr}, \typ {\iffloat}, \typ
185{\ifzerofloat} and \typ {\ifintervalfloat}. That means that we also have them as
186registers. The \typ {\positdef} primitive is similar to \typ {\dimensiondef}.
187When a float (posit) is seen in a dimension context it will be interpreted as
188points, and in an integer context it will be a rounded number. As with other
189registers we have a \typ {\newfloat} macro. The \typ {\advance}, \typ
190{\multiply} and \typ {\divide} primitives accept floats.
191
192\startbuffer[reset]
193\scratchdimen=1.23456pt
194\scratchfloat=1.23456
195\stopbuffer
196
197\typebuffer[reset] \getbuffer[reset]
198
199We now use these two variables in an example:
200
201\startbuffer[dimensions]
202\setbox0\hbox to \scratchdimen {x}\the\wd0
203\scratchdimen \dimexpr \scratchdimen * 2\relax
204\setbox0\hbox to \scratchdimen {x}\the\wd0
205\advance \scratchdimen \scratchdimen
206\setbox0\hbox to \scratchdimen {x}\the\wd0
207\multiply\scratchdimen by 2
208\setbox0\hbox to \scratchdimen {x}\the\wd0
209\stopbuffer
210
211\typebuffer[dimensions] \startlines\darkblue\tttf\getbuffer[reset,dimensions]\stoplines
212
213When we use floats we get this:
214
215\startbuffer[floats]
216\setbox0\hbox to \scratchfloat {x}\the\wd0
217\scratchfloat \floatexpr \scratchfloat * 2\relax
218\setbox0\hbox to \scratchfloat {x}\the\wd0
219\advance \scratchfloat \scratchfloat
220\setbox0\hbox to \scratchfloat {x}\the\wd0
221\multiply\scratchfloat by 2
222\setbox0\hbox to \scratchfloat {x}\the\wd0
223\stopbuffer
224
225\typebuffer[floats] \startlines\darkblue\tttf\getbuffer[reset,floats]\stoplines
226
227So which approach is more accurate? At first sight you might think that the
228dimensions are better because in the last two lines they indeed duplicate.
229However, the next example shows that with dimensions we lost some between steps.
230
231\startbuffer[noboxes]
232                                                 \the\scratchfloat
233\scratchfloat \floatexpr \scratchfloat * 2\relax \the\scratchfloat
234\advance \scratchfloat \scratchfloat             \the\scratchfloat
235\multiply\scratchfloat by 2                      \the\scratchfloat
236\stopbuffer
237
238\typebuffer[noboxes] \startlines\darkblue\tttf\getbuffer[reset,noboxes]\stoplines
239
240One problem with accuracy is that it can build up. So when one eventually does
241some comparison the expectations can be wrong.
242
243\startbuffer
244\dimen0=1.2345pt
245\dimen2=1.2345pt
246
247\ifdim           \dimen0=\dimen2 S\else D\fi \space +0sp: [dim]
248\ifintervaldim0sp\dimen0 \dimen2 O\else D\fi \space +0sp: [0sp]
249
250\advance\dimen2 1sp
251
252\ifdim             \dimen0=\dimen2 S\else D\fi \space +1sp: [dim]
253\ifintervaldim 1sp \dimen0 \dimen2 O\else D\fi \space +1sp: [1sp]
254\ifintervaldim 1sp \dimen2 \dimen0 O\else D\fi \space +1sp: [1sp]
255\ifintervaldim 2sp \dimen0 \dimen2 O\else D\fi \space +1sp: [2sp]
256\ifintervaldim 2sp \dimen2 \dimen0 O\else D\fi \space +1sp: [2sp]
257
258\advance\dimen2 1sp
259
260\ifintervaldim 1sp \dimen0\dimen2 O\else D\fi \space +2sp: [1sp]
261\ifintervaldim 1sp \dimen2\dimen0 O\else D\fi \space +2sp: [1sp]
262\ifintervaldim 5sp \dimen0\dimen2 O\else D\fi \space +2sp: [5sp]
263\ifintervaldim 5sp \dimen2\dimen0 O\else D\fi \space +2sp: [5sp]
264\stopbuffer
265
266\typebuffer
267
268Here we show a test for overlap in values, the same can be done with integer
269numbers (counts) and floats. This interval checking is an experiment and we'll
270see it if gets used.
271
272\startpacked\darkblue \tttf \getbuffer \stoppacked
273
274There are also \typ {\ifintervalfloat} and \typ{\ifintervalnum}. Because I have
275worked around these few scaled point rounding issues for decades, it might
276actually take some time before we see the interval tests being used in \CONTEXT.
277After all, there is no reason to touch somewhat tricky mechanism without reason
278(read: users complaining).
279
280To come back to posits, just to be clear, we use 32 bit posits and not 32 bit
281floats, which we could have but that way we gain some accuracy because less bits
282are used by default for the exponential.
283
284In \CONTEXT\ we also provide a bunch of pseudo primitives. These take one float:
285\type {\pfsin}, \type {\pfcos}, \type {\pftan}, \type {\pfasin}, \type {\pfacos},
286\type {\pfatan}, \type {\pfsinh}, \type {\pfcosh}, \type {\pftanh}, \type
287{\pfasinh}, \type {\pfacosh}, \type {\pfatanh}, \type {\pfsqrt}, \type {\pflog},
288\type {\pfexp}, \type {\pfceil}, \type {\pffloor}, \type {\pfround}, \type
289{\pfabs}, \type {\pfrad} and \type {\pfdeg}, whiel these expect two floats: \type
290{\pfatantwo}, \type {\pfpow}, \type {\pfmod} and \type {\pfrem}.
291
292% \pageextragoal = 5sp
293
294\stopsectionlevel
295
296\startsectionlevel[title=\METAPOST]
297
298In addition to the instances \typ {metafun} (double in \LMTX), \typ {scaledfun},
299\typ {doublefun}, \typ {decimalfun} we now also have \typ {positfun}. Because we
300currently use 32 bit posits in the new number system there is no real gain over
301the already present 64 bit doubles. When 64 bit posits show up we might move on
302to that.
303
304\stopsectionlevel
305
306\startsectionlevel[title=\LUA]
307
308We support posits in \LUA\ too. Here we need to create a posit user data
309object. The usual metatable magick kicks in:
310
311\starttyping
312local p = posit.new(123.456)
313local q = posit.new(789.123)
314local r = p + q
315\stoptyping
316
317Here we just mention what is currently interface. The management functions are:
318\typ {new}, \type {copy}, \typ {tostring}, \typ {tonumber}, \typ {integer}, \typ
319{rounded}, \typ {toposit} and \typ {fromposit}. The usual operators are also
320supported: \type{+}, \type{-}, \type{*}, \type{/}, \type{^}, as well as the
321binary \type {|}. \type {&}, \type {~}, \type {<<} and \type {>>}. We can compare
322with \type {==}, \type {>=}, \type {<=} and \type {~=}. The more verbose \type
323{bor}, \type {bxor}, \type {band}, \type {shift}, \type {rotate} are there too.
324
325There is a subset of math provided: \type {min}, \type {max}, \type {abs}, \type
326{conj}, \type {modf}, \type {acos}, \type {asin}, \type {atan}, \type {ceil},
327\type {cos}, \type {exp}, \type {exp2}, \type {floor}, \type {log}, \type
328{log10}, \type {log1p}, \type {log2}, \type {logb}, \type {pow}, \type {round},
329\type {sin}, \type {sqrt} and \type {tan}. Somewhat special are \type {NaN} and
330\type {NaR}.
331
332Currently integer division (\type {//}) and modulo (\type {%}) are not available,
333but that might happen at some time.
334
335\stopsectionlevel
336
337\stopdocument
338
Source Browser ?