about-hashing.tex /size: 28 Kb    last modification: 2023-12-21 09:43
1% language=us
2
3\startcomponent about-hashing
4
5\environment about-environment
6
7\usemodule[lua-hashing]
8
9\startchapter[title={Lua strings}]
10
11\startsection[title=Introduction]
12
13In the crited project \footnote {This is a project by Thomas Schmitz, Alan
14Braslau, Luigi Scarso and Hans Hagen funded by the Institut für Klassische und
15Romanische Philologie Universität Bonn.} we have to deal with large amounts of
16data. The sources are in \TEI\ \XML\ and processed directly in \CONTEXT\ \MKIV,
17and we have to filter content from different places in the \XML\ tree. Processing
18relies on \LUA\ a lot because we use \LUA\ for dealing with the \XML. We're
19talking about Latin and Greek texts so there is no demand for extensive font
20processing in \LUA\ is moderate. But as critical editions have lots of line
21specific referencing and notes there are some more complex layout elements
22involved, and again these use \LUA. There is also extensive use of bibliographies
23and it will be no surprise that \LUA\ comes to help too. \footnote {One of the
24objectives of the project is to update and enhance the bibliographic subsystem.}
25
26One secondary objective is to be able to process the complex documents at a speed
27of at least 20 pages per second on a modern 2014 workstation laptop. One way of
28achieving this is to use \LUAJITTEX\ which has a faster virtual \LUA\ machine.
29However, we ran into several issues with the \LUAJIT\ interpreter, which is fully
30\LUA\ language 5.1 and partly 5.2 compatible but definitely has a different low
31level implementation. In the next sections I will discuss two issues that Luigi
32and I ran into and for which we could come up with reasonable workarounds.
33
34\stopsection
35
36\startsection[title=The stacks]
37
38A \TEX\ job is normally a multi|-|pass experience. One run can produce information
39that is used in a successive one. The reason is that something can happen on page
4015 that influences the typesetting of page~9. There can even be a partial chain
41reaction: you typeset a document the first time the table of contents (and the
42pages it refers to) is not known yet but information is saved that makes it
43possible next time. That next run it gets included and it takes for instance 4
44pages. This means that all page numbers shift up. This in turn will trigger a new
45run because all cross references might change too: two digit page numbers can
46become three digits, so paragraphs can run wider, and that again can trigger more
47pages. Normally an initial three runs is enough, and with minor updates of the
48source one or two runs are enough after that.
49
50The multi|-|pass information is saved in tables in the so called utility file and
51loaded a next run. Common subtables are shared in the process. In order to
52determine if there has been crucial changes that demand an extra run, we have to
53make sure that random order in these tables is eliminated. Normally we already
54sort keys in tables when writing them to file but some tables come out in the
55order the traversing \type {next} function delivers them. In the more recent 5.2
56versions \LUA\ has added some randomness to the order in which hashed tables are
57organized, so while in previous versions we could assume that for a specific
58binary the order was the same each time, we cannot rely on that any longer. This is
59not that important for normal cases, but we compare previous and current versions
60of the utility file and pack shared tables in them as well, which means that we
61are sensitive for a change in order. But, this could be dealt with at the cost of
62some extra sorting. \footnote {In \CONTEXT\ we also pack font tables which saves
63lots of memory and also some load time).}
64
65Anyway, this kind of changes in the \LUA\ machinery is harmless apart from taking
66some time to adapt to it. It is also the reason why we cannot simply push a new
67update of \LUA\ into \LUATEX\ because low level changes can have an (yet unknown)
68impact. Of course performance is the biggest issue here: we don't want a slower
69\LUATEX.
70
71In the past we already reported on the benefits of \LUAJITTEX, especially its
72faster virtual machine. We don't benefit from jitting; on the contrary it slows
73us down. One reason is that we cross the \LUA||\CCODE\ boundary often and hardly
74use any of the optimized functions. Part of the speed is achieved by a different
75implementation deep down and one of them is a different virtual machine
76instruction set. While \LUA\ can go real big in terms of memory and table
77construction, \LUAJIT\ limits us to at most 2G memory and poses some 64K
78limitations in functions and table constructors. The memory is not so much the
79issue in the crited project but the (nested) table constructor is. When we have a
80few tens of thousands of cross references, index entries and|/|or list entries we
81simply cannot load the multi|-|pass data. A few days of playing with splitting up
82nested tables didn't help much: it made the code look horrible and eventually we
83again ran into a maximum of 64K someplace as a \type {dofile} effectively makes a
84function that gets run and \LUAJIT\ doesn't like that size. For the record: we
85don't have such issues with large font tables probably because they are just one
86big table. The reason why we cannot use that approach is that serializing the
87potentially very large tables in the utility file also has limitations.
88
89Eventually this could be solved by assuming only forward referencing for certain
90registers. That way we only used the index entries collected in memory during the
91run and as long as we don't put a register before it's entries are defined we're
92okay. So here we have a typical case where one can set an option to circumvent
93an engine limitation. \footnote {A decade ago similar tricks had to be used to
94support hundreds of thousands of hyperlinks in \TEX\ engines with at that time
95limited memory capabilities.} Explaining this in a user manual is a challenge,
96because an error message like the following is not that helpful:
97
98\starttyping
99main function has more than 65536 constants
100\stoptyping
101
102But, once we could generate these indices again by posing some limitations,
103\LUAJITTEX\ had other issues. This time we got excessive runtime and we spent
104quite some time sorting that one out. More on that in the next section.
105
106\stopsection
107
108\startsection[title=Hashing]
109
110One of the reasons why (text processing with) \LUA\ is rather fast is that it
111hashes its strings so that a test for equality is real fast. This means that for
112each string that enters \LUA\ a hash value is calculated and that hash is used in
113comparisons. Of course hashing takes time, but especially when you work with lots
114of tables the advantage of a simple hash compare outweighs this one||time
115hashing. On the other hand, if you work with files and process lines, and maybe
116split these in words, you might end up with a lot of unneeded hashing. But, in
117\LUATEX\ and therefore \MKIV\ we benefit from hashing a lot. In \LUA\ 5.2 the
118hash function was adapted so that only strings upto than (default) 40 characters
119get hashed. In practice we're not affected much by this, as most keywords we use
120are shorter than this boundary. And in \CONTEXT\ we do quite some keyword checking.
121
122So, when we were conducting tests with these large registers, we were surprised
123that \LUAJITTEX\ performed significantly slower (ten times or more) that stock
124\LUATEX, while until then we had observed that a \LUAJITTEX\ run was normally
125some 20 to 40\% faster.
126
127The first impression was that it related to the large amount of strings that are
128written from \LUA\ to \TEX. After index entries are collected, they are sorted
129and the index is flushed to \TEX. This happens in one go, and \TEX\ code ends up
130in the \TEX\ input stack. Some actions are delayed and create callbacks to \LUA,
131so some wrapping in functions happens too. That means that some (\LUA) strings
132are only freed later on, but that proved not to be the main problem.
133
134When the entries are typeset, an interactive cross reference is kept track of and
135these exist till the document is closed and the referencing information is
136written to the \PDF\ file. Of course we could tweak this but once you start along
137that path there is no end to writing ugly hacks.
138
139Eventually we found that the slowdown relates to hashing, especially because that is
140not the first area where you look. Why is this? The specific register concerned lots
141of small greek words, pointing to locations in a text, where locations looked like
142\type {1.2.3}. In case you wonder why greek is mentioned: in multi|-|byte \UTF\
143sequences there is a lot of repetition:
144
145\startluacode
146local byte = string.byte
147function sample(s)
148    context.NC() context(s)
149    context.NC() context.ttx(false)
150    for b in string.utfvalues(s) do
151        context("%02X ",b)
152    end
153    context.NC() context.ttx(false)
154    for b in string.gmatch(s,".") do
155        context("%02X ",byte(b))
156    end
157    context.NC() context.NR()
158end
159
160context.starttabulate { "||||" }
161context.FL()
162context.NC() context.bold("word")
163context.NC() context.bold("unicode")
164context.NC() context.bold("bytes")
165context.NC() context.NR()
166context.FL()
167sample("βίον")
168sample("βίου")
169sample("βιοὺς")
170sample("βουλὴν")
171sample("βουλῆς")
172context.LL()
173context.stoptabulate()
174\stopluacode
175
176When cross referencing these index entries with their origin, you end up with
177reference identifiers like \type {foo:1.2.3} or, because \CONTEXT\ has automated
178internal references (which are rather efficient in the resulting \PDF), we get
179\type {aut:1}, \type {aut:2} upto in this case some 30.000 of them.
180
181The problem with hashing is as follows. When we write commands to \TEX\ or use
182data with a repetitive property, the similarity of these strings can be hard on
183the hasher as it can produce similar hash keys in which case collisions need to
184be dealt with. I'm no expert on hashing but looking at the code shows that in
185\LUAJIT\ (at least in the version we're talking about) the string is seen as
186chunks of 4 bytes. The first, last, middle and halfway middle chunks are
187consulted and after some bit juggling we get a hash value. In the case of strings
188like the following it is clear that the beginning and end look quite the same:
189
190\starttyping
191foo:000001  foo:010001  foo:100001
192\stoptyping
193
194or:
195
196\starttyping
197foo:1.2.12  foo:1.3.12  foo:1.4.12  foo:1.5.12
198\stoptyping
199
200It seems that the used method of hashing is somewhat arbitrary and maybe tuned
201for specific applications. In order to see what the impact is of hashing quite
202similar strings, some experiments were conducted: with \LUATEX\ 0.73 using \LUA\
2035.2 hashing, with \LUAJITTEX\ 0.73, and with the same \LUAJITTEX\ but using the
204hash variant of native \LUA\ 5.1. For each variant we ran tests where strings of
205increasing length were combined with a number (running from one to one million).
206
207\starttabulate[|||]
208\NC none   \NC <string>                   \NC \NR
209\NC right  \NC <string> <number>          \NC \NR
210\NC left   \NC <number> <string>          \NC \NR
211\NC center \NC <string> <number> <string> \NC \NR
212\NC edges  \NC <number> <string> <number> \NC \NR
213\stoptabulate
214
215The differences between engines can be seen in tables in the next page. In the
216fourth table we summarize which engine performs best. Keep in mind that
217\LUAJITTEX\ has the advantage of the faster virtual machine so it has an
218additional speed advantage.
219
220We show three tables with measurements. The \type {none} column shows the
221baseline of the test:
222
223\starttyping
224
225local t = { }
226for i=1,1000000 do
227    t[i] = i
228end
229\stoptyping
230
231The column tagged \quote {right} does this:
232
233\starttyping
234local t = { }
235for i=1,1000000 do
236    t[i] = text .. i
237end
238\stoptyping
239
240And \quote {left} does:
241
242\starttyping
243local t = { }
244for i=1,1000000 do
245    t[i] = i .. text
246end
247\stoptyping
248
249That leaves \quote {center}:
250
251\starttyping
252local t = { }
253for i=1,1000000 do
254    t[i] = text .. i .. text
255end
256\stoptyping
257
258and \quote {edges}:
259
260\starttyping
261local t = { }
262for i=1,1000000 do
263    t[i] = i .. text .. i
264end
265\stoptyping
266
267Of course there is also the loop and the concatenation involved so the last two
268variants have some more overhead. We show some measurements in \in {tables}
269[tab:torture-1], \in [tab:torture-2] \in {and} [tab:torture-3]. So, there we have
270strings like:
271
272\starttyping
2732abc
274222abc
27522222abc
276abc222222
277222222abc222222
278222222abc222222
279abc2222abc
280\stoptyping
281
282and so on. Of course a million such strings makes not much sense in practice but
283it serves our purpose of testing.
284
285\startplacetable[reference=tab:torture-1,location=page,title=\type{context test.tex}]
286    \scale
287      [height=\the\dimexpr\textheight-3\lineheight\relax]
288    % [width=\the\dimexpr\textwidth+.5\backspace\relax]
289      {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luatex-073-LUA52.lua" }}}}
290\stopplacetable
291
292\startplacetable[reference=tab:torture-2,location=page,title=\type{context --jit --jithash=luajit20 test.tex}]
293    \scale
294      [height=\the\dimexpr\textheight-3\lineheight\relax]
295    % [width=\the\dimexpr\textwidth+.5\backspace\relax]
296      {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-JIT20.lua" }}}}
297\stopplacetable
298
299\startplacetable[reference=tab:torture-3,location=page,title=\type{context --jit --jithash=lua51 test.tex}]
300    \scale
301      [height=\the\dimexpr\textheight-3\lineheight\relax]
302    % [width=\the\dimexpr\textwidth+.5\backspace\relax]
303      {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-LUA51.lua" }}}}
304\stopplacetable
305
306In these tables you can see some extremes. On the average \LUA\ 5.2 performs
307quite okay as does standard \LUAJIT. However, when we bring the 5.1 hash variant
308into \LUAJITTEX\ we get a more predictable average performance as it deals better
309with some of the extreme cases that make \LUAJITTEX\ crawl compared to \LUATEX.
310We have done more tests and interesting is to see that in the 5.1 (and derived
3115,2) method there are sometimes cases where odd lengths perform much worse than
312even lengths. Red values are larger than two times the average, blue values
313larger than average while green values indicate a less than half average value.
314
315In \in {table} [tab:compare-1] we show which method performs best relative to each
316other. Of course in many applications there will be no such extreme cases, but
317we happen to ran into them. But, even if \type {JIT20} is a winner in most cases,
318the fact that it has extreme slow exceptions makes it a bit of a gamble.
319
320\startplacetable[location=page,reference=tab:compare-1,title=The best performances per engine and hasher.]
321    \startcombination
322        \startcontent
323            \scale
324              [height=\the\dimexpr\textheight-4\lineheight\relax]
325              {\vbox{\ctxlua{moduledata.luatests.showhashing {
326                  fileset = {
327                      { tag = "JIT20", filename = "luatest-hash-luajittex-073-JIT20.lua" },
328                      { tag = "JIT51", filename = "luatest-hash-luajittex-073-LUA51.lua" },
329              } } }}}
330        \stopcontent
331        \startcaption
332            \LUAJITTEX\ only
333        \stopcaption
334        \startcontent
335            \scale
336              [height=\the\dimexpr\textheight-4\lineheight\relax]
337              {\vbox{\ctxlua{moduledata.luatests.showhashing {
338                  fileset = {
339                      { tag = "LUA52", filename = "luatest-hash-luatex-073-LUA52.lua" },
340                      { tag = "JIT20", filename = "luatest-hash-luajittex-073-JIT20.lua" },
341                      { tag = "JIT51", filename = "luatest-hash-luajittex-073-LUA51.lua" },
342              } } }}}
343        \stopcontent
344        \startcaption
345            Both engines.
346        \stopcaption
347    \stopcombination
348\stopplacetable
349
350The 5.1 hasher runs over the string with a step that depends on the length of the
351string. We've seen that in 5.2 it doesn't hash strings larger than 40 characters.
352The step is calculated by shifting the length (by default) over 5 bits. This
353means that for strings of size 32 and more the step becomes 2 which is why we see
354this odd|/|even timing issue in the tables. Basically we hash at most 32
355characters of the 40. The next table shows that the less characters we take
356into account (first column) the less unique keys we get (second column).
357
358\starttabulate[|c|r|l|]
359\FL
360\NC \bf n \NC \bf unique \NC \bf text \NC \NR
361\FL
362\NC 3 \NC 22    \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
363\NC 3 \NC 31    \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
364\NC 4 \NC 43    \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
365\NC 4 \NC 51    \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
366\NC 5 \NC 410   \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
367\NC 5 \NC 210   \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
368\NC 6 \NC 29947 \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
369\NC 6 \NC 29823 \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
370\LL
371\stoptabulate
372
373In the next table we show a few cases. The characters that are taken into account
374are colored red. \footnote {Again the first column indicates the shift applied to
375the length in order to determine the step.}
376
377\starttabulate[|c|l|l|]
378\FL
379\NC \bf n \NC \bf text \NC \bf consulted \NC \NR
380\FL
381\NC 3\NC \tt\tx << /D [ 8 0 R /Fit ] /S /GoTo >>  \NC \tt\tx <{\darkred <} /{\darkred D} [{\darkred \space }8 {\darkred 0} R{\darkred \space }/F{\darkred i}t {\darkred ]} /{\darkred S} /{\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
382\NC 3\NC \tt\tx << /D [ 9 0 R /Fit ] /S /GoTo >>  \NC \tt\tx <{\darkred <} /{\darkred D} [{\darkred \space }9 {\darkred 0} R{\darkred \space }/F{\darkred i}t {\darkred ]} /{\darkred S} /{\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
383\NC 3\NC \tt\tx << /D [ 10 0 R /Fit ] /S /GoTo >> \NC \tt\tx <<{\darkred \space }/D{\darkred \space}[ {\darkred 1}0 {\darkred 0} R{\darkred \space }/F{\darkred i}t {\darkred ]} /{\darkred S} /{\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
384\NC 3\NC \tt\tx << /D [ 11 0 R /Fit ] /S /GoTo >> \NC \tt\tx <<{\darkred \space }/D{\darkred \space}[ {\darkred 1}1 {\darkred 0} R{\darkred \space }/F{\darkred i}t {\darkred ]} /{\darkred S} /{\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
385\NC 3\NC \tt\tx << /D [ 12 0 R /Fit ] /S /GoTo >> \NC \tt\tx <<{\darkred \space }/D{\darkred \space}[ {\darkred 1}2 {\darkred 0} R{\darkred \space }/F{\darkred i}t {\darkred ]} /{\darkred S} /{\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
386\ML
387\NC 4\NC \tt\tx << /D [ 8 0 R /Fit ] /S /GoTo >>  \NC \tt\tx <{\darkred <} {\darkred /}D{\darkred \space }[{\darkred \space }8{\darkred \space }0{\darkred \space }R{\darkred \space }/{\darkred F}i{\darkred t} {\darkred ]} {\darkred /}S{\darkred \space }/{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
388\NC 4\NC \tt\tx << /D [ 9 0 R /Fit ] /S /GoTo >>  \NC \tt\tx <{\darkred <} {\darkred /}D{\darkred \space }[{\darkred \space }9{\darkred \space }0{\darkred \space }R{\darkred \space }/{\darkred F}i{\darkred t} {\darkred ]} {\darkred /}S{\darkred \space }/{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
389\NC 4\NC \tt\tx << /D [ 10 0 R /Fit ] /S /GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}/{\darkred D} {\darkred [} {\darkred 1}0{\darkred \space }0{\darkred \space }R{\darkred \space }/{\darkred F}i{\darkred t} {\darkred ]} {\darkred /}S{\darkred \space }/{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
390\NC 4\NC \tt\tx << /D [ 11 0 R /Fit ] /S /GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}/{\darkred D} {\darkred [} {\darkred 1}1{\darkred \space }0{\darkred \space }R{\darkred \space }/{\darkred F}i{\darkred t} {\darkred ]} {\darkred /}S{\darkred \space }/{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
391\NC 4\NC \tt\tx << /D [ 12 0 R /Fit ] /S /GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}/{\darkred D} {\darkred [} {\darkred 1}2{\darkred \space }0{\darkred \space }R{\darkred \space }/{\darkred F}i{\darkred t} {\darkred ]} {\darkred /}S{\darkred \space }/{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
392\LL
393\stoptabulate
394
395Of course, in practice, in \LUA\ 5.2 the longer string exceeds 40 characters so
396is never hashed anyway. Apart from this maximum, the \LUA\ hash code looks like this:
397
398\starttyping
399/* Lua will use at most ~(2^LUAI_HASHLIMIT) bytes from
400a string to compute its hash */
401...
402h = cast(unsigned int,len) ;
403step = (len>>LUAI_HASHLIMIT) + 1 ;
404for (l1=len; l1>=step; l1-=step) {
405   h = h ^ ((h<<5) + (h>>2) + cast(unsigned char,str[l1-1])) ;
406}
407...
408\stoptyping
409
410This translates in verbose \LUA\ function as follows:
411
412\starttyping
413function string.luahash(str,shift)
414    local len  = #str
415    local hash = len
416    local step = bit32.rshift(len,shift or 5) + 1
417    for i=len,1,-step do
418        hash = bit32.bxor(hash, (
419            bit32.lshift(hash,5) +
420            bit32.rshift(hash,2) +
421            string.byte(string.sub(str,i,i))
422        ) )
423    end
424    return hash
425end
426\stoptyping
427
428The reader can argue that the following string would perform better:
429
430\starttyping
431/Subtype/Link/Border[0 0 0]/F 4/A 12 0 R
432\stoptyping
433
434but this is not the case. Also, here we use \PDF\ code, but similar cases can
435happen if we flush \TEX\ commands:
436
437\starttyping
438\dothisorthat{1}
439\dothisorthat{101}
440\dothisorthat{10101}
441\stoptyping
442
443And in the case of \UTF\ strings, it remains a fact that when characters need two
444bytes a sequence can end up with each odd or even byte being the same. This is
445one more reason to support upto 64 byte (or 40 in practice) hashing.
446
447Because of this we decided to experiment with a value of 64 instead. \footnote {Of
448course, in \LUATEX, the length limit kicks in before we get to 64.} We can do the
449same when we use the \LUA\ 5.1 method in \LUAJIT. In \in {table} [tab:torture-4]
450\in {and} [tab:torture-5] we show the timings. Interesting is that we lost the
451extremes now. The performance of the default settings are compared with the higher
452values in \in {table} [tab:compare-2]. Of course the numbers are just indications
453and there might be small differences between test runs. Therefore we use a threshold
454of 5\% when we compare two methods.
455
456\startplacetable[reference=tab:torture-4,location=page,title={\type{context test.tex} with len<=40 and hash<=64}]
457    \scale
458      [height=\the\dimexpr\textheight-3\lineheight\relax]
459    % [width=\the\dimexpr\textwidth+.5\backspace\relax]
460      {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luatex-073-LUA52-40-6.lua" }}}}
461\stopplacetable
462
463\startplacetable[reference=tab:torture-5,location=page,title={\type{context --jit test.tex} with hash<=64}]
464    \scale
465      [height=\the\dimexpr\textheight-3\lineheight\relax]
466    % [width=\the\dimexpr\textwidth+.5\backspace\relax]
467      {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-LUA51-40-6.lua" }}}}
468\stopplacetable
469
470\startplacetable[location=page,reference=tab:compare-2,title=More than 5\% difference between 32 byte or 64 byte hashing.]
471    \startcombination
472        \startcontent
473           \scale
474              [height=\the\dimexpr\textheight-4\lineheight\relax]
475              {\vbox{\ctxlua{moduledata.luatests.showhashing {
476                  fileset = {
477                      { tag = "40 / 32", filename = "luatest-hash-luatex-073-LUA52.lua" },
478                      { tag = "40 / 64", filename = "luatest-hash-luatex-073-LUA52-40-6.lua" },
479              } } }}}
480
481        \stopcontent
482        \startcaption
483            \LUATEX\ (size limit 40)
484        \stopcaption
485        \startcontent
486           \scale
487              [height=\the\dimexpr\textheight-4\lineheight\relax]
488              {\vbox{\ctxlua{moduledata.luatests.showhashing {
489                  fileset = {
490                      { tag = "40 / 32", filename = "luatest-hash-luajittex-073-LUA51.lua" },
491                      { tag = "40 / 64", filename = "luatest-hash-luajittex-073-LUA51-40-6.lua" },
492              } } }}}
493
494        \stopcontent
495        \startcaption
496            \LUAJITTEX\ (no size limit)
497        \stopcaption
498    \stopcombination
499\stopplacetable
500
501So how does this affect us in document production? It is not that hard to get a
502processing rate of a few dozen pages per second on a modern machine, even with
503somewhat complex documents, where \XML\ turns into \PDF. However, interactivity
504comes somehow with a price when we use \LUAJITTEX. In \CONTEXT\ \MKIV\ we do all
505\PDF\ annotations in \LUA\ and that involves assembling dictionaries. Here are
506two examples, a destination:
507
508\starttyping
509<< /D [ 15 0 R /Fit ] /S /GoTo >>
510\stoptyping
511
512and a reference:
513
514\starttyping
515/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 16 0 R
516\stoptyping
517
518These strings are build with small variations and at some point end up in the \PDF\
519file. The same string can end up in the file several times, although sometimes we
520can create a reusable object. In the last case we keep them at the \LUA\ end as
521reference to such a shareable object, a key in an object reference hash.  Now imagine
522that we have some 30K of such references and/or destinations, which indeed happens in
523crited documents. In the next two lines we use a \type {*} to show where the
524differences are:
525
526\starttyping
527<< /D [ * 0 R /Fit ] /S /GoTo >>
528/Subtype /Link /Border [ 0 0 0 ] /F 4 /A * 0 R
529\stoptyping
530
531If we replace these \type {*} by a number, there are big differences between the
532engines with respect to the time needed. This is summarized in the next table.
533\footnote {The numbers concern 30K hash creations. The time shown is the average
534over 30 runs.}
535
536\starttabulate[|c|c|c|l|]
537\FL
538\NC \bf \LUA\ 5.2 \NC \bf \LUAJIT\ 2.0 \NC \bf \LUAJIT\ 2.0+5.1 \NC \NR
539\FL
540\NC 0.096 \NC 0.046 \NC 0.047 \NC \ttx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
541\NC 0.054 \NC 6.017 \NC 0.055 \NC \ttx /Subtype /Link /Border [ 0 0 0 ] /F 4 /A * 0 R \NC \NR
542\LL
543\stoptabulate
544
545Especially the second case behaves bad in \LUAJIT. Say that a result comes out
546as:
547
548\starttyping
549/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 12 0 R
550/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 123 0 R
551/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 1234 0 R
552\stoptyping
553
554The \LUAJIT\ hasher (more or less) looks at the first~4, last~4, middle~4 and
555somewhere a quarter along the string, and uses these sequences for the
556calculation, so you can imagine that there are clashes. The \LUA\ 5.1 hasher runs
557over part of the string and sees more of the difference. The 5.2 hasher has a
558threshold and doesn't hash at all when the length exceeds (by default) 40
559characters, which is the case with the second string. Looking at only specific
560parts of a string is somewhat arbitrary and what works for one kind of
561application is not always good for another.
562
563After these tests we decided that it makes sense to replace the \LUAJIT\ hash
564calculation by the traditional \LUA\  one (or at least give users a choice at
565startup. The choice of hash is a runtime option:
566
567\starttyping
568mtxrunjit --script context --jithash=lua51    ......
569mtxrunjit --script context --jithash=luajit20 ......
570\stoptyping
571
572For the moment we default to the traditional \LUA\ 5.1 hashing method. Although
573it can behave real bad on some large strings we think that chances are low that
574this will happen in practice. An overall good performance on strings like the
575hyperlink examples is more important. Using the \LUA\ 5.2 method would be even
576better but it required a change in the virtual machine and that is not something
577we have in mind.
578
579\stopsection
580
581\stopchapter
582
583\stopcomponent
584
585% Luatex manual:
586%
587% In \LUA\ strings are hashed which makes a test for equality fast and in \LUATEX\
588% we benefit from that fact. Starting with \LUA\ 5.2 the hash function is no longer
589% hashing strings larger than (by default) 40 characters. Of these at most 32
590% characters are hashed in stock \LUA\ but for a string rich environment as \TEX\
591% this can lead to many collisions. Therefore we have now set that constant limit
592% to 64 characters (so in practice it's now 40 too).
593%
594% In \LUAJIT\ the hash function is not the same as in \LUA\ and can in some cases
595% lead to a significant slowdown. We ran into cases where a \LUAJITTEX\ run was 20
596% times slower than a normal \LUATEX\ run while normally such run is 30\% faster.
597% For this reason we have replaced the hash code with the \LUA\ 5.1 hash code. This
598% change is minimal and gives less collisions. The impact on speed can be neglected.
599%
600% For \LUAJITTEX\ you can control the hash method:
601%
602% \starttyping
603% --jithash=luajit
604% --jithash=lua51
605% \stoptyping
606%
607% The current status of the hash function is available in:
608%
609% \starttyping
610% status.list().luatex_hashtype
611% status.list().luatex_hashchars
612% \stoptyping
613%
614% The first one returns \type {lua}, \type{luajit} or \type {lua51} depending on
615% the engine. The second one should always return 6. If it returns 5 then you have
616% a non|-|optimized binary. Other values are suspicious.
617