\startcomponent abouthashing

\environment aboutenvironment

\usemodule[luahashing]

\startchapter[title={Lua strings}]

\startsection[title=Introduction]

In the crited project \footnote {This is a project by Thomas Schmitz, Alan
Braslau, Luigi Scarso and Hans Hagen funded by the Institut für Klassische und
Romanische Philologie, Universität Bonn.} we have to deal with large amounts of
data. The sources are in \TEI\ \XML\ and processed directly in \CONTEXT\ \MKIV,
and we have to filter content from different places in the \XML\ tree. Processing
relies on \LUA\ a lot because we use \LUA\ for dealing with the \XML. We're
talking about Latin and Greek texts, so the demand for extensive font processing
in \LUA\ is moderate. But as critical editions have lots of line-specific
referencing and notes, there are some more complex layout elements involved, and
again these use \LUA. There is also extensive use of bibliographies and it will
be no surprise that \LUA\ helps out here too. \footnote {One of the objectives
of the project is to update and enhance the bibliographic subsystem.}

One secondary objective is to be able to process the complex documents at a speed
of at least 20 pages per second on a modern 2014 workstation laptop. One way of
achieving this is to use \LUAJITTEX, which has a faster virtual \LUA\ machine.
However, we ran into several issues with the \LUAJIT\ interpreter, which is fully
compatible with the \LUA\ 5.1 language and partly with 5.2, but definitely has a
different low-level implementation. In the next sections I will discuss two
issues that Luigi and I ran into and for which we could come up with reasonable
workarounds.

\stopsection

\startsection[title=The stacks]

A \TEX\ job is normally a multipass experience. One run can produce information
that is used in a successive one. The reason is that something can happen on page
15 that influences the typesetting of page 9. There can even be a partial chain
reaction: the first time you typeset a document the table of contents (and the
pages it refers to) is not yet known, but information is saved that makes it
possible next time. That next run it gets included and takes, for instance, 4
pages. This means that all page numbers shift up. This in turn will trigger a new
run because all cross references might change too: two-digit page numbers can
become three digits, so paragraphs can run wider, and that again can trigger more
pages. Normally an initial three runs is enough, and with minor updates of the
source one or two runs suffice after that.

The multipass information is saved in tables in the so-called utility file and
loaded in the next run. Common subtables are shared in the process. In order to
determine if there have been crucial changes that demand an extra run, we have to
make sure that random order in these tables is eliminated. Normally we already
sort keys in tables when writing them to file, but some tables come out in the
order in which the traversing \type {next} function delivers them. In the more
recent 5.2 versions \LUA\ has added some randomness to the order in which hashed
tables are organized, so while in previous versions we could assume that for a
specific binary the order was the same each time, we cannot rely on that any
longer. This is not that important for normal cases, but we compare previous and
current versions of the utility file and pack shared tables in them as well,
which means that we are sensitive to a change in order. But this could be dealt
with at the cost of some extra sorting. \footnote {In \CONTEXT\ we also pack
font tables, which saves lots of memory and also some load time.}
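
The kind of order-independent serialization meant here can be sketched as
follows. This is a simplified sketch with an invented name, not the actual
\CONTEXT\ serializer, and it assumes a flat table with string keys and string
values:

```lua
-- collect the keys first, sort them, and only then serialize, so that the
-- output no longer depends on the (possibly randomized) hash order
local function sortedserialize(t)
    local keys = { }
    for k in next, t do
        keys[#keys+1] = k
    end
    table.sort(keys)
    local result = { }
    for i=1,#keys do
        local k = keys[i]
        result[#result+1] = string.format("[%q]=%q",k,t[k])
    end
    return "{" .. table.concat(result,",") .. "}"
end
```

Two runs now produce byte-identical output for the same table, which is what
comparing utility files relies on.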

Anyway, this kind of change in the \LUA\ machinery is harmless apart from taking
some time to adapt to it. It is also the reason why we cannot simply push a new
update of \LUA\ into \LUATEX: low-level changes can have a (yet unknown)
impact. Of course performance is the biggest issue here: we don't want a slower
\LUATEX.

In the past we already reported on the benefits of \LUAJITTEX, especially its
faster virtual machine. We don't benefit from jitting; on the contrary, it slows
us down. One reason is that we cross the \LUA||\CCODE\ boundary often and hardly
use any of the optimized functions. Part of the speed is achieved by a different
implementation deep down, and one aspect of that is a different virtual machine
instruction set. While \LUA\ can go really big in terms of memory and table
construction, \LUAJIT\ limits us to at most 2G of memory and poses some 64K
limitations on functions and table constructors. The memory is not so much the
issue in the crited project, but the (nested) table constructor is. When we have
a few tens of thousands of cross references, index entries and/or list entries,
we simply cannot load the multipass data. A few days of playing with splitting up
nested tables didn't help much: it made the code look horrible, and eventually we
again ran into a maximum of 64K someplace, as a \type {dofile} effectively makes
a function that gets run, and \LUAJIT\ doesn't like that size. For the record: we
don't have such issues with large font tables, probably because they are just one
big table. The reason why we cannot use that approach is that serializing the
potentially very large tables in the utility file also has limitations.

Eventually this could be solved by assuming only forward referencing for certain
registers. That way we only used the index entries collected in memory during the
run, and as long as we don't put a register before its entries are defined we're
okay. So here we have a typical case where one can set an option to circumvent
an engine limitation. \footnote {A decade ago similar tricks had to be used to
support hundreds of thousands of hyperlinks in \TEX\ engines with, at that time,
limited memory capabilities.} Explaining this in a user manual is a challenge,
because an error message like the following is not that helpful:

\starttyping
main function has more than 65536 constants
\stoptyping
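
A possible workaround, sketched here with invented names and not the code that
\CONTEXT\ actually uses, is to compile the data in many small chunks, so that no
single function collects tens of thousands of constants:

```lua
-- lua 5.1 calls it loadstring, later versions just load
local loadstring = loadstring or load

-- build the table in chunks: each chunk compiles to its own (small)
-- function, so each one stays well below the per-function constant limit
local function loadchunked(entries,chunksize)
    chunksize = chunksize or 1000
    local t = { }
    for i=1,#entries,chunksize do
        local code = { "return function(t)" }
        for j=i,math.min(i+chunksize-1,#entries) do
            code[#code+1] = string.format("t[%i]=%q",j,entries[j])
        end
        code[#code+1] = "end"
        local chunk = assert(loadstring(table.concat(code,"\n")))()
        chunk(t)
    end
    return t
end
```

Of course such indirection adds overhead of its own, which is one reason why we
preferred the forward-referencing option mentioned above.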

But once we could generate these indices again by posing some limitations,
\LUAJITTEX\ had other issues. This time we got excessive runtime, and we spent
quite some time sorting that one out. More on that in the next section.

\stopsection

\startsection[title=Hashing]

One of the reasons why (text processing with) \LUA\ is rather fast is that it
hashes its strings so that a test for equality is really fast. This means that
for each string that enters \LUA\ a hash value is calculated, and that hash is
used in comparisons. Of course hashing takes time, but especially when you work
with lots of tables, the advantage of a simple hash compare outweighs this
one-time hashing. On the other hand, if you work with files and process lines,
and maybe split these into words, you might end up with a lot of unneeded
hashing. But in \LUATEX, and therefore \MKIV, we benefit from hashing a lot. In
\LUA\ 5.2 the hash function was adapted so that only strings up to the (default)
40 characters get hashed. In practice we're not affected much by this, as most
keywords we use are shorter than this boundary. And in \CONTEXT\ we do quite
some keyword checking.
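
Such keyword checking boils down to hashed table lookups; a minimal sketch (with
invented names, not actual \CONTEXT\ code):

```lua
-- membership testing: the key string is hashed once, after which the
-- lookup and the equality check are cheap
local keywords = {
    ["left"]   = true,
    ["right"]  = true,
    ["middle"] = true,
}

local function iskeyword(s)
    return keywords[s] == true
end
```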

So, when we were conducting tests with these large registers, we were surprised
that \LUAJITTEX\ performed significantly slower (ten times or more) than stock
\LUATEX, while until then we had observed that a \LUAJITTEX\ run was normally
some 20 to 40\% faster.

The first impression was that it related to the large amount of strings that are
written from \LUA\ to \TEX. After index entries are collected, they are sorted
and the index is flushed to \TEX. This happens in one go, and the \TEX\ code ends
up in the \TEX\ input stack. Some actions are delayed and create callbacks to
\LUA, so some wrapping in functions happens too. That means that some (\LUA)
strings are only freed later on, but that proved not to be the main problem.

When the entries are typeset, an interactive cross reference is kept track of,
and these references exist till the document is closed and the referencing
information is written to the \PDF\ file. Of course we could tweak this, but
once you start along that path there is no end to writing ugly hacks.

Eventually we found that the slowdown relates to hashing, not the first area
where you look. Why is this? The specific register concerned lots of small Greek
words, pointing to locations in a text, where locations looked like \type
{1.2.3}. In case you wonder why Greek is mentioned: in multibyte \UTF\ sequences
there is a lot of repetition:

\startluacode
local byte = string.byte
function sample(s)
    context.NC() context(s)
    context.NC() context.ttx(false)
    for b in string.utfvalues(s) do
        context("%02X ",b)
    end
    context.NC() context.ttx(false)
    for b in string.gmatch(s,".") do
        context("%02X ",byte(b))
    end
    context.NC() context.NR()
end

context.starttabulate { "||||" }
context.FL()
context.NC() context.bold("word")
context.NC() context.bold("unicode")
context.NC() context.bold("bytes")
context.NC() context.NR()
context.FL()
sample("βίον")
sample("βίου")
sample("βιοὺς")
sample("βουλὴν")
sample("βουλῆς")
context.LL()
context.stoptabulate()
\stopluacode

When cross referencing these index entries with their origin, you end up with
reference identifiers like \type {foo:1.2.3} or, because \CONTEXT\ has automated
internal references (which are rather efficient in the resulting \PDF), we get
\type {aut:1}, \type {aut:2}, up to in this case some 30,000 of them.

The problem with hashing is as follows. When we write commands to \TEX\ or use
data with a repetitive property, the similarity of these strings can be hard on
the hasher, as it can produce similar hash keys, in which case collisions need
to be dealt with. I'm no expert on hashing, but looking at the code shows that
in \LUAJIT\ (at least in the version we're talking about) the string is seen as
chunks of 4 bytes. The first, last, middle and halfway-middle chunks are
consulted, and after some bit juggling we get a hash value. In the case of
strings like the following it is clear that the beginning and end look quite the
same:

\starttyping
foo:000001 foo:010001 foo:100001
\stoptyping

or:

\starttyping
foo:1.2.12 foo:1.3.12 foo:1.4.12 foo:1.5.12
\stoptyping
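
The effect of sampling only a few fixed positions can be demonstrated with a
deliberately crude stand-in. This is not the actual \LUAJIT\ algorithm, just an
illustration of the principle:

```lua
local byte, floor = string.byte, math.floor

-- look only at the first, middle and last byte plus the length; strings
-- that differ anywhere else collide, just as similar strings can collide
-- in a hasher that samples a few chunks
local function sampled(s)
    local len = #s
    return byte(s,1) * 65599 + byte(s,floor(len/2)+1) * 257 + byte(s,len) + len
end
```

Here \type {sampled("foo:1.2.12")} and \type {sampled("foo:1.3.12")} come out
equal, because the byte in which they differ is never looked at.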

It seems that the method of hashing used is somewhat arbitrary and maybe tuned
for specific applications. In order to see what the impact of hashing quite
similar strings is, some experiments were conducted: with \LUATEX\ 0.73 using
\LUA\ 5.2 hashing, with \LUAJITTEX\ 0.73, and with the same \LUAJITTEX\ but
using the hash variant of native \LUA\ 5.1. For each variant we ran tests where
strings of increasing length were combined with a number (running from one to
one million).

\starttabulate[]
\NC none \NC <number> \NC \NR
\NC right \NC <string> <number> \NC \NR
\NC left \NC <number> <string> \NC \NR
\NC center \NC <string> <number> <string> \NC \NR
\NC edges \NC <number> <string> <number> \NC \NR
\stoptabulate

The differences between engines can be seen in the tables on the following
pages. In the fourth table we summarize which engine performs best. Keep in mind
that \LUAJITTEX\ has the benefit of the faster virtual machine, so it has an
additional speed advantage.

We show three tables with measurements. The \type {none} column shows the
baseline of the test:

\starttyping
local t = { }
for i=1,1000000 do
    t[i] = i
end
\stoptyping

The column tagged \quote {right} does this:

\starttyping
local t = { }
for i=1,1000000 do
    t[i] = text .. i
end
\stoptyping

And \quote {left} does:

\starttyping
local t = { }
for i=1,1000000 do
    t[i] = i .. text
end
\stoptyping

That leaves \quote {center}:

\starttyping
local t = { }
for i=1,1000000 do
    t[i] = text .. i .. text
end
\stoptyping

and \quote {edges}:

\starttyping
local t = { }
for i=1,1000000 do
    t[i] = i .. text .. i
end
\stoptyping

Of course there is also the loop and the concatenation involved, so the last two
variants have some more overhead. We show some measurements in \in {tables}
[tab:torture1], \in [tab:torture2] \in {and} [tab:torture3]. So, there we have
strings like:

\starttyping
2abc
222abc
22222abc
abc222222
222222abc222222
222222abc222222
abc2222abc
\stoptyping

and so on. Of course a million such strings doesn't make much sense in practice,
but it serves our purpose of testing.

\startplacetable[reference=tab:torture1,location=page,title=\type{context test.tex}]
    \scale
        [height=\the\dimexpr\textheight-3\lineheight\relax]
        {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luatex-073-LUA52.lua" }}}}
\stopplacetable

\startplacetable[reference=tab:torture2,location=page,title=\type{context --jit --jithash=luajit20 test.tex}]
    \scale
        [height=\the\dimexpr\textheight-3\lineheight\relax]
        {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-JIT20.lua" }}}}
\stopplacetable

\startplacetable[reference=tab:torture3,location=page,title=\type{context --jit --jithash=lua51 test.tex}]
    \scale
        [height=\the\dimexpr\textheight-3\lineheight\relax]
        {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-LUA51.lua" }}}}
\stopplacetable

In these tables you can see some extremes. On average \LUA\ 5.2 performs
quite okay, as does standard \LUAJIT. However, when we bring the 5.1 hash variant
into \LUAJITTEX\ we get a more predictable average performance, as it deals
better with some of the extreme cases that make \LUAJITTEX\ crawl compared to
\LUATEX. We have done more tests, and it is interesting to see that in the 5.1
(and derived 5.2) method there are sometimes cases where odd lengths perform
much worse than even lengths. Red values are larger than two times the average,
blue values larger than the average, while green values indicate a less than
half average value.

In \in {table} [tab:compare1] we show which method performs best relative to the
others. Of course in many applications there will be no such extreme cases, but
we happened to run into them. But even if \type {JIT20} is a winner in most
cases, the fact that it has extremely slow exceptions makes it a bit of a
gamble.

\startplacetable[location=page,reference=tab:compare1,title=The best performances per engine and hasher.]
    \startcombination
        \startcontent
            \scale
                [height=\the\dimexpr\textheight-4\lineheight\relax]
                {\vbox{\ctxlua{moduledata.luatests.showhashing {
                    fileset = {
                        { tag = "JIT20", filename = "luatest-hash-luajittex-073-JIT20.lua" },
                        { tag = "JIT51", filename = "luatest-hash-luajittex-073-LUA51.lua" },
                    } } }}}
        \stopcontent
        \startcaption
            \LUAJITTEX\ only
        \stopcaption
        \startcontent
            \scale
                [height=\the\dimexpr\textheight-4\lineheight\relax]
                {\vbox{\ctxlua{moduledata.luatests.showhashing {
                    fileset = {
                        { tag = "LUA52", filename = "luatest-hash-luatex-073-LUA52.lua" },
                        { tag = "JIT20", filename = "luatest-hash-luajittex-073-JIT20.lua" },
                        { tag = "JIT51", filename = "luatest-hash-luajittex-073-LUA51.lua" },
                    } } }}}
        \stopcontent
        \startcaption
            Both engines.
        \stopcaption
    \stopcombination
\stopplacetable

The 5.1 hasher runs over the string with a step that depends on the length of
the string. We've seen that 5.2 doesn't hash strings larger than 40 characters.
The step is calculated by shifting the length (by default) over 5 bits. This
means that for strings of size 32 and more the step becomes 2, which is why we
see this odd-even timing issue in the tables. Basically we hash at most 32
characters of the 40. The next table shows that the fewer characters we take
into account (first column), the fewer unique keys we get (second column).
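
The step and the resulting number of consulted characters can be computed
directly. A small sketch, with invented helper names:

```lua
local floor = math.floor

-- the 5.1 scheme: shift the length right (by default over 5 bits) and add
-- one; the hash loop then runs from the end of the string downwards with
-- that step
local function hashstep(len,shift)
    return floor(len / 2^(shift or 5)) + 1
end

-- how many characters actually take part in the hash
local function consulted(len,shift)
    local step = hashstep(len,shift)
    return floor((len-1)/step) + 1
end
```

So for a string of 31 characters every character is consulted, while for
lengths 32 up to 63 the step is 2 and at most 32 characters are looked at.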

\starttabulate[crl]
\FL
\NC \bf n \NC \bf unique \NC \bf text \NC \NR
\FL
\NC 3 \NC 22 \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
\NC 3 \NC 31 \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
\NC 4 \NC 43 \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
\NC 4 \NC 51 \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
\NC 5 \NC 410 \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
\NC 5 \NC 210 \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
\NC 6 \NC 29947 \NC \tt\tx /Border [ 0 0 0 ] /F 4 /Subtype /Link /A * 0 R \NC \NR
\NC 6 \NC 29823 \NC \tt\tx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
\LL
\stoptabulate

In the next table we show a few cases. The characters that are taken into
account are colored red. \footnote {Again the first column indicates the shift
applied to the length in order to determine the step.}

\starttabulate[cll]
\FL
\NC \bf n \NC \bf text \NC \bf consulted \NC \NR
\FL
\NC 3\NC \tt\tx << D [ 8 0 R Fit ] S GoTo >> \NC \tt\tx <{\darkred <} {\darkred D} [{\darkred \space }8 {\darkred 0} R{\darkred \space }F{\darkred i}t {\darkred ]} {\darkred S} {\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
\NC 3\NC \tt\tx << D [ 9 0 R Fit ] S GoTo >> \NC \tt\tx <{\darkred <} {\darkred D} [{\darkred \space }9 {\darkred 0} R{\darkred \space }F{\darkred i}t {\darkred ]} {\darkred S} {\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
\NC 3\NC \tt\tx << D [ 10 0 R Fit ] S GoTo >> \NC \tt\tx <<{\darkred \space }D{\darkred \space}[ {\darkred 1}0 {\darkred 0} R{\darkred \space }F{\darkred i}t {\darkred ]} {\darkred S} {\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
\NC 3\NC \tt\tx << D [ 11 0 R Fit ] S GoTo >> \NC \tt\tx <<{\darkred \space }D{\darkred \space}[ {\darkred 1}1 {\darkred 0} R{\darkred \space }F{\darkred i}t {\darkred ]} {\darkred S} {\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
\NC 3\NC \tt\tx << D [ 12 0 R Fit ] S GoTo >> \NC \tt\tx <<{\darkred \space }D{\darkred \space}[ {\darkred 1}2 {\darkred 0} R{\darkred \space }F{\darkred i}t {\darkred ]} {\darkred S} {\darkred G}oT{\darkred o} >{\darkred >} \NC \NR
\ML
\NC 4\NC \tt\tx << D [ 8 0 R Fit ] S GoTo >> \NC \tt\tx <{\darkred <} {\darkred }D{\darkred \space }[{\darkred \space }8{\darkred \space }0{\darkred \space }R{\darkred \space }{\darkred F}i{\darkred t} {\darkred ]} {\darkred }S{\darkred \space }{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
\NC 4\NC \tt\tx << D [ 9 0 R Fit ] S GoTo >> \NC \tt\tx <{\darkred <} {\darkred }D{\darkred \space }[{\darkred \space }9{\darkred \space }0{\darkred \space }R{\darkred \space }{\darkred F}i{\darkred t} {\darkred ]} {\darkred }S{\darkred \space }{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
\NC 4\NC \tt\tx << D [ 10 0 R Fit ] S GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}{\darkred D} {\darkred [} {\darkred 1}0{\darkred \space }0{\darkred \space }R{\darkred \space }{\darkred F}i{\darkred t} {\darkred ]} {\darkred }S{\darkred \space }{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
\NC 4\NC \tt\tx << D [ 11 0 R Fit ] S GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}{\darkred D} {\darkred [} {\darkred 1}1{\darkred \space }0{\darkred \space }R{\darkred \space }{\darkred F}i{\darkred t} {\darkred ]} {\darkred }S{\darkred \space }{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
\NC 4\NC \tt\tx << D [ 12 0 R Fit ] S GoTo >> \NC \tt\tx {\darkred <}<{\darkred \space}{\darkred D} {\darkred [} {\darkred 1}2{\darkred \space }0{\darkred \space }R{\darkred \space }{\darkred F}i{\darkred t} {\darkred ]} {\darkred }S{\darkred \space }{\darkred G}o{\darkred T}o{\darkred \space }>{\darkred >} \NC \NR
\LL
\stoptabulate

Of course, in practice, in \LUA\ 5.2 the longer string exceeds 40 characters, so
it is never hashed anyway. Apart from this maximum, the \LUA\ hash code looks
like this:

\starttyping
/* Lua will use at most (2^LUAI_HASHLIMIT) bytes from
   a string to compute its hash */
...
h = cast(unsigned int,len) ;
step = (len>>LUAI_HASHLIMIT) + 1 ;
for (l1=len; l1>=step; l1-=step) {
    h = h ^ ((h<<5) + (h>>2) + cast(unsigned char,str[l1-1])) ;
}
...
\stoptyping

This translates into the following verbose \LUA\ function:

\starttyping
function string.luahash(str,shift)
    local len  = #str
    local hash = len
    local step = bit32.rshift(len,shift or 5) + 1
    for i=len,1,-step do
        hash = bit32.bxor(hash, (
            bit32.lshift(hash,5) +
            bit32.rshift(hash,2) +
            string.byte(string.sub(str,i,i))
        ) )
    end
    return hash
end
\stoptyping
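
The same function can be written without \type {bit32}, which makes it easy to
experiment with on any \LUA\ version. This is a portable approximation with an
invented name; the exclusive or is done in plain arithmetic:

```lua
local floor = math.floor

-- plain arithmetic exclusive or, so that we do not depend on bit32
local function bxor(a,b)
    local result, bit = 0, 1
    while a > 0 or b > 0 do
        local abit, bbit = a % 2, b % 2
        if abit ~= bbit then
            result = result + bit
        end
        a, b, bit = (a-abit)/2, (b-bbit)/2, bit*2
    end
    return result
end

-- the 5.1 scheme: hash from the end of the string with a length
-- dependent step, keeping the state within 32 bits
local function luahash(str,shift)
    local len  = #str
    local hash = len
    local step = floor(len / 2^(shift or 5)) + 1
    for i=len,1,-step do
        hash = bxor(hash,(hash*32 + floor(hash/4) + str:byte(i)) % 2^32)
    end
    return hash
end
```

Strings that differ in just one consulted byte already get different hashes,
for instance \type {luahash("ab")} versus \type {luahash("ba")}.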

The reader can argue that the following string would perform better:

\starttyping
/Subtype/Link/Border[0 0 0]/F 4/A 12 0 R
\stoptyping

but this is not the case. Also, here we use \PDF\ code, but similar cases can
happen if we flush \TEX\ commands:

\starttyping
\dothisorthat{1}
\dothisorthat{101}
\dothisorthat{10101}
\stoptyping

And in the case of \UTF\ strings, it remains a fact that when characters need
two bytes, a sequence can end up with each odd or even byte being the same. This
is one more reason to support up to 64 byte (or 40 in practice) hashing.

Because of this we decided to experiment with a value of 64 instead. \footnote
{Of course, in \LUATEX, the length limit kicks in before we get to 64.} We can
do the same when we use the \LUA\ 5.1 method in \LUAJIT. In \in {table}
[tab:torture4] \in {and} [tab:torture5] we show the timings. It is interesting
that the extremes are gone now. The performance of the default settings is
compared with the higher values in \in {table} [tab:compare2]. Of course the
numbers are just indications and there might be small differences between test
runs. Therefore we use a threshold of 5\% when we compare two methods.

\startplacetable[reference=tab:torture4,location=page,title={\type{context test.tex} with len<=40 and hash<=64}]
    \scale
        [height=\the\dimexpr\textheight-3\lineheight\relax]
        {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luatex-073-LUA52-40-6.lua" }}}}
\stopplacetable

\startplacetable[reference=tab:torture5,location=page,title={\type{context --jit test.tex} with hash<=64}]
    \scale
        [height=\the\dimexpr\textheight-3\lineheight\relax]
        {\vbox{\ctxlua{moduledata.luatests.showhashing { filename = "luatest-hash-luajittex-073-LUA51-40-6.lua" }}}}
\stopplacetable

\startplacetable[location=page,reference=tab:compare2,title=More than 5\% difference between 32 byte or 64 byte hashing.]
    \startcombination
        \startcontent
            \scale
                [height=\the\dimexpr\textheight-4\lineheight\relax]
                {\vbox{\ctxlua{moduledata.luatests.showhashing {
                    fileset = {
                        { tag = "40 / 32", filename = "luatest-hash-luatex-073-LUA52.lua" },
                        { tag = "40 / 64", filename = "luatest-hash-luatex-073-LUA52-40-6.lua" },
                    } } }}}
        \stopcontent
        \startcaption
            \LUATEX\ (size limit 40)
        \stopcaption
        \startcontent
            \scale
                [height=\the\dimexpr\textheight-4\lineheight\relax]
                {\vbox{\ctxlua{moduledata.luatests.showhashing {
                    fileset = {
                        { tag = "40 / 32", filename = "luatest-hash-luajittex-073-LUA51.lua" },
                        { tag = "40 / 64", filename = "luatest-hash-luajittex-073-LUA51-40-6.lua" },
                    } } }}}
        \stopcontent
        \startcaption
            \LUAJITTEX\ (no size limit)
        \stopcaption
    \stopcombination
\stopplacetable

So how does this affect us in document production? It is not that hard to get a
processing rate of a few dozen pages per second on a modern machine, even with
somewhat complex documents, where \XML\ turns into \PDF. However, interactivity
comes with a price when we use \LUAJITTEX. In \CONTEXT\ \MKIV\ we do all
\PDF\ annotations in \LUA\ and that involves assembling dictionaries. Here are
two examples, a destination:

\starttyping
<< /D [ 15 0 R /Fit ] /S /GoTo >>
\stoptyping

and a reference:

\starttyping
/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 16 0 R
\stoptyping

These strings are built with small variations and at some point end up in the
\PDF\ file. The same string can end up in the file several times, although
sometimes we can create a reusable object. In the latter case we keep them at
the \LUA\ end as a reference to such a shareable object, a key in an object
reference hash. Now imagine that we have some 30K of such references and/or
destinations, which indeed happens in crited documents. In the next two lines we
use a \type {*} to show where the differences are:

\starttyping
<< /D [ * 0 R /Fit ] /S /GoTo >>
/Subtype /Link /Border [ 0 0 0 ] /F 4 /A * 0 R
\stoptyping

If we replace these \type {*} by a number, there are big differences between the
engines with respect to the time needed. This is summarized in the next table.
\footnote {The numbers concern 30K hash creations. The time shown is the average
over 30 runs.}

\starttabulate[cccl]
\FL
\NC \bf \LUA\ 5.2 \NC \bf \LUAJIT\ 2.0 \NC \bf \LUAJIT\ 2.0+5.1 \NC \NR
\FL
\NC 0.096 \NC 0.046 \NC 0.047 \NC \ttx << /D [ * 0 R /Fit ] /S /GoTo >> \NC \NR
\NC 0.054 \NC 6.017 \NC 0.055 \NC \ttx /Subtype /Link /Border [ 0 0 0 ] /F 4 /A * 0 R \NC \NR
\LL
\stoptabulate
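
The kind of micro-benchmark behind such numbers can be sketched as follows (a
simplified sketch with invented names, using \type {foo:}-style keys rather
than the annotation strings):

```lua
-- build n similar strings and use them as table keys: every key gets
-- hashed once, so this mostly measures hashing plus table insertion
local function bench(prefix,suffix,n)
    local t = { }
    local start = os.clock()
    for i=1,n do
        t[prefix .. i .. suffix] = true
    end
    local elapsed = os.clock() - start
    local count = 0
    for _ in next, t do
        count = count + 1
    end
    return elapsed, count
end
```

Running something like \type {bench("foo:",".12",30000)} under the different
hash variants makes the differences visible.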

Especially the second case behaves badly in \LUAJIT. Say that a result comes out
as:

\starttyping
/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 12 0 R
/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 123 0 R
/Subtype /Link /Border [ 0 0 0 ] /F 4 /A 1234 0 R
\stoptyping

The \LUAJIT\ hasher (more or less) looks at the first 4 bytes, the last 4, the
middle 4, and 4 bytes somewhere a quarter along the string, and uses these
sequences for the calculation, so you can imagine that there are clashes. The
\LUA\ 5.1 hasher runs over part of the string and sees more of the difference.
The 5.2 hasher has a threshold and doesn't hash at all when the length exceeds
(by default) 40 characters, which is the case with the second string. Looking at
only specific parts of a string is somewhat arbitrary, and what works for one
kind of application is not always good for another.

After these tests we decided that it makes sense to replace the \LUAJIT\ hash
calculation by the traditional \LUA\ one (or at least give users a choice at
startup). The choice of hash is a runtime option:

\starttyping
mtxrunjit --script context --jithash=lua51 ......
mtxrunjit --script context --jithash=luajit20 ......
\stoptyping

For the moment we default to the traditional \LUA\ 5.1 hashing method. Although
it can behave really badly on some large strings, we think that chances are low
that this will happen in practice. An overall good performance on strings like
the hyperlink examples is more important. Using the \LUA\ 5.2 method would be
even better, but it would require a change in the virtual machine and that is
not something we have in mind.

\stopsection

\stopchapter

\stopcomponent