lowlevel-balancing.tex /size: 36 Kb    last modification: 2025-02-21 11:03
1% language=us runpath=texruns:manuals/lowlevel
2
3\environment lowlevel-style
4
5\startdocument
6  [title=balancing,
7   color=middlecyan]
8
9\startsectionlevel[title=Introduction]
10
{\em This is work in progress: as per end 2024 these mechanisms are still in
flux. We expect them to be stable around the \CONTEXT\ meeting in 2025. The text
is not corrected, so feel free to comment.}
14
15This manual is about a new (sort of fundamental) feature that got added to
16\LUAMETATEX\ when we started upgrading column sets. In \TEX\ we have a par
17builder that does a multi|-|pass optimization where it considers various
18solutions based on tolerance, penalties, demerits etc. The page builder on the
19other hand is forward looking and backtracks to a previous break when there is an
20overflow. The balancing mechanism discussed here is basically a page builder
21operating like the par builder: it looks at the whole picture.
22
23In order to make this a useful mechanism the engine also permits intercepting the
24main vertical list, so we start by introducing this.
25
26\stopsectionlevel
27
28\startsectionlevel[title=Intercepting the MVL]
29
30When content gets processed it's added to a list. We can be in horizontal mode or
31vertical mode (let's forget about math mode). In vertical mode we can be in a box
32context (say \type {\vbox}) or in what is called the main vertical list: the one
that makes the page. But what is a page? When \TEX\ has collected enough to match
the criteria set by \type {\pagegoal}, which starts out as \type {\vsize}, it will
call the so called output routine, which basically expands the \type
{\output} token list. That routine has to do something with the box that has the
37collected material. It can become a page, likely with the content wrapped in a
38page body with headers and footers and such, but it can also be stored for later
39assembly, for instance in multiple columns, or after some analysis fed back into
40the main vertical list.
41
42For various mechanisms it matters if they are used inside a contained boxed
43environment or in the more liberal main vertical list (from now on called mvl).
44That's why we can intercept the mvl and use it later. Intercepting works as
45follows:
46
47\starttyping
48\beginmvl 1
49various content
50\endmvl
51
52\beginmvl 2
53various content
54\endmvl
55\stoptyping
56
57When at some point you want this content, you can do this:
58
59\starttyping
60\setbox\scratchboxone\flushmvl 2
61\setbox\scratchboxtwo\flushmvl 1
62\stoptyping
63
64and then do whatever is needed. You can see what goes on with:
65
66\starttyping
67\tracingmvl 1
68\stoptyping
69
70There is not much more to say other than that this is the way to operate on
71content as if it were added to the page which can be different from collecting
72something in a vertical box. Think of various callbacks that can differ for the
73mvl and a box.
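
As a minimal sketch of the round trip, assuming the primitives behave as
described above, the following captures a sample text in stream 1 and later
flushes it into a box that can be inspected before it is used. The sample file
and the status message are only there for illustration.

\starttyping
\tracingmvl 1

\beginmvl 1
    \samplefile{tufte}
\endmvl

\setbox\scratchboxone\flushmvl 1

\ifvoid\scratchboxone
    % nothing was captured
\else
    \writestatus{mvl}{captured height: \the\ht\scratchboxone}
    \box\scratchboxone
\fi
\stoptyping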
74
75The \type {\beginmvl} primitive takes a number or a set of keywords, as in:
76
77\starttyping
78\beginmvl
79    index   1
80    options \numexpr "01 + "04\relax
81\relax
82\stoptyping
83
There is of course some possible interference with mechanisms that check the page
85properties like \type {\pagegoal}. If needed one can check this:
86
87\starttyping
88\ifcase\mvlcurrentlyactive
89  % main mvl
90\or
91  % first one
92\else
93  % other ones
94\fi
95\stoptyping
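
A macro that reports page properties could use this test to avoid showing
values that make no sense inside an intercepted list. The following is only a
sketch: the name \type {\ShowPageGoal} is made up for this example and is not
part of \CONTEXT.

\starttyping
\def\ShowPageGoal
  {\ifcase\mvlcurrentlyactive
     \the\pagegoal % main mvl: the page properties are meaningful
   \else
     (intercepted) % inside \beginmvl ... \endmvl
   \fi}
\stoptyping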
96
97Possible applications of this mechanism are the mentioned columns and parallel,
98independent, streams. However for that we need to be able to manipulate the
99collected content. Actually, the next manipulator preceded the capturing, because
100we first wanted to make sure that what we had in mind made sense.
101
The \type {\beginmvl} primitive also accepts keywords. You can specify an \type {index} (an
integer), a \type {prevdepth} (a dimension) and \type {options} (an integer
bitset). Possible option bit related values are:
105
106\starttabulate[|Tr|||]
107\NC 0x\tohexadecimal\ignoreprevdepthmvloptioncode \NC ignore prevdepth \NC \type {\ignoreprevdepthmvloptioncode} \NC \NR
108\NC 0x\tohexadecimal\noprevdepthmvloptioncode     \NC no prevdepth     \NC \type {\noprevdepthmvloptioncode    } \NC \NR
109\NC 0x\tohexadecimal\discardtopmvloptioncode      \NC discard top      \NC \type {\discardtopmvloptioncode     } \NC \NR
110\NC 0x\tohexadecimal\discardbottommvloptioncode   \NC discard bottom   \NC \type {\discardbottommvloptioncode  } \NC \NR
111\stoptabulate
112
Here the last column is a numeric alias available in \CONTEXT. More options are
likely to show up. When we eventually balance these lists the routine will
deal with the discardables (like glue) but one can also remove them via the
options.
117
118\startbuffer
119\beginmvl
120    index     1
121    prevdepth 0pt
122    options  \discardtopmvloptioncode
123\relax
124\scratchdimen\prevdepth
125\dontleavehmode
126\quad\the\mvlcurrentlyactive\quad\the\scratchdimen
127\quad\blackrule[height=\strutht,depth=\strutdp,color=darkred]
128\endmvl
129
130\ruledhbox {\llap{1\quad}\flushmvl 1}
131\stopbuffer
132
133\typebuffer \start \showmakeup[line] \getbuffer \stop
134
135\startbuffer
136\beginmvl
137    index  2
138    options \numexpr
139                \ignoreprevdepthmvloptioncode
140              + \discardtopmvloptioncode
141            \relax
142\relax
143\scratchdimen\prevdepth
144\dontleavehmode
145\quad\the\mvlcurrentlyactive\quad\the\scratchdimen
146\quad\blackrule[height=\strutht,depth=\strutdp,color=darkred]
147\endmvl
148
149\ruledhbox {\llap{2\quad}\flushmvl 2}
150\stopbuffer
151
152\typebuffer \start \showmakeup[line] \getbuffer \stop
153
154\startbuffer
155\beginmvl 3 % when no keywords are used we expect a number
156\scratchdimen\prevdepth
157\dontleavehmode
158\quad\the\mvlcurrentlyactive\quad\the\scratchdimen
159\quad\blackrule[height=\strutht,depth=\strutdp,color=darkred]
160\endmvl
161
162\ruledhbox {\llap{3\quad}\flushmvl 3}
163\stopbuffer
164
165\typebuffer \start \showmakeup[line] \getbuffer \stop
166
167\startbuffer
168\beginmvl index 4 options 1
169\scratchdimen\prevdepth
170\dontleavehmode
171\quad\the\mvlcurrentlyactive\quad\the\scratchdimen
172\quad\blackrule[height=\strutht,depth=\strutdp,color=darkred]
173\endmvl
174
175\ruledhbox {\llap{4\quad}\flushmvl 4}
176\stopbuffer
177
178\typebuffer \start \showmakeup[line] \getbuffer \stop
179
180\stopsectionlevel
181
182\startsectionlevel[title=Balancing]
183
Balancing here does not refer to balancing columns but to \quote {a result that
looks well balanced}. Just like we want lines in a paragraph to look consistent
with each other, something that is reflected in the (adjacent) demerits, we want
the same for the pieces of a vertical split. For this purpose we took elements of
the par builder to construct a (page) snippet builder. Here are some highlights:
189
190\startitemize
191
192\startitem
193    Instead of a pretolerance, tolerance and emergency pass we only enable the
194    last two. In the par builder the pretolerance pass is the one without
195    hyphenation.
196\stopitem
197
198\startitem
199    We seriously considered vertical discretionaries but eventually rejected the
200    idea: we just don't expect users to go through the trouble of adding lots of
201    split related pre, post and replace content. It's not hard to support it but
202    in the end it also interfered with other demands that we had. We kept the
203    code around for a while but then removed it. To mention one complication: if
204    we add some new node we also need to intercept it in various callbacks that
205    we already have in place in \CONTEXT. As with horizontal discretionaries, we
    then need to go into the components and sometimes even need to make decisions
    that can not yet be made.
208\stopitem
209
210\startitem
    As with the par builder, \TEX\ will happily produce an overfull box when no
    solution is possible that fits the constraints. In a paragraph there are
    plenty of spaces (with stretch) and discretionaries (with components that vary
    in width), which enlarges the solution space. In vertical material there is
    less room to maneuver, so there an emergency pass really makes sense: better be
    underfull than overfull.
217\stopitem
218
219\startitem
220    In many cases there is no stretch available. There are also widow, club,
221    shape and orphan penalties that can limit the solution space.
222\stopitem
223
224\startitem
225    When we look at splitting pages (and boxes) we see (split) top skip kick in.
    This is something that we need to provide one way or the other. And as we
227    have to do that, we can as well provide support for bottom skip. A horizontal
228    analogue is protrusion, something that also has to be taken into account in a
229    rather dynamic way, at the beginning or end of the currently analyzed line.
230\stopitem
231
232\startitem
233    There is no equivalent of hanging indentation but a shape makes sense. Here
234    the shape defines heights, top and bottom skips and maybe more in the future.
235    For that reason we use a keyword driven shape.
236\stopitem
237
238\startitem
239    Because we have so called par passes, it made sense to have something similar
    for balancing. This gives us the opportunity to experiment with various
241    variables that drive the process.
242\stopitem
243
244\startitem
    For those who read what we wrote about the par builder, it will not come as
    a surprise that we also added extensive tracing and a callback for intercepting
247    the results. This makes it possible to show the same detailed output as we
248    can do for par passes.
249\stopitem
250
251\stopitemize
252
253It's about time for some examples but before we come to that it is good to
254roughly explain how the page builder works. When the page builder is triggered it
255will take elements from the contributions list and add them to the page. When
256doing that it keeps track of the height and depth as contributed by boxes and
257rules. Because it will discard glue and kerns it does some checking there. An
258important feature is that the depth is added in a next iteration. The routine
259also needs to look at inserts. The variables \type {\pagegoal} (original \type
260{\vsize} minus accumulated insert heights) and \type {\pagetotal} are compared
261and when we run over the target height the accumulated stretch and shrink in glue
262(when present) will be used to determine how bad this break is. If it is too bad,
263the previous best break will be taken. Penalties can make a possible break more
or less attractive. When the output routine gets a split off page, the total is
265not reliable because we can have backtracked to the previous break. In
266\LUAMETATEX\ we have some more variables, like \type {\pagelastheight}, that give
267a better estimate of what we got.
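
As a reminder of what \quote {how bad} means here: in traditional \TEX\ the
badness of a page is roughly a cubic function of the ratio between the space that
has to be bridged and the stretch (or shrink) that is available. With goal $g$,
accumulated total $t$ and available stretch or shrink $s$ this is approximately
as follows, where we assume that \LUAMETATEX\ follows the same model at this
level:

\startformula
    b \approx 100 \left( \frac{\vert g - t \vert}{s} \right)^3
\stopformula

The value is capped at 10000 and a page that needs more shrink than is available
counts as infinitely bad. For an ordinary break with penalty $p$ and accumulated
insert penalties $q$ the cost that gets compared is $c = b + p + q$. So a page
with a goal of 500pt, a total of 490pt and 20pt of stretch has a ratio of 0.5 and
a badness of about 12.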
268
269In order to make the first lines align properly relative to the top of the page
270there is a variable \type {\topskip}. The height of the first line is at least
271that amount. The correction is calculated when the first contribution happens: a
272box or rule.
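
In traditional \TEX\ terms the correction is glue that gets inserted before that
first box (or rule). With $t$ the natural size of \type {\topskip} and $h$ the
height of the first contribution, its natural size is:

\startformula
    g = \max ( 0, t - h )
\stopformula

So with a \type {\topskip} of 10pt and a first line that is 7pt high, 3pt is
added and the first baseline ends up 10pt below the top. A first box that is 12pt
high gets no correction and its baseline sits 12pt below the top.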
273
274When we look at the balancer it is good to keep in mind that where the page
275builder stepwise adds and checks, the balancer looks at the whole picture. The
276page builder does a decent job but is less sophisticated than the par builder.
277There is a badness calculation, penalties are looked at, glue is taken into
278account but there are no demerits.
279
280We want the balancer to work well with column sets that are very much grid based.
But in getting there we had some hurdles to take. Because the algorithm (like the
par builder) happily results in overfull boxes unless emergency stretch is set,
pages can overflow. When there is no stretch and|/|or shrink, using emergency
stretch can give an underfull page.
285
The way out of this is to have non destructive trial passes and decrease the
number of lines. Of course we can get short pages but when for instance it
concerns a section title that gets moved this is no big deal. In a similar
fashion splitting a multi|-|line formula is also okay. The approach boils down
to the following steps:
290
291\startitemize
292\startitem
293    Collect the content in an mvl list and after that's done put the result in a
294    box.
295\stopitem
296\startitem
    Set up a balance shape that specifies the slots in columns (normally a
298    column is just a blob of text).
299\stopitem
300\startitem
301    Perform a trial balance run. As soon as an overfull page is seen, adapt the
302    balance shape and do a new trial run.
303\stopitem
304\startitem
    When we're fine, either because we reached the end without an overfull column or
306    by passing the set deadcycles value, quit the trial process and balance the
307    original list using the most recent balance shape.
308\stopitem
309\startitem
310    Flush the result by fetching the topmost from the result split collection and
311    feed it into the page flow. The boxed pseudo page will happily trigger the
    output routine that in turn constructs the final page.
313\stopitem
314\stopitemize
315
316At some point we decided to support multiple mvl streams and therefore changed
the last mentioned step. Because we store the whole column set we can just as
well store the assembled page bodies. This way we can flush different streams into
319the same result.
320
321\startitemize
322\startitem
323    Flush the result by fetching the topmost from the result split collection and
324    feed it into the page flow. Do this for every saved (mvl) stream.
325\stopitem
326\startitem
327    When we're done, the boxed pseudo pages will be flushed as pages. In the
328    process, for every page we identify marks.
329\stopitem
330\stopitemize
331
332We are now ready to look at some examples. Here we also show what balance shapes
333do. These basically describe a sequence of slots to be filled. The last
334specification is used when we exceed the number of defined slots. These are just
335examples of simple situations, for real applications more code is needed.
336
337\startbuffer[one]
338\setbox\scratchboxone\vbox\bgroup
339    \hsize.30\hsize
340    \samplefile{tufte}
341\egroup
342\stopbuffer
343
344\startbuffer[two]
345\balanceshape 3
346    vsize      12\lineheight
347    topskip    \strutht
348    bottomskip \strutdp
349next
350    vsize       5\lineheight
351    topskip    \strutht
352    bottomskip \strutdp
353next
354    vsize      8\lineheight
355    topskip    \strutht
356    bottomskip \strutdp
357\relax
358\stopbuffer
359
360\startbuffer[three]
361\setbox\scratchboxtwo\vbalance\scratchboxone
362\stopbuffer
363
364\startbuffer[four]
365\hbox \bgroup
366    \localcontrolledendless {%
367        \ifvoid\scratchboxtwo
368            \expandafter\quitloop
369        \else
370            \setbox\scratchbox\ruledhbox\bgroup
371                \vbalancedbox\scratchboxtwo
372            \egroup
373            \vbox to 12\lineheight \bgroup
374                \box\scratchbox
375                \vfill
376            \egroup
377            \hskip1em
378        \fi
379    }\unskip
380\egroup
381\stopbuffer
382
383We start with some content in a box. This can of course be a flushed
384mvl but here we just set it directly:
385
386\typebuffer[one]
387
388We will split this box in columns. If you are familiar with \TEX\ you might know
389that a paragraph of text can follow a shape defined by \type {\parshape}. In a
390similar way as lines are split by width, we can split a vertical list by height.
391For that we define a balance shape:
392
393\typebuffer[two]
394
395\typebuffer[three]
396
397Contrary to a \type {\parshape}, a \type {\balanceshape} is not wiped after the
398work is done. It also expects keys and values. As with \type {\parpasses} each
399step is separated by \type {next}. This makes it an extensible mechanism. Finally
400we will split the box according to this shape:
401
402\typebuffer[four]
403
404The result is shown here:
405
406\startlinecorrection
407    \small
408    \setuptolerance[tolerant,stretch]
409    \getbuffer[one,two,three,four]
410\stoplinecorrection
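
Because the last specification is reused when more slots are needed than are
defined, a regular layout needs only one entry, and a layout with a deviating
first slot needs two. The following sketch uses made up heights and assumes the
same keywords as in the shape shown before:

\starttyping
\balanceshape 2
    vsize      6\lineheight   % first slot only, e.g. below a title
    topskip    \strutht
    bottomskip \strutdp
next
    vsize      12\lineheight  % second and all following slots
    topskip    \strutht
    bottomskip \strutdp
\relax

\setbox\scratchboxtwo\vbalance\scratchboxone
\stoptyping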
411
412Like the par builder we can end up with overfull boxes but we can deal with that
413by using trial runs.
414
415\starttyping
416\setbox\scratchboxtwo\vbalance\scratchboxone trial
417\stoptyping
418
419\startbuffer[one]
420\setbox\scratchboxone\vbox\bgroup
421    \hsize.30\hsize
422    \samplefile{knuthmath} \blank
423    \framed[height=4\lineheight]{test}
424    \samplefile{knuthmath} \blank
425\egroup
426\stopbuffer
427
428In that case the result is made from empty boxes so the original is not
429disturbed. Here we show an overflow, so in the first resulting box you
430can compare the height with the requested one and when it's larger you
431can decide to decrease the first height in the shape and try again.
432
433\startlinecorrection
434    \small
435    \setuptolerance[tolerant,stretch]
436    \getbuffer[one,two,three,four]
437\stoplinecorrection
438
439Of course that involves some juggling of the shape but after all we have \LUA\ at
440our disposal so in the end it's all quite doable.
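
At the \TEX\ end a single correction step could look as follows. This is only a
sketch under assumptions: it takes one slot definition, assumes that \type
{\balanceshapevsize} can be used as a dimension in a test (we only know that it
can be serialized with \type {\the}), and simply asks for one line less when the
first slot overflows. A real implementation would loop, with some upper limit on
the number of attempts, and adapt the slot that actually overflows.

\starttyping
\balanceshape 1
    vsize      12\lineheight
    topskip    \strutht
    bottomskip \strutdp
\relax

% trial run: the original list in \scratchboxone is left alone
\setbox\scratchboxtwo\vbalance\scratchboxone trial
\setbox\scratchbox\vbalancedbox\scratchboxtwo

\ifdim\ht\scratchbox>\balanceshapevsize 1
    % the first slot overflows, so ask for one line less
    \balanceshape 1
        vsize      11\lineheight
        topskip    \strutht
        bottomskip \strutdp
    \relax
\fi

% the real run, using the most recent shape
\setbox\scratchboxtwo\vbalance\scratchboxone
\stoptyping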
441
442\startbuffer[three]
443\setbox\scratchboxtwo\vbalance\scratchboxone trial
444\stopbuffer
445
446\startbuffer[four]
447\global\globalscratchtoks\emptytoks
448\localcontrolledendless {%
449    \ifvoid\scratchboxtwo
450        \expandafter\quitloop
451    \else
452        \setbox\scratchbox\vbalancedbox\scratchboxtwo
453        \xtoksapp\globalscratchtoks {
454            \NC \the\currentloopiterator
455            \NC \the\ht\scratchbox
456            \NC \the\balanceshapevsize\currentloopiterator
457            \NC \NR
458        }
459    \fi
460}
461\stopbuffer
462
463\start
464    \small
465    \setuptolerance[tolerant,stretch]
466    \getbuffer[one,two,three,four]
467\stop
468
469\starttabulate[||||]
470\BC \BC real \BC target \NC \NR
471\the\globalscratchtoks
472\stoptabulate
473
474Because the balancer can produce what otherwise the page builder produces, we
need to handle the equivalent of top skip which is what the already shown \type
{topskip} keyword takes care of. This means that the current slice (think current
477line in the par builder) has to take that into account. This can be compared to the
478left- and right protrusion in the par builder. When we typeset on a grid we have an
479additional demand.
480
When we surround something (for instance a formula) with half line spacing, we
eventually have to return to the grid. One complication is that when we are in grid mode and
483use half line vertical spacing, we can end up in a situation where the initial
484half line space is on a previous page. That means that we need to use a larger
485top skip. This is not something that we want to burden the balancer with but we
486have ways to trick it into taking that compensation into account.
487
488\startlinecorrection
489\hpack \bgroup
490    \ruledvpack to 8\lineheight \bgroup \forgetall \raggedcenter \offinterlineskip \hsize 3cm
491        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
492        \blackrule[width=.2\hsize,height=\strutht,  depth=\strutdp,  color=darkred]\par
493        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
494        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=middlegray]\par
495        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
496        \blackrule[width=.2\hsize,height=\strutht,  depth=\strutdp,  color=darkred]\par
497        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
498        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=darkgray]
499        \vfill
500    \egroup
501    \quad
502    \ruledvpack to 8\lineheight \bgroup \forgetall \raggedcenter \offinterlineskip \hsize 3cm
503        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
504        \blackrule[width=.2\hsize,height=.5\strutht,depth=.5\strutdp,color=darkred]\par
505        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
506        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=middlegray]\par
507        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
508        \blackrule[width=.2\hsize,height=.5\strutht,depth=.5\strutdp,color=darkred]\par
509        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
510        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=darkgray]
511        \vfill
512    \egroup
513    \quad
514    \ruledvpack to 8\lineheight \bgroup \forgetall \raggedcenter \offinterlineskip \hsize 3cm
515        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
516        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=middlegray]\par
517        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
518        \blackrule[width=.2\hsize,height=.5\strutht,depth=.5\strutdp,color=darkred]\par
519        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
520        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=darkgray]
521        \vfill
522    \egroup
523    \quad
524    \ruledvpack to 8\lineheight \bgroup \forgetall \raggedcenter \offinterlineskip \hsize 3cm
525        \blackrule[width=.2\hsize,height=.5\strutht,depth=.5\strutdp,color=darkred]\par
526        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
527        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=middlegray]\par
528        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
529        \blackrule[width=.2\hsize,height=.5\strutht,depth=.5\strutdp,color=darkred]\par
530        \blackrule[width=\hsize,  height=\strutht,  depth=\strutdp,  color=darkgray]\par
531        \blackrule[width=.6\hsize,height=\strutht,  depth=\strutdp,  color=darkgray]
532        \vfill
533    \egroup
534\egroup
535\stoplinecorrection
536
537However, when we split in the middle of that segment, we can end up with a half
538line skip in a next slot because \TEX\ will remove glue at the edge. So we end up
539with what we see in the third sequence above. We deal with that in a somewhat
special way: a box has a discardable field whose value will be taken into account
541as additional top value. That field is set and reset by glue options {\tt
5420x\tohexadecimal \cldcontext {tex . glueoptioncodes . setdiscardable}} and {\tt
5430x\tohexadecimal \cldcontext {tex . glueoptioncodes . resetdiscardable}} that can
be manipulated in \LUA\ as part of some spacing model. Here it suffices to
mention that it makes sure that (as in the fourth blob above) at the top we
have a half line spacing.
547
548\stopsectionlevel
549
550\startsectionlevel[title=Forcing breaks]
551
Because the initial application of balancing was in column sets, we also need the
ability to go to a next slot (step in a shape), column (possibly more steps), page
(depending on the page state), and spread (for instance if we are double sided).
For this we use \type {\balanceboundary}. It takes two values and when the
boundary node triggers a callback in the builder these are passed along with a
shape identifier and current shape slot. That callback can then signal back that
we need to try a break here with a given penalty. This assumes that at the \LUA\
end we know at which slot we have a slot, column, page or spread break. Multiple
slots can be skipped by multiple boundaries. There is one pitfall: we need
something in a slot in order to break at all, so one ends up with for instance:
562
563\starttyping
564\balanceboundary 3 1\relax
565\vskip\zeropoint
566\balanceboundary 3 0\relax
567\vskip\zeropoint
568\balanceboundary 3 0\relax
569\stoptyping
570
Here the \type {3} is just some value that the callback can use to determine its
action (like going to a next page) and the second value provides a detail. Of course
all depends on the intended usage. By using a callback we can force breaks while
not burdening the engine with some hard coded solution. For example, in \CONTEXT\
we used these (the values come from experiments and might change):
576
577\starttabulate[|c|c|||]
578\BC first \BC second \BC action                                    \BC user interface          \NC \NR
579\NC 1     \NC 1 or 0 \NC goto next spread (1 initial, 0 follow up) \NC \type {\page[spread]}   \NC \NR
580\NC 2     \NC 1 or 0 \NC goto next page (idem)                     \NC \type {\page}           \NC \NR
581\NC 3     \NC 1 or 0 \NC goto next column (idem)                   \NC \type {\column}         \NC \NR
582\NC 4     \NC 1 or 0 \NC goto next slot (idem)                     \NC \type {\column[slot]}   \NC \NR
583\NC 5     \NC n      \NC next slot when more than n lines          \NC \type {\testroom[5]}    \NC \NR
584\NC 6     \NC s      \NC next slot when more than s scaled points  \NC \type {\testroom[80pt]} \NC \NR
585\stoptabulate
586
587\stopsectionlevel
588
589\startsectionlevel[title=Marks]
590
591It is possible to synchronize the marks with those in the results of balanced
592segments with a few \LUA\ helpers that do the same as the page builder does at
593the start of a page, while packaging the page and when wrapping it up. So, instead
594of split marks we can have real marks.
595
596\stopsectionlevel
597
598\startsectionlevel[title=Inserts]
599
600Before we go into detail, we want to point out that when implementing a
601(balancing) mechanism as introduced above, decisions have to be made. In
602traditional \TEX\ there is for instance an approach to inserts that involves
603splitting them over pages. In our case that is a bit harder to do but there are
604ways to deal with it. When deciding on an approach it helps that we know a bit
605what situations occur and where we can put some constraints. One can argue that
606solutions should be very generic because (for instance) a publisher has some
607specific demands but in practice those are not our audience. In decades of
developing \LUATEX\ and \LUAMETATEX\ it's (\CONTEXT) user demands and challenges
that drive what gets implemented. Publishers, their suppliers, and large scale
(commercial) users are pretty silent when it comes to development (and supporting
it) while users communicate via meetings and mailing lists. Also, documents that
have notes are often typeset in a rather traditional way.
613
614Users on the other hand have come up with demands for columns, typesetting on the
615grid, multiple notes, balancing, and parallel content streams. The picture we get
from that makes us confident that what we provide is generic enough and as
617users understand the issues at hand (maybe as side effect of struggling with
618solutions) it's not that hard to explain why constraints are in place. It makes
619more sense to have a limited reliable mechanism that deals with the kind of
620(foot)notes that known users need than to cook up some complex mechanism that
caters to potential specific demands by potential users. Of course we have our own
622challenges to deal with, even if the resulting features will probably not be used
623that often. So here are the criteria that make sense:
624
625\startitemize[packed]
626\startitem We can assume a reasonable amount of notes. \stopitem
627\startitem These are normally small with no (vertical) whitespace. \stopitem
628\startitem Notes taking multiple lines may split. \stopitem
629\startitem But we need to obey widow and club penalties. \stopitem
630\startitem There can be math formulas but mostly inline. \stopitem
631\startitem We need to keep them close to where they are referred from. \stopitem
632\stopitemize
633
634But,
635
636\startitemize[packed]
637\startitem We can ignore complex conflicting demands. \stopitem
638\startitem As long as we get some result, we're fine. \stopitem
639\startitem So users have to check what comes out. \stopitem
640\startitem We don't assume fully automated unattended usage. \stopitem
641\stopitemize
642
643And of course:
644
645\startitemize[packed]
646\startitem Performance should be acceptable. \stopitem
647\startitem User interfaces should be intuitive. \stopitem
648\startitem Memory consumption should be reasonable. \stopitem
649\stopitemize
650
651We have users who use multiple note classes so that also has to be handled but
652again we don't need to come up with solutions that solve all possible demands. We
653can assume that when a book is published that needs them, the author will operate
654within the constraints.
655
656We mentioned footnotes being handled by the page builder so how about them in
657these balanced slots? Given the above remarks, we assume sane usage, so for
658instance columns that have a single slot with possibly fixed content at the top
659or bottom (and maybe as part of the stream). The balancer handles notes by taking
660their height into account and when a result is used one can request the embedded
661inserts and deal with them. Again this is very macro package dependent. Among the
662features dealt with are space above and between a set of notes, which means that
663we need to identify the first and successive notes in a class. Given how the
664routine works, this is a dynamic feature of a line: the amount of space needed
665depends on how many inserts are within a slot. When we did some extreme tests
666with several classes of notes and multiple per column we saw runtime increasing
667because instead of a few passes we got a few hundred. In an extreme case of 800
668passes to balance the result we noticed over four million checks for note related
spacing. We could bring that down to one tenth, so in the end we are still slower
but less noticeably so. Here are the helper primitives for inserts:
671
672\starttyping
673<state> = \boxinserts <box>
674<box>   = \vbalancedinsert <box> <class>
675<state> = \boxinserts <box>
676\stoptyping
677
678A (foot)note implementation is very macro package dependent so the next example
is just that: an example of using the available primitives. We start by populating
an mvl with a sample text and two footnotes.
681
682\startbuffer[populate]
683\begingroup
684    \forgetall
685    \beginmvl
686        index 5
687        options \numexpr
688            \ignoreprevdepthmvloptioncode
689          + \discardtopmvloptioncode
690        \relax
691    \relax
692        \hsize .4tw
693        Line 1 \par Line 2 \footnote {Note 1} \par Line 3 \par
694        Line 4 \footnote {Note 2} \par Line 5 \par Line 6 \par
695    \endmvl
696\endgroup
697\stopbuffer
698
699\typebuffer[populate]
700
We fetch the footnote number, which is one of many possible defined
inserts:
703
704\startbuffer[whatever]
705\cdef\currentnote{footnote}%
706\scratchcounter\currentnoteinsertionnumber
707\stopbuffer
708
709\typebuffer[whatever]
710
711The quick and dirty balancer uses a simple shape of 5 lines with normal strut
712properties. From the balanced result we take two columns. We test if there is an
713insert and take action when there is. Here we just filter the footnotes but there
714can of course be more. We overlay these notes over (under) the column that has
715them. So we work per column.
716
717\startbuffer[balance]
718\begingroup
719    \setbox\scratchboxone\flushmvl 5
720    \balanceshape 1
721        vsize      5lh
722        topskip    1sh
723        bottomskip 1sd
724    \relax
725    \setbox\scratchboxtwo\vbalance\scratchboxone
726    \ruledhbox \bgroup
727        \localcontrolledrepeat 2 {
728          \ifnum\currentloopiterator > 1
729            \hskip2\emwidth
730          \fi
731          \setbox\scratchboxthree\vbalancedbox\scratchboxtwo \relax
732          \ifnum\boxinserts\scratchboxthree > 3
733            \setbox\scratchboxfour\vbalancedinsert
734                \scratchboxthree\scratchcounter
735            \wd\scratchboxfour 0pt
736            \box\scratchboxfour
737          \fi
738          \box\scratchboxthree
739        }\unskip
740    \egroup
741\endgroup
742\stopbuffer
743
744\typebuffer[balance]
745
746The result is:
747
748\start
749    \getbuffer[populate] % outside next vbox
750    \startlinecorrection
751        \getbuffer[whatever]
752        \automigrationmode 0
753        \getbuffer[balance]
754    \stoplinecorrection
755\stop
756
757As we progressed we realized that the \quote {balancer} used in column sets can
758also be used for single columns and we can even support a mix of single and multi
759columns. There is however a problem: within an mvl we can deal with spacing but we
760can't do that reliably across mvl's and especially when we cross a page it
761becomes hard to identify if some (vertical) spacing is needed; we don't want it
762at the bottom or top of a page. This feature is too experimental to be discussed
763right now.
764
765So far we assumed notes of a reasonable size, but even if a user tries to keep notes
766small and avoids having too many, there are cases where a note looks like a paragraph,
767and when there are several in a row a column might overflow. This is
768why we have some support for split notes. This is accomplished by two additional
769commands:
770
771\starttyping
772\setbox\scratchboxone\vbalance\scratchboxone\relax
773\vbalanceddeinsert\scratchboxone\relax
774\stoptyping
775
776Here we convert inserts in such a way that they are taken into account by the
777balancer so that multi|-|slot optimization takes place. Afterwards, when we loop
778over the result we can reconstruct the inserts:
779
780\starttyping
781\setbox\scratchboxtwo\vbalancedbox\scratchboxone
782\vbalancedreinsert\scratchboxtwo\relax
783\stoptyping
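
Put together, the two fragments above might be combined as in the following
sketch. It only uses the primitives shown so far; the choice of scratch registers
and the fixed number of two slots are assumptions made for the sake of the
example, and a real macro package will do more work per slot.

\starttyping
% a sketch, not a tested setup: two slots and these registers are assumed
\setbox\scratchboxone\vbalance\scratchboxone\relax
\vbalanceddeinsert\scratchboxone\relax       % notes now take part in balancing
\localcontrolledrepeat 2 {
    \setbox\scratchboxtwo\vbalancedbox\scratchboxone
    \vbalancedreinsert\scratchboxtwo\relax   % notes become inserts again
    \box\scratchboxtwo
}
\stoptyping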
784
785One of the reasons that these are explicit actions is that we want to experiment
786but also be able to see the effect by selectively enabling it. You can get better
787results by forcing depth correction:
788
789\starttyping
790\setbox\scratchboxone\vbalance\scratchboxone
791\vbalanceddeinsert\scratchboxone forcedepth\relax
792\stoptyping
793
794This will use the depth as defined by \type {\insertlinedepth} which is an insert
795class specific parameter, but discussing details of inserts is not what we do
796here. The reason for using a \type {\relax} in the above examples is that we want
797to stress that when keywords are involved, you need to prevent look|-|ahead,
798especially when an \type {\if...} or expandable loop follows, which is not
799uncommon when we balance.
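
To illustrate the look|-|ahead issue: in the sketch below a conditional follows
the primitive directly, so without the \type {\relax} it would be expanded while
the engine is still scanning for a keyword like \type {forcedepth}. The test on a
void box is just an assumed follow|-|up action, not something these primitives
require.

\starttyping
\vbalanceddeinsert\scratchboxone\relax % keyword scanning stops at the \relax
\ifvoid\scratchboxone
    % nothing left to flush (assumed follow-up, for illustration only)
\else
    \setbox\scratchboxtwo\vbalancedbox\scratchboxone
\fi
\stoptyping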
800
801It is possible to define top and bottom inserts but of course these need to be
802filtered and placed at the \TEX\ end, so this is macro package specific. Here we
803just mention that it is possible to set \type {\insertstretch} and \type
804{\insertshrink} which will be taken into account. However, this can result in
805overlap so if indeed stretch or shrink is applied, the \type {handle_uinsert}
806callback should be used for bringing what actually gets inserted to the right
807dimensions. For now we consider this an experimental feature.
808
809\stopsectionlevel
810
811\startsectionlevel[title=Discardables]
812
813This is a preliminary explanation.
814
815\startbuffer[populate]
816\begingroup
817    \beginmvl
818        index 5
819        options \numexpr
820            \ignoreprevdepthmvloptioncode
821          + \discardtopmvloptioncode
822        \relax
823    \relax
824        \hsize .4tw
825        \par
826        \vskip0pt
827        {\darkred \hrule discardable height 1sh depth 1sd width 1em}
828        \par
829        % we need the strut because the rule obscures it .. todo
830        \dorecurse{8}{\strut Line #1 \par}
831        \vskip\zeropoint
832        {\darkblue \hrule discardable height 1sh depth 1sd width 1em}
833        \par
834    \endmvl
835\endgroup
836\stopbuffer
837
838\typebuffer[populate]
839
840\startbuffer[balance]
841\setbox\scratchboxone\flushmvl 5
842\balanceshape 1
843    vsize       5lh
844    topskip     1sh % see comment above
845    bottomskip  1sd
846    options     3
847\relax
848\setbox\scratchboxtwo\vbalance\scratchboxone\relax % lookahead
849\stopbuffer
850
851\startbuffer[flush]
852\hpack \bgroup
853    \localcontrolledrepeat 3 {
854        \ifvoid\scratchboxtwo\else
855            \setbox\scratchboxthree\vbalancedbox\scratchboxtwo
856            \ifvoid\scratchboxthree\else
857                \dontleavehmode\llap{[\the\currentloopiterator]\quad}%
858                \ruledhpack{\box\scratchboxthree}\par
859            \fi
860            \hskip 4em
861        \fi
862    }\unskip
863\egroup
864\stopbuffer
865
866\typebuffer[balance,flush]
867
868\start
869    \forgetall
870    \getbuffer[populate] % outside next vbox
871    \blank[2*line]
872    \startlinecorrection
873        \getbuffer[balance]
874        \getbuffer[flush]
875    \stoplinecorrection
876%     \blank[2*line]
877\stop
878
879When at the top, the rule will be ignored and basically sticks out. When at the
880bottom the rule might end up in a zero dimension box. With \typ
881{\vbalanceddiscard \scratchboxtwo} they will become an \type {\nohrule}.
882Basically we're talking about optional content. The \type {options} bitset in the
883shape definition tells if we have a top (1) and|/|or bottom (2); here we have
884both (3), but in for instance column sets it depends.
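
As an illustration of the bitset, a shape that only marks the top could look as
follows. This is the shape used above with just the \type {options} value
changed; whether a given value suits a layout is something to test, not a
recommendation.

\starttyping
\balanceshape 1
    vsize       5lh
    topskip     1sh
    bottomskip  1sd
    options     1 % top only; 2 would be bottom only, 3 both
\relax
\stoptyping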
885
886\start
887    \forgetall
888    \showmakeup[vglue]
889    \getbuffer[populate] % outside next vbox
890    \showmakeup[reset]
891    \blank
892    \startlinecorrection
893    \showmakeup[vglue]
894        \getbuffer[balance]
895        \vbalanceddiscard\scratchboxtwo
896        \getbuffer[flush]
897    \stoplinecorrection
898%     \blank[2*line]
899\stop
900
901Here we actually still have the rule, but marked as invisible, so the topskip has a
902negative amount. In the next case the \type {remove} keyword makes the rule go
903away, in which case we also adapt the topskip accordingly.
904
905\start
906    \forgetall
907    \showmakeup[vglue]
908    \getbuffer[populate] % outside next vbox
909    \showmakeup[reset]
910    \blank
911    \startlinecorrection
912        \getbuffer[balance]
913        \vbalanceddiscard \scratchboxtwo remove\relax
914        \getbuffer[flush]
915    \stoplinecorrection
916%     \blank[2*line]
917\stop
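
For reference, the last two results come from one of the following calls, placed
between balancing and flushing. The \type {\relax} is only there to prevent
look|-|ahead; use one variant or the other, not both.

\starttyping
% variant 1: the rules are kept but become invisible
\vbalanceddiscard \scratchboxtwo\relax

% variant 2: the rules are removed and the topskip is adapted
\vbalanceddiscard \scratchboxtwo remove\relax
\stoptyping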
918
919You need to juggle a bit with skips and penalties to get this working as you
920like. Instead of rules you can also use boxes, for example before:
921
922\starttyping
923\vskip\zeropoint
924\ruledvbox discardable {\hpack{\strut BEFORE}}
925\par
926\stoptyping
927
928and after:
929
930\starttyping
931\forgetall \par \vskip\zeropoint
932\ruledvbox discardable {\hpack{\strut AFTER}}%
933\penalty\minusone % !
934\par
935\stoptyping
936
937This is currently a playground so it might (and probably will) evolve. Although it
938was made for a specific issue it might have other uses.
939
940\stopsectionlevel
941
942\startsectionlevel[title=Penalties]
943
944{\em todo}
945
946\starttyping
947\showmakeup[vpenalty,line]
948\balancefinalpenalties 6 10000 9000 8000 7000 6000 5000\relax
949\balancevsize 5\lineheight
950\setbox\scratchbox\vbox{\dorecurse{1}{\samplefile{tufte}\footnote{!}\par}}
951\vbalance\scratchbox
952\stoptyping
953
954\stopsectionlevel
955
956\startsectionlevel[title=Passes]
957
958In \LUAMETATEX\ the par builder has been extended with additional features (like
959orphan, toddler and twin control) and the ability to define and apply multiple
960passes over the paragraph to get the best result. The balancer has a similar
961feature: \type {\balancepasses}. As with \type {\parpasses} we have an
962infrastructure for tracing.
963
964\starttyping
965% threshold
966% tolerance
967% looseness
968% adjdemerits
969% originalstretch
970% emergencystretch
971% emergencyfactor
972% emergencypercentage
973\stoptyping
974
975\stopsectionlevel
976
977% tests/mkiv/typesetting/balancing-001.tex
978
979\stopdocument
980
981% (Re)written mixed with watching Talk Talk in Montreux DVD and energetic The
982% Warning live concerts on YT, just to get a positive constructive vibe. As with
983% the mechanisms discussed here, it's all about cooperation and subtle (and honest)
984% quality. It's often music that drives this development.
985