% hybrid-inserts.tex / size: 16 Kb / last modification: 2023-12-21 09:43
1% language=us
2
3\startcomponent hybrid-inserts
4
5\environment hybrid-environment
6
7\startchapter[title={Deeply nested notes}]
8
9\startsection [title={Introduction}]
10
One of the mechanisms that is not on a user's radar when he or she starts using
\TEX\ is \quote {inserts}. An insert is material that is entered at one point but
will appear somewhere else in the output. Footnotes, for instance, can be
implemented using inserts. You create a reference symbol in the running text and
put the note text at the bottom of the page or at the end of a chapter or document.
But as you don't want to move notes around yourself, \TEX\ provides
macro writers with the inserts mechanism that will do some of the housekeeping.
Inserts are quite clever in the sense that they are taken into account when \TEX\
splits off a page. A single insert can even be split over two or more pages.
20
Other examples of inserts are floats that move to the top or bottom of the page
depending on requirements and|/|or available space. Of course the macro package
is responsible for packaging such a float (for instance an image) but by finally
putting it in an insert \TEX\ itself will attempt to deal with accumulated floats
and help you move floats that had to be held over to following pages. When the
page is finally assembled (in the output routine) the inserts for that page
become available and can be put at the spot where they belong. In the process
\TEX\ has made sure that we have the right amount of space available.
29
However, let's get back to notes. In \CONTEXT\ we can have many variants of them,
each taken care of by its own class of inserts. This works quite well, as long as
a note is visible to \TEX, which means that it ends up in the main page
flow. Consider the following situation:
34
35\starttyping
36before \footnote{the note} after
37\stoptyping
38
When the text is typeset, a symbol is placed directly after the word \quote
{before} and the note itself ends up at the bottom of the page. It also works
when we wrap the text in a horizontal box:
42
43\starttyping
44\hbox{before \footnote{the note} after}
45\stoptyping
46
47But it fails as soon as we go further:
48
49\starttyping
50\hbox{\hbox{before \footnote{the note} after}}
51\stoptyping
52
53Here we get the reference but no note. This also fails:
54
55\starttyping
56\vbox{before \footnote{the note} after}
57\stoptyping
58
59Can you imagine what happens if we do the following?
60
61\starttyping
62\starttabulate
63\NC knuth \NC test \footnote{knuth} \input knuth \NC \NR
64\NC tufte \NC test \footnote{tufte} \input tufte \NC \NR
65\NC ward  \NC test \footnote{ward}  \input ward  \NC \NR
66\stoptabulate
67\stoptyping
68
This mechanism uses alignments as well as quite a few boxes. The paragraphs are
nicely split over pages but still appear as boxes to \TEX, which makes the inserts
invisible. Only the three symbols would remain visible. But because in \CONTEXT\
we know when notes tend to disappear, we take some provisions, and contrary to
what you might expect the notes actually do show up. However, they are flushed in
such a way that they end up on the page where the table ends. Normally this is no
big deal as we will often use local notes that end up at the end of the table
instead of the bottom of the page, but still.
77
78The mechanism to deal with notes in \CONTEXT\ is somewhat complex at the source
79code level. To mention a few properties we have to deal with:
80
81\startitemize[packed]
82\startitem Notes are collected and can be accessed any time. \stopitem
83\startitem Notes are flushed either directly or delayed. \stopitem
84\startitem Notes can be placed anywhere, any time, perhaps in subsets. \stopitem
85\startitem Notes can be associated to lines in paragraphs. \stopitem
86\startitem Notes can be placed several times with different layouts. \stopitem
87\stopitemize
88
89So, we have some control over flushing and placement, but real synchronization
90between for instance table entries having notes and the note content ending up on
91the same page is impossible.
92
93In the \LUATEX\ team we have been discussing more control over inserts and we
94will definitely deal with that in upcoming releases as more control is needed for
95complex multi|-|column document layouts. But as we have some other priorities
96these extensions have to wait.
97
As a prelude to them I experimented a bit with making these deeply buried inserts
visible. Of course I use \LUA\ for this as \TEX\ itself does not provide the kind
of access we need for this kind of manipulation.
101
102\stopsection
103
104\startsection [title={Deep down inside}]
105
106Say that we have the following boxed footnote. How does that end up in \LUATEX ?
107
108\starttyping
109\vbox{a\footnote{b}c}
110\stoptyping
111
Actually it depends on the macro package but the principles remain the same. In
\LUATEX\ 0.50 and the \CONTEXT\ version used at the time of this writing we get
a (nested) linked list that prints as follows:
115
116\starttyping
117<node   26 <  862 >  nil : vlist 0>
118  <node  401 <  838 >  507 : hlist 1>
119    <node   30 <  611 >  580 : whatsit 6>
120    <node  611 <  580 >  493 : hlist 0>
121    <node  580 <  493 >  653 : glyph 256>
122    <node  493 <  653 >  797 : penalty 0>
123    <node  653 <  797 >  424 : kern 1>
124    <node  797 <  424 >  826 : hlist 2>
125      <node  445 <  563 >  nil : hlist 2>
126        <node  420 <  817 >  821 : whatsit 35>
127        <node  817 <  821 >  nil : glyph 256>
128    <node  507 <  826 > 1272 : kern 1>
129    <node  826 < 1272 > 1333 : glyph 256>
130    <node 1272 < 1333 >  830 : penalty 0>
131    <node 1333 <  830 >  888 : glue 15>
132    <node  830 <  888 >  nil : glue 9>
133  <node  838 <  507 >  nil : ins 131>
134\stoptyping
135
136The numbers are internal references to the node memory pool. Each line represents
137a node:
138
139\starttyping
140<node prev_index < index > next_index : type subtype>
141\stoptyping
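
The nesting in such a trace comes from walking the list recursively. The
following is only a minimal sketch of that idea and not the tracer that produced
the listing: plain \LUA\ tables stand in for nodes so that it runs in any \LUA\
interpreter, \type {id} holds a type name instead of a number, and the helpers
\type {new}, \type {link} and \type {trace} are hypothetical names. The example
list is a reduced version of the one above (boxes, glyphs and the insert only).

\starttyping
-- Toy model: tables with the fields id, subtype, next and list stand in
-- for LuaTeX nodes; here id is a type name instead of a number.
local function new(id, subtype, list)
    return { id = id, subtype = subtype, list = list }
end

-- Link the given nodes into a doubly linked list and return its head.
local function link(...)
    local nodes = { ... }
    for i = 1, #nodes do
        nodes[i].prev = nodes[i-1]
        nodes[i].next = nodes[i+1]
    end
    return nodes[1]
end

-- Collect one line per node, indented by nesting depth.
local function trace(head, depth, result)
    depth, result = depth or 0, result or { }
    local current = head
    while current do
        result[#result+1] = string.rep("  ", depth)
            .. current.id .. " " .. current.subtype
        if current.list then
            trace(current.list, depth + 1, result)
        end
        current = current.next
    end
    return result
end

-- \vbox{a\footnote{b}c}, reduced to boxes, glyphs and the insert
local symbol = new("hlist", 2, link(new("hlist", 2, link(new("glyph", 256)))))
local line   = new("hlist", 1, link(new("glyph", 256), symbol, new("glyph", 256)))
local box    = new("vlist", 0, link(line, new("ins", 131)))

print(table.concat(trace(box), "\n"))
\stoptyping

The printed result has eight lines, starting with the vlist and ending with the
insert at the same level as the line, one step below the vlist.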
142
143The whatsits carry directional information and the deeply nested hlist is the
144note symbol. If we forget about whatsits, kerns and penalties, we can simplify
145this listing to:
146
147\starttyping
148<node   26 <  862 >  nil : vlist 0>
149  <node  401 <  838 >  507 : hlist 1>
150    <node  580 <  493 >  653 : glyph 256>
151    <node  797 <  424 >  826 : hlist 2>
152      <node  445 <  563 >  nil : hlist 2>
153        <node  817 <  821 >  nil : glyph 256>
154    <node  826 < 1272 > 1333 : glyph 256>
155  <node  838 <  507 >  nil : ins 131>
156\stoptyping
157
So, we have a vlist (the \type {\vbox}), which has one line being a hlist. Inside
we have a glyph (the \quote{a}) followed by the raised symbol (the
\quote{\high{1}}) and next comes the second glyph (the \quote{c}). But watch how
the insert ends up at the end of the line. Although the insert will not show up
in the document, it sits there waiting to be used. So we have:
163
164\starttyping
165<node   26 <  862 >  nil : vlist 0>
166  <node  401 <  838 >  507 : hlist 1>
167  <node  838 <  507 >  nil : ins 131>
168\stoptyping
169
170but we need:
171
172\starttyping
173<node   26 <  862 >  nil : vlist 0>
174  <node  401 <  838 >  507 : hlist 1>
175<node  838 <  507 >  nil : ins 131>
176\stoptyping
177
178Now, we could use the fact that inserts end up at the end of the line, but as we
179need to recursively identify them anyway, we cannot actually use this fact to
180optimize the code.
181
In case you wonder what multiple inserts look like, here is an example:
183
184\starttyping
185\vbox{a\footnote{b}\footnote{c}d}
186\stoptyping
187
188This boils down to:
189
190\starttyping
191<node   26 < 1324 >  nil : vlist 0>
192  <node  401 < 1348 >  507 : hlist 1>
193  <node 1348 <  507 >  457 : ins 131>
194  <node  507 <  457 >  nil : ins 131>
195\stoptyping
196
197In case you wonder what more can end up at the end, vertically adjusted material
198(\type {\vadjust}) as well as marks (\type {\mark}) also get that treatment.
199
200\starttyping
201\vbox{a\footnote{b}\vadjust{c}\footnote{d}e\mark{f}}
202\stoptyping
203
As the trace below shows, we start with the line itself, followed by a mixture of
inserts and vertically adjusted content (that will be placed after that line).
This trace also shows the list 2~levels deep.
207
208\starttyping
209<node   26 < 1324 >  nil : vlist 0>
210  <node  401 < 1348 >  507 : hlist 1>
211  <node 1348 <  507 >  862 : ins 131>
212  <node  507 <  862 >  240 : hlist 1>
213  <node  862 <  240 > 2288 : ins 131>
214  <node  240 < 2288 >  nil : mark 0>
215\stoptyping
216
217Currently vadjust nodes have the same subtype as an ordinary hlist but in
218\LUATEX\ versions beyond 0.50 they will have a dedicated subtype.
219
220We can summarize the pattern of one \quote {line} in a vertical list as:
221
222\starttyping
223[hlist][insert|mark|vadjust]*[penalty|glue]+
224\stoptyping
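
The pattern can be made concrete with a small sketch in plain \LUA. It does not
operate on real nodes: a vertical list is assumed to be given as an array of type
names, the name \type {vadjust} is a stand-in for material that came from \type
{\vadjust}, and \type {group_lines} is a hypothetical helper. The sketch is also
more lenient than the pattern, as it accepts a line without trailing penalty or
glue.

\starttyping
-- Assumption: a vertical list is given as a plain array of type names;
-- "vadjust" stands for material that came from \vadjust.
local migrated = { ins = true, mark = true, vadjust = true }
local between  = { penalty = true, glue = true }

-- Split the list into groups, one per line: the line itself, what
-- migrated out of it, and the penalties and glue that follow.
local function group_lines(list)
    local groups, current = { }, nil
    for _, kind in ipairs(list) do
        if kind == "hlist" then
            current = { line = kind, migrated = { }, between = { } }
            groups[#groups+1] = current
        elseif current and migrated[kind] and #current.between == 0 then
            table.insert(current.migrated, kind)
        elseif current and between[kind] then
            table.insert(current.between, kind)
        else
            error("unexpected " .. kind)
        end
    end
    return groups
end

local groups = group_lines {
    "hlist", "ins", "vadjust", "ins", "mark", "penalty", "glue",
    "hlist", "glue",
}

print(#groups, #groups[1].migrated, #groups[2].between)
\stoptyping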
225
In case you wonder what happens with for instance specials, literals (and other
whatsits): these end up in the hlist that holds the line. Only inserts, marks and
vadjusts migrate to the outer level, but as they stay inside the vlist, they are
not visible to the page builder unless we're dealing with the main vertical list.
Compare:
231
232\starttyping
233this is a regular paragraph possibly with inserts and they
234will be visible as the lines are appended to the main
235vertical list \par
236\stoptyping
237
238with:
239
240\starttyping
241but \vbox {this is a nested paragraph where inserts will
242stay with the box} and not migrate here \par
243\stoptyping
244
So much for the details; let's move on to how we can get
around this phenomenon.
247
248\stopsection
249
250\startsection [title={Some \LUATEX\ magic}]
251
252The following code is just the first variant I made and \CONTEXT\ ships with a
253more extensive variant. Also, in \CONTEXT\ this is part of a larger suite of
254manipulative actions but it does not make much sense (at least not now) to
255discuss this framework here.
256
257We start with defining a couple of convenient shortcuts.
258
\starttyping
local hlist = node.id('hlist')
local vlist = node.id('vlist')
local ins   = node.id('ins')
\stoptyping
264
We can write a more compact solution but splitting up the functionality better
shows what we're doing. The main migration function hooks into the callback \type
{buildpage_filter}. Contrary to other callbacks that act on phases in building
lists and pages, this callback does not receive the head of a list as argument.
Instead, we operate directly on the additions to the main vertical list, which is
accessible as \type {tex.lists.contrib_head}.
271
\starttyping
local deal_with_inserts -- forward reference

local function migrate_inserts(where)
    local current = tex.lists.contrib_head
    while current do
        local id = current.id
        if id == vlist or id == hlist then
            current = deal_with_inserts(current)
        end
        current = current.next
    end
end

callback.register('buildpage_filter',migrate_inserts)
\stoptyping
288
So, effectively we scan for vertical and horizontal lists and deal with embedded
inserts when we find them. In \CONTEXT\ the migratory function is just one of the
functions that are applied to this filter.
292
293We locate inserts and collect them in a list with \type {first} and \type {last}
294as head and tail and do so recursively. When we have run into inserts we insert
295them after the horizontal or vertical list that had embedded them.
296
\starttyping
local locate -- forward reference

deal_with_inserts = function(head)
    local h, first, last = head.list, nil, nil
    while h do
        local id = h.id
        if id == vlist or id == hlist then
            h, first, last = locate(h,first,last)
        end
        h = h.next
    end
    if first then
        -- splice the collected inserts in right after the box
        local n = head.next
        head.next = first
        first.prev = head
        if n then
            last.next = n
            n.prev = last
        end
        return last -- the caller continues after the inserts
    else
        return head
    end
end
\stoptyping
323
324The \type {locate} function removes inserts and adds them to a new list, that is
325passed on down in recursive calls and eventually is returned back to the caller.
326
\starttyping
locate = function(head,first,last)
    local current = head
    while current do
        local id = current.id
        if id == vlist or id == hlist then
            -- descend into the box and collect what is buried there
            current.list, first, last = locate(current.list,first,last)
            current = current.next
        elseif id == ins then
            -- unlink the insert and append it to the collected list
            local insert = current
            head, current = node.remove(head,current)
            insert.next = nil
            if first then
                insert.prev = last
                last.next = insert
            else
                insert.prev = nil
                first = insert
            end
            last = insert
        else
            current = current.next
        end
    end
    return head, first, last
end
\stoptyping
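
To see the effect without running \TEX, the two functions can be tried on a toy
model. The following sketch makes a few assumptions that are not part of the code
above: plain \LUA\ tables stand in for nodes, \type {id} holds a type name, the
local \type {remove} is a stand-in for \type {node.remove}, and \type {link} and
\type {is_box} are hypothetical helpers. It is also slightly simplified in that
\type {deal_with_inserts} hands the whole content of the box to \type {locate} in
one go.

\starttyping
-- Toy model: plain tables stand in for LuaTeX nodes (an assumption made
-- so that this runs in any Lua interpreter); id holds a type name.
local function link(...)
    local nodes = { ... }
    for i = 1, #nodes do
        nodes[i].prev, nodes[i].next = nodes[i-1], nodes[i+1]
    end
    return nodes[1]
end

-- Stand-in for node.remove: unlink current, return new head and successor.
local function remove(head, current)
    local p, n = current.prev, current.next
    if p then p.next = n else head = n end
    if n then n.prev = p end
    return head, n
end

local function is_box(n)
    return n.id == "hlist" or n.id == "vlist"
end

-- Move every ins found in this list or below to the list first .. last.
local function locate(head, first, last)
    local current = head
    while current do
        if is_box(current) then
            current.list, first, last = locate(current.list, first, last)
            current = current.next
        elseif current.id == "ins" then
            local insert = current
            head, current = remove(head, current)
            insert.prev, insert.next = last, nil
            if last then last.next = insert else first = insert end
            last = insert
        else
            current = current.next
        end
    end
    return head, first, last
end

-- Splice the collected inserts in right after the box itself.
local function deal_with_inserts(box)
    local first, last
    box.list, first, last = locate(box.list, nil, nil)
    if not first then
        return box
    end
    local n = box.next
    box.next, first.prev = first, box
    last.next = n
    if n then n.prev = last end
    return last
end

-- \vbox{a\footnote{b}c}: the insert sits inside the vlist, after the line
local line = { id = "hlist", list = link({ id = "glyph" }, { id = "glyph" }) }
local note = { id = "ins" }
local box  = { id = "vlist", list = link(line, note) }

local tail = deal_with_inserts(box)
print(box.next == note, line.next == nil, tail == note)
\stoptyping

After the call the insert is no longer part of the content of the box but follows
the box at the outer level, which is the situation we were after.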
354
355As we can encounter the content several times in a row, it makes sense to mark
356already processed inserts. This can for instance be done by setting an attribute.
357Of course one has to make sure that this attribute is not used elsewhere.
358
\starttyping
if not node.has_attribute(current,8061) then
    node.set_attribute(current,8061,1)
    current = deal_with_inserts(current)
end
\stoptyping
365
366or integrated:
367
\starttyping
local has_attribute = node.has_attribute
local set_attribute = node.set_attribute

local function migrate_inserts(where)
    local current = tex.lists.contrib_head
    while current do
        local id = current.id
        if id == vlist or id == hlist then
            if has_attribute(current,8061) then
                -- maybe some tracing message
            else
                set_attribute(current,8061,1)
                current = deal_with_inserts(current)
            end
        end
        current = current.next
    end
end

callback.register('buildpage_filter',migrate_inserts)
\stoptyping
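
The effect of the marking can be illustrated with a toy model. This is a sketch
under stated assumptions and not the \CONTEXT\ implementation: attributes are
kept in a plain table per node (which is not how \LUATEX\ stores them), the local
\type {has_attribute} and \type {set_attribute} only mimic the real functions,
and a counter takes the place of the actual migration.

\starttyping
-- Toy model of the marking idea; attributes are kept in a plain table
-- per node, which is an assumption and not how LuaTeX stores them.
local status = 8061

local function has_attribute(n, a)
    return n.attr and n.attr[a]
end

local function set_attribute(n, a, v)
    n.attr = n.attr or { }
    n.attr[a] = v
end

local processed = 0

-- Visit every box on the list, but handle each of them only once.
local function filter(head)
    local current = head
    while current do
        local id = current.id
        if (id == "vlist" or id == "hlist")
                and not has_attribute(current, status) then
            set_attribute(current, status, 1)
            processed = processed + 1 -- the migration itself would go here
        end
        current = current.next
    end
end

local b = { id = "hlist" }
local a = { id = "vlist", next = b }

filter(a) -- the first call sees two unmarked boxes
filter(a) -- the second call skips both

print(processed)
\stoptyping

Although the filter runs twice over the same list, each box is handled only once.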
390
391\stopsection
392
393\startsection [title={A few remarks}]
394
395Surprisingly, the amount of code needed for insert migration is not that large.
396This makes one wonder why \TEX\ does not provide this feature itself as it could
397have saved macro writers quite some time and headaches. Performance can be a
398reason, unpredictable usage and side effects might be another. Only one person
399knows the answer.
400
401In \CONTEXT\ this mechanism is built in and it can be enabled by saying:
402
403\starttyping
404\automigrateinserts
405\automigratemarks
406\stoptyping
407
As you can see here, we can also migrate marks. Future versions of \CONTEXT\ will
do this automatically and also provide some control over what classes of inserts
are moved around. We will probably overhaul the note handling mechanism a few
more times anyway as \LUATEX\ evolves and the demands from critical editions that
use many kinds of notes grow.
413
414\stopsection
415
416\startsection [title={Summary of code}]
417
418The following code should work in plain \TEX:
419
\starttyping
\directlua 0 {
local hlist         = node.id('hlist')
local vlist         = node.id('vlist')
local ins           = node.id('ins')
local has_attribute = node.has_attribute
local set_attribute = node.set_attribute

local status = 8061

local function locate(head,first,last)
    local current = head
    while current do
        local id = current.id
        if id == vlist or id == hlist then
            current.list, first, last = locate(current.list,first,last)
            current = current.next
        elseif id == ins then
            local insert = current
            head, current = node.remove(head,current)
            insert.next = nil
            if first then
                insert.prev, last.next = last, insert
            else
                insert.prev, first = nil, insert
            end
            last = insert
        else
            current = current.next
        end
    end
    return head, first, last
end

local function migrate_inserts(where)
    local current = tex.lists.contrib_head
    while current do
        local id = current.id
        if (id == vlist or id == hlist) and
                not has_attribute(current,status) then
            set_attribute(current,status,1)
            local h, first, last = current.list, nil, nil
            while h do
                local id = h.id
                if id == vlist or id == hlist then
                    h, first, last = locate(h,first,last)
                end
                h = h.next
            end
            if first then
                local n = current.next
                if n then
                    last.next, n.prev = n, last
                end
                current.next, first.prev = first, current
                current = last
            end
        end
        current = current.next
    end
end

callback.register('buildpage_filter', migrate_inserts)
}
\stoptyping
485
486Alternatively you can put the code in a file and load that with:
487
488\starttyping
489\directlua {require "luatex-inserts.lua"}
490\stoptyping
491
492A simple plain test is:
493
494\starttyping
495\vbox{a\footnote{1}{1}b}
496\hbox{a\footnote{2}{2}b}
497\stoptyping
498
The first footnote only shows up when we have hooked our migrator into the
callback. Not a bad result for about 60 lines of \LUA\ code.
501
502\stopsection
503
504\stopchapter
505
506\stopcomponent
507