evenmore-normalization.tex /size: 10 Kb    last modification: 2021-10-28 13:50
% language=us runpath=texruns:manuals/evenmore

% \enabletrackers[nodes.directions]

\environment evenmore-style

\startcomponent evenmore-normalization

\startchapter[title=Normalization]

What I describe here was long overdue. I delayed it because once it is enabled it
had better be used as well, and I need to (check and) adapt some code in order to
profit from it. So, if it is used at all, it will take some time before it has an
effect on the \CONTEXT\ code base. But first some background information.

When \TEX\ builds a paragraph it splits the current text stream (that makes up
the paragraph) into lines where each line becomes a horizontal box. In \LUATEX,
this process is split into distinct steps, contrary to regular \TEX\ where the
splitting is combined with hyphenation, ligature construction and font kerning.
But what all engines have in common is that after the decision is made about what
a line is, the result gets packaged into a horizontal box.

The decision making is influenced by quite a few factors, like:

\startitemize[packed]
\startitem
    The indentation of the first line, driven by the presence of a box with a
    certain width and no height and depth (it's always there, also when the
    indentation is zero).
\stopitem
\startitem
    Hanging indentation, which can happen at each corner of the paragraph, or
    alternatively a specific parshape.
\stopitem
\startitem
    Left and|/|or right margins, aka left skip and right skip. A right skip is
    always present, even when zero.
\stopitem
\startitem
    The way the last line gets aligned, aka parfill skip.
\stopitem
\startitem
    Directional changes that need to be carried over to the next line.
\stopitem
\startitem
    Optional protrusion of characters to the left or right of the line, something
    that is sensitive to directional changes.
\stopitem
\startitem
    Expansion of characters in order to get better inter|-|word spacing and|/|or
    to prevent lines being too bad. There can be stretch as well as shrink, but
    on a per line basis. Inter|-|character kerns can also get that treatment.
\stopitem
\startitem
    The penalties associated with hyphenation: the pre|-|last line, the last two
    lines, a list of penalties (\ETEX), specific penalties bound to hyphenation
    points (\LUATEX).
\stopitem
\startitem
    The wish to have more or fewer lines than optimal, aka looseness. I have to
    admit that I never use that feature.
\stopitem
\stopitemize
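
Most of the factors in this list map onto parameters that can be set at the \TEX\
end. The following is only an illustrative sketch: the values are arbitrary and
not taken from the examples in this chapter, and only traditional primitives are
shown. Direction, protrusion and expansion are left out because \CONTEXT\
normally controls these through its own interfaces.

\starttyping
\parindent           = 20pt          % indentation of the first line
\hangindent          = 20pt          % hanging indentation at the left ...
\hangafter           = -3            % ... applied to the first three lines
\leftskip            = 10pt          % left margin
\rightskip           = 30pt          % right margin
\parfillskip         = 0pt plus 1fil % how the last line gets filled
\hyphenpenalty       = 50            % cost of a break at a hyphenation point
\doublehyphendemerits= 10000         % two hyphenated lines in a row
\finalhyphendemerits = 5000          % a hyphenated pre-last line
\looseness           = 1             % ask for one line more than optimal
\stoptyping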

% In traditional \TEX\ it doesn't really matter what the resulting boxes look like,
% as long as the following steps can handle them, and those steps don't look into
% those boxes. In fact, unless you unpack a box, only the backend deals with the
% content. But in \LUATEX\ we have callbacks that hook into several stages and {\em
% can} look into the constructed boxes. In \LUATEX\ these boxes also have embedded
% directional information (needed by the backend) and (although that is seldom
% used) left or right boxed material, a feature inherited from \ALEPH|/|\OMEGA.
% And when messing around with the content of boxes one has to know what can be
% seen there. In principle the code can be reorganized a bit but adding additional
% functionality is not that trivial because we want to stay close to the original
% implementation, even if it has been messed up a bit by successive additions.
% Eventually I might give it a try to integrate all these features a bit better,
% but on the other hand: it works.
%
% \starttexdefinition Sample #1#2
%     \startluacode
%         document.normalizestate = nodes.getnormalizeline()
%         nodes.setnormalizeline(#1)
%     \stopluacode
%     \startsubsubject[title={normalization #1, #2}]
%         \typebuffer[#2]
%         \startlinecorrection
%             \forgetall
%             \start
%                 \setupalign[verytolerant,stretch]
%                 \showmakeup[line,hbox,vbox,glue]
%                 \vbox{\getbuffer[#2]\samplefile{sapolsky}}
%             \stop
%             \par
%         \stoplinecorrection
%     \stopsubsubject
%     \startluacode
%         nodes.setnormalizeline(document.normalizestate)
%     \stopluacode
% \stoptexdefinition

\newcount\OldNormalizeLineMode

\starttexdefinition Sample #1#2
    \OldNormalizeLineMode\normalizelinemode
    \bitwiseflip \normalizelinemode \normalizelinenormalizecode
    \bitwiseflip \normalizelinemode \parindentskipnormalizecode
    \startsubsubject[title={#1}]
        \typebuffer[#2]
        \startlinecorrection
            \forgetall
            \start
                \setupalign[verytolerant,stretch]
                \showmakeup[line,hbox,vbox,glue]
                \vbox{\getbuffer[#2]\samplefile{sapolsky}}
            \stop
            \par
        \stoplinecorrection
    \stopsubsubject
    \normalizelinemode\OldNormalizeLineMode
\stoptexdefinition

\startbuffer[sample-1]
    \parindent  = 20pt
    \leftskip   = 40pt
    \rightskip  = 50pt
    \hangindent =  0pt
    \hangafter  =  0
\stopbuffer

\startbuffer[sample-2]
    \parindent  =  0pt
    \leftskip   =  0pt
    \rightskip  =  0pt
    \hangindent =-20pt
    \hangafter  = -3
\stopbuffer

\startbuffer[sample-3]
    \parindent  =  0pt
    \leftskip   =  0pt
    \rightskip  =  0pt
    \hangindent = 20pt
    \hangafter  =  3
\stopbuffer

\startbuffer[sample-4]
    \parindent  =  0pt
    \leftskip   = 10pt
    \rightskip  = 30pt
    \hangindent = 20pt
    \hangafter  =  3
\stopbuffer

% In the next examples we show what the result of typesetting a paragraph looks
% like. We use the Sapolsky quote from the distribution. The cyan glue nodes are
% the left and right skip nodes, and the gray one at the end of the last line
% represents the parfill skip. The magenta ones at the edge are baseline skips. An
% indentation is shown in gray too. As an experiment we have four normalization
% levels but in the end only the highest level makes sense, simply because
% normalization makes no sense unless one consistently normalizes all. We just keep
% the granularity because it makes it possible to explain what gets done.
%
% \texdefinition{Sample}{0}{sample-1}
% \texdefinition{Sample}{0}{sample-2}
% \texdefinition{Sample}{0}{sample-3}
% \texdefinition{Sample}{0}{sample-4}
%
% You might have noticed that the right skip is always there but the left skip is
% absent when it is zero. As said, as long as the result is okay, it does not
% really matter. But \unknown\ in \LUATEX\ (and therefore \CONTEXT) it can have
% consequences because there we can kick in a callback that does something with
% lines. Such a callback often has to deal with these specific glues and them being
% optional makes for more testing. The more predictable the order is, the better.
% Although we can easily normalize lines (in a callback) to always have a left skip
% too, it is also an option in the engine.
%
% \texdefinition{Sample}{1}{sample-1}
% \texdefinition{Sample}{1}{sample-2}
% \texdefinition{Sample}{1}{sample-3}
% \texdefinition{Sample}{1}{sample-4}
%
% In the previous examples there are always left skips as well as right skips. It
% makes no sense to have an option to omit both zero left and right skips, because
% that again is unpredictable. But we can go further.
%
% \texdefinition{Sample}{2}{sample-1}
% \texdefinition{Sample}{2}{sample-2}
% \texdefinition{Sample}{2}{sample-3}
% \texdefinition{Sample}{2}{sample-4}
%
% In these examples the indentation has been turned into a glue as well (actually
% it is more a kern but using a glue makes more sense). The hanging indentation
% however is not seen here: it is not represented by glue but instead sort of
% hidden in the width of the box and a shift of its content.
%
% \texdefinition{Sample}{3}{sample-1}
% \texdefinition{Sample}{3}{sample-2}
% \texdefinition{Sample}{3}{sample-3}
% \texdefinition{Sample}{3}{sample-4}
%
% In the previous examples the hanging indentation is turned into left and right
% hang skips. These cannot be set at the \TEX\ end, but are injected when we
% instruct the normalizer to do so.
%
% \texdefinition{Sample}{4}{sample-1}
% \texdefinition{Sample}{4}{sample-2}
% \texdefinition{Sample}{4}{sample-3}
% \texdefinition{Sample}{4}{sample-4}
%
% These examples differ from the previous set in that they push these hang
% related glue nodes before the left and after the right skip. As I couldn't make
% up my mind yet, I let \LUAMETATEX\ just provide both variants.
%
% The option to keep hang related information explicitly in the line has some
% consequences. First of all, we now have glue and not some shift|/|width
% combination. Second, we have introduced an incompatibility: the lines now always
% have the proper width. You might have noticed that but we can show it more
% explicitly. We use two parameter sets:
%
% \startbuffer[sample-5]
%     \hangindent = 20pt
%     \hangafter  =  0
% \stopbuffer
%
% \startbuffer[sample-6]
%     \hangindent =-20pt
%     \hangafter  =  0
% \stopbuffer
%
% \Sample{0}{sample-5}
% \Sample{4}{sample-5}
%
% \Sample{0}{sample-6}
% \Sample{4}{sample-6}

\texdefinition{Sample}{Sample 1}{sample-1}
\texdefinition{Sample}{Sample 2}{sample-2}
\texdefinition{Sample}{Sample 3}{sample-3}
\texdefinition{Sample}{Sample 4}{sample-4}
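
Something similar can be tried outside this chapter with a few lines. What
follows is only a minimal sketch and not necessarily how one should do it: it
assumes \LMTX, reuses the two flags that the \type {Sample} definition above
passes to \type {\bitwiseflip}, and relies on grouping to restore the mode
instead of saving it in a counter. The paragraph is started and finished inside
the group so that the mode is in effect all the time.

\starttyping
\begingroup
    % enable the same two normalization flags as the Sample definition
    \bitwiseflip \normalizelinemode \normalizelinenormalizecode
    \bitwiseflip \normalizelinemode \parindentskipnormalizecode
    % make lines, boxes and glue visible
    \showmakeup[line,hbox,vbox,glue]
    \samplefile{sapolsky}\par
\endgroup
\stoptyping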

% A not yet mentioned part of the normalization is that, because they are no longer
% of relevance, the special local par nodes have been removed. The one that starts
% a paragraph is turned into a normal directional node if needed, so that we get
% properly balanced pairs of directional nodes. It must be said that the code
% that does all this is a bit of a mess. We want to stay close to the original
% code, but we also need to deal with all these extensions, like directions,
% protrusion, extra boxes, etc.
%
% Not shown here is that there is a fifth mode of operation. When we enable that
% level an overfull box will get a correction skip so that the right skip etc.\ are
% properly aligned. How useful this is: we'll see.
%
% Now, when I decide to keep this feature, which can be set at the \LUA\ end to do
% the previously mentioned tasks, depending on a feature level ranging from zero to
% four, I also need to check the impact on existing \CONTEXT\ code, which
% (currently) is complicated by the fact that most is shared between \MKIV\ and
% \LMTX, and only \LUAMETATEX\ has this normalization feature. I will probably
% enable it for a while locally in order to see if there are side effects. Then,
% when the code base gets adapted, we have to assume that normalization happens, so
% there is no way back.

\stopchapter

\stopcomponent