% language=us runpath=texruns:manuals/lowlevel

\environment lowlevel-style

\usemodule[system-tokens]

\definefontfeature[default][default][expansion=quality]

\appendtoks\showmakeup[reset]\to\everybeforeoutput

\usebodyfont[dejavu]
\usebodyfont[pagella]

\startdocument
  [title=debugging,
   color=darkgray]

\startsectionlevel[title=Introduction]

Below there will be some examples of how you can see what \TEX\ is doing. We start with some verbose logging but then move on to the more visual features. We occasionally point to some features present in the \LUAMETATEX\ engine. More details about what is possible can be found in documents in the \CONTEXT\ distribution, for instance the \quote {lowlevel} manuals.

Typesetting involves par building, page building, inserts (footnotes, floats), vertical adjusters (stuff before and after the current line), marks (used for running headers and footers), alignments (to build tables), math, local boxes (left and right of lines), hyphenation, font handling, and more, and each has its own specific ways of tracing, either provided by the engine, or by \CONTEXT\ itself. You can run \typ {context --trackers} to get a list of what \CONTEXT\ can do, as it lists most of them. But we start with the language, where tokens play an important role.

\stopsectionlevel

\startsectionlevel[title=Token lists]

There are two main types of linked lists in \TEX: token lists and node lists. Token lists relate to the language, while node lists collect (to be) typeset content and are used for several stack based structures. Both are efficiently memory managed by the engine. Token lists have only forward links, but node lists link in both directions, at least in \LUATEX\ and \LUAMETATEX. When you define a macro, like the following, you get a token list:

\startbuffer
\def\test#1{\bgroup\bf#1\egroup}
\stopbuffer

\typebuffer \getbuffer

Internally the \type {\test} macro has to carry the argument part and the body, and each token is encoded as a number plus a pointer to the next token.

\luatokentable\test

Here the first (large) number is a memory location that holds two 4 byte integers per token: the so called info part codes the command and sub command, the two smaller numbers in the table, and a link part that points to the next memory location, here the next row. The last columns provide details. A character like \quote {a} is one token, but a control sequence like \type {\foo} is also one token because every control sequence gets a number. So, both take eight bytes of memory, which is why a format file can become large and memory consumption grows the more macros you use.

In the body of the above \type {\test} macro we used \type {\bf} so let's see how that looks:

\luatokentable\bf

Here the numbers are much lower, which is an indication that they are likely in the format. They are also ordered, which is a side effect of \LUAMETATEX\ making sure that the token lists stored in the format file keep their tokens close together in memory, which could potentially be a bit faster. But, when we are in a production run, the tokens come from the pool of freed or additionally allocated tokens:

\start

\startbuffer
\tolerant\permanent\protected\def\test[#1]#:#2%
  {{\iftok{#1}{sl}\bs\else\bf\fi#2}}
\stopbuffer

\typebuffer \getbuffer

Gives us:

\luatokentable\test

\stop

If you are familiar with \TEX\ and spend some time looking at this you will start recognizing entries.
For instance \type {11 115} translates to \type {letter s} because \type {11} is the so called command code of letters (also its \type {\catcode}) and the \type {s} has \UTF8 value \type {115}. The \LUAMETATEX\ specific \type {\iftok} conditional has command code \type {135} and sub code \type {29}. Internally these are called \type {cmd} and \type {chr} codes because in many cases it's characters that are the sub commands.

There is more to tell about these commands and the way macros are defined. For instance, \type {tolerant} here means that we can omit the first argument (between brackets), in which case we pick up after the \type {#:}. With \type {protected} we indicate that the macro will not expand in, for instance, an \type {\edef}, and \type {permanent} marks the macro as one that a user cannot redefine (assuming that overload protection is enabled). The extended macro argument parsing features and macro overload protection are specific to \LUAMETATEX.

These introspective tables can be generated with:

\starttyping
\luatokentable\test
\stoptyping

after loading the module \type {system-tokens}. The reason for having a module and not a built|-|in tracer is that users seldom want to do this. Instead they might use \typ {\showluatokens \test}, which just reports something similar to the console and|/|or log file.

There is much more to tell but most users have no need to look into these details unless they are curious about what \TEX\ does. In that case using \type {\tracingall} and inspecting the log file can be revealing too, but be prepared for huge files. In \LUAMETATEX\ we have tried to improve these traces a bit, but that is of course subjective, and even then logs can become huge. But even if one doesn't understand all that is shown, it gives an impression of how much work \TEX\ is actually doing.

\stopsectionlevel

\startsectionlevel[title=Node lists]

A node list is what you get from input that is (to be) typeset. There are several ways to see what node lists are produced, but these are all very verbose.
Take for instance:

\startbuffer
\setbox\scratchbox\hbox{test \bf test}
\showboxhere\scratchbox
\stopbuffer

\typebuffer

This gives us:

\starttyping[style=small,align=hangright]
\hlist[box][color=1,colormodel=1,mathintervals=1], width 47.8457pt, height 7.48193pt, depth 0.15576pt, direction l2r, state 1
.\list
..\glyph[unset][color=1,colormodel=1], protected, wd 4.42041pt, ht 7.48193pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <1: DejaVuSerif @ 11.0pt>, glyph U+0074
..\glyph[unset][color=1,colormodel=1], protected, wd 6.50977pt, ht 5.86523pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <1: DejaVuSerif @ 11.0pt>, glyph U+0065
..\glyph[unset][color=1,colormodel=1], protected, wd 5.64502pt, ht 5.86523pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <1: DejaVuSerif @ 11.0pt>, glyph U+0073
..\glyph[unset][color=1,colormodel=1], protected, wd 4.42041pt, ht 7.48193pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <1: DejaVuSerif @ 11.0pt>, glyph U+0074
..\glue[spaceskip][color=1,colormodel=1] 3.49658pt plus 1.74829pt minus 1.16553pt, font 1
..\glyph[unset][color=1,colormodel=1], protected, wd 5.08105pt, ht 7.48193pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <10: DejaVuSerif-Bold @ 11.0pt>, glyph U+0074
..\glyph[unset][color=1,colormodel=1], protected, wd 6.99854pt, ht 5.86523pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <10: DejaVuSerif-Bold @ 11.0pt>, glyph U+0065
..\glyph[unset][color=1,colormodel=1], protected, wd 6.19287pt, ht 5.86523pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <10: DejaVuSerif-Bold @ 11.0pt>, glyph U+0073
..\glyph[unset][color=1,colormodel=1], protected, wd 5.08105pt, ht 7.48193pt, dp 0.15576pt, language (n=1,l=2,r=3), hyphenationmode "79F3F, options "80, font <10: DejaVuSerif-Bold @ 11.0pt>, glyph U+0074
\stoptyping

The periods indicate the nesting level. The backslash in front of the node names is mostly a historic curiosity because there are no \type {\hlist} and \type {\glue} primitives (\LUAMETATEX\ actually does have a \type {\glyph} primitive, but that one definitely doesn't want the arguments shown here). That said, here we have a horizontal list where the list field points to a glyph that itself points to a next one. The space became a glue node.

In \LUATEX\ and even more in \LUAMETATEX\ all nodes have or get a subtype assigned that indicates what we're dealing with. This is shown between the first pair of brackets. Then there are attributes, between the second pair of brackets, which is actually also a (sparse) linked list. Here we have two attributes set: the color, where the number points to some stored color specification, and the (here somewhat redundant) color space. The names of these attributes are macro package dependent because attributes are just a combination of a number and a value. The engine itself doesn't do anything with them; it is the \LUA\ code you plug in that can do something useful based on the values.

It will be clear that watching a complete page, with many nested boxes, rules, glyphs, discretionaries, glues, kerns, penalties, boundaries, etc., quickly becomes a challenge, which is why we have other means to see what we get, so let's move on to that now.
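
Before we do, it is worth noting that a node list can also be inspected from the \LUA\ end. The following is just a sketch, not a built|-|in tracer: it uses the userdata node interface as documented for \LUATEX, and field and function names can differ a little in \LUAMETATEX, where \CONTEXT\ itself mostly uses the faster \quote {direct} accessors.

\starttyping
\setbox0\hbox{test \bf test}

\startluacode
-- walk the nodes in box 0 and report the glyphs to the console
local box = tex.getbox(0)
if box then
    for n in node.traverse(box.list) do
        if n.id == node.id("glyph") then
            print(string.format("glyph U+%04X in font %d", n.char, n.font))
        end
    end
end
\stopluacode
\stoptyping

Most of the visual trackers discussed next essentially do this kind of traversal under the hood, just with a lot more detail.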
\stopsectionlevel

\startsectionlevel[title=Visual debugging]

In the early days of \CONTEXT, in the mid 90s of the previous century, one of the first presentations at an \NTG\ meeting was about visual debugging. This feature was achieved by overloading the primitives that make boxes, add glue, inject penalties and kerns, etc. It actually worked quite well, although in some cases, for instance where boxes have to be unboxed, one had to disable it. I remember some puzzlement among the audience about the fact that these primitives could indeed be overloaded without too many side effects. It will be no surprise that this feature has been carried on to later versions, and in \CONTEXT\ \MKIV\ it was implemented in a different (less intrusive) way and got gradually extended.

\startbuffer
\showmakeup \hbox{test \bf test}
\stopbuffer

\typebuffer

This gives us a framed horizontal box, with some text and a space glue:

\startlinecorrection
\scale[height=1cm]{\start \inlinebuffer \stop}
\stoplinecorrection

Of course not all information is clearly visible, simply because it can be overlaid by what follows, but one gets the idea. Also, when you have a layer capable \PDF\ viewer you can turn categories on and off, so you can decide to only show glue. You can also do that immediately, with \typ {\showmakeup [glue]}. There is a lot of granularity: \typ {hbox}, \typ {vbox}, \typ {vtop}, \typ {kern}, \typ {glue}, \typ {penalty}, \typ {fontkern}, \typ {strut}, \typ {whatsit}, \typ {glyph}, \typ {simple}, \typ {simplehbox}, \typ {simplevbox}, \typ {simplevtop}, \typ {user}, \typ {math}, \typ {italic}, \typ {origin}, \typ {discretionary}, \typ {expansion}, \typ {line}, \typ {space}, \typ {depth}, \typ {marginkern}, \typ {mathkern}, \typ {dir}, \typ {par}, \typ {mathglue}, \typ {mark}, \typ {insert}, \typ {boundary}, the more selective \typ {vkern}, \typ {hkern}, \typ {vglue}, \typ {hglue}, \typ {vpenalty} and \typ {hpenalty}, as well as some presets like \typ {boxes}, \typ {makeup} and \typ {all}.

When we have:

\startbuffer
\showmakeup \framed[align=normal]{\samplefile{ward}}
\stopbuffer

\typebuffer

we get:

\startlinecorrection
\scale[width=1tw]{\start \inlinebuffer \stop}
\stoplinecorrection

And that is why exploring this with a layer enabled \PDF\ viewer can be of help. Alternatively a more selective use of \typ {\showmakeup} makes sense, like

\startbuffer
\showmakeup[line,space] \framed[align=normal]{\samplefile{ward}}
\stopbuffer

\typebuffer

Here we only see lines, regular spaces and spaces that are determined by the space factor that is driven by punctuation.

\startlinecorrection
\scale[width=1tw]{\start \inlinebuffer \stop}
\stoplinecorrection

\startbuffer[demo]
\leftskip          2cm
\rightskip         3cm
\hangindent        1cm
\hangafter         2
\parfillrightskip  1cm
\parfillleftskip   1cm % new
\parinitrightskip  1cm % new
\parinitleftskip   1cm % new
\parindent         2cm % different
\stopbuffer

\startbuffer
\showmakeup \framed[align=normal]{\getbuffer[demo]\samplefile{ward}}
\stopbuffer

We can typeset the previous example with these settings:

\typebuffer[demo]

This time we get:

\startlinecorrection
\scale[width=1tw]{\start \inlinebuffer \stop}
\stoplinecorrection

Looking at this kind of output only makes sense on screen, where you can zoom in, but what we want to demonstrate here is that in \LUAMETATEX\ we not only have a bit more control over the paragraph (indicated by the comments) but also that the related glue is always present. The reason is that we then have a more predictable packaged line when we look at one from the \LUA\ end.
Where \TEX\ normally moves the final line content left or right via either glue or the shift property of a box, here we always use the glue. We call this normalization. Keep in mind that \TEX\ was not designed (implemented) with exposing its internals in mind, but for \LUATEX\ and \LUAMETATEX\ we have to take care of that. Another characteristic is that the paragraph stores these (and many more) properties in the so called initial par node, so that they work well in situations where grouping would otherwise interfere with our objectives. As with all extensions, these are things that can be configured in detail, but they are enabled in \CONTEXT\ by default.

\stopsectionlevel

\startsectionlevel[title=Math]

Math is a good example where this kind of tracing helps development. Here is an example:

\startbuffer
\im { \showmakeup y = \sqrt {2x + 4} }
\stopbuffer

\typebuffer

Scaled up we get:

\blank[2*big]

\startlinecorrection
\scale[height=2cm]{\inlinebuffer}
\stoplinecorrection

\blank[2*big]

Instead of showing everything we can again be more selective:

\startbuffer
\im { \showmakeup[mathglue,glyph] y = \sqrt {2x + 4} }
\stopbuffer

\typebuffer

Here we not only limit ourselves to math glue, but also enable showing the bounding boxes of glyphs.

\startlinecorrection
\scale[height=2cm]{\inlinebuffer}
\stoplinecorrection

This example also shows that in \LUAMETATEX\ we have more classes than in a traditional \TEX\ engine. For instance, radicals have their own class, as do digits. The radical class is an engine one, the digit class is a user defined class. You can set up the spacing between each pair of classes depending on the style. For the record: this is just one of the many extensions to the math engine, and when extensions are being developed it helps to have this kind of tracing. Take for instance the next example, where we have multiple indexes (indicated by \type {__}) on a nucleus, which get separated by a little so called continuation spacing.

\startbuffer
\im { \showmakeup[mathglue,glyph] y = \sqrt {x__1__a {\darkred +} x__1__b} }
\stopbuffer

\typebuffer

\startlinecorrection
\scale[height=2cm]{\inlinebuffer}
\stoplinecorrection

Here the variable class is used for alphabetic characters and some more, in contrast to the more traditional (often engine assigned) ordinary class that is now used for the left|-|overs.

\stopsectionlevel

\startsectionlevel[title=Fonts]

Some of the tracing mentioned above has shortcuts, for instance \typ {\showglyphs}.
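
A few of these shortcuts, with the \typ {\showmakeup} categories that they roughly correspond to (take the pairing as an indication rather than an exact equivalence):

\starttyping
\showglyphs        % roughly \showmakeup[glyph]
\showfontkerns     % roughly \showmakeup[fontkern]
\showfontexpansion % roughly \showmakeup[expansion]
\showstruts        % roughly \showmakeup[strut]
\stoptyping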
Here we show the same sample paragraph as before:

\startbuffer
\showglyphs \showfontkerns \framed[align=normal]{\samplefile{ward}}
\stopbuffer

\typebuffer

Here is the upper left corner of the result:

\startlinecorrection
\clip[nx=4,n=1,ny=4,y=1]{\scale[width=4tw]{\start\inlinebuffer \stop}}
\stoplinecorrection

What font kerns we get depends on the font; here we use Pagella:

\startlinecorrection
\switchtobodyfont[pagella]
\scale[width=1tw]{\start\inlinebuffer \stop}
\stoplinecorrection

If we zoom in, the kerns are more visible:

\startlinecorrection
\switchtobodyfont[pagella]
\clip[nx=3,n=1]{\scale[width=3tw]{\start\inlinebuffer \stop}}
\stoplinecorrection

And here is another one:

\startbuffer
\showfontexpansion \framed[align={normal,hz}]{\samplefile{ward}}
\stopbuffer

\typebuffer

\startlinecorrection
\switchtobodyfont[pagella]
\scale[width=1tw]{\start\inlinebuffer \stop}
\stoplinecorrection

or blown up:

\startlinecorrection
\switchtobodyfont[pagella]
\clip[nx=3,n=1]{\scale[width=3tw]{\start\inlinebuffer \stop}}
\stoplinecorrection

The last line (normally) doesn't need expansion, unless we want it to be compatible with the preceding lines, space|-|wise. So when we do this:

\startbuffer
\showfontexpansion \framed[align={normal,hz,fit}]{\samplefile{ward}}
\stopbuffer

\typebuffer

the \type {fit} directive results in a somewhat different outcome:

\startlinecorrection
\switchtobodyfont[pagella]
\clip[nx=3,n=1]{\scale[width=3tw]{\start\inlinebuffer \stop}}
\stoplinecorrection

As with other visual tracers, you can get some insight into how \TEX\ turns your input into a typeset result.

\stopsectionlevel

\startsectionlevel[title=Overflow]

By default the engine is stressed a bit in order to make paragraphs fit well, which means that we can get overflowing lines. Because there is a threshold, only visible overflow is reported. If you want a visual clue, you can do this:

\startbuffer
\enabletrackers[builders.hpack.overflow]
\stopbuffer

\typebuffer

With:

\startbuffer
\ruledvbox{\hsize 3cm test test test test test test test test}
\stopbuffer

\typebuffer

We get:

\startlinecorrection
\enabletrackers[builders.hpack.overflow]
\switchtobodyfont[dejavu,12pt]
\getbuffer
\disabletrackers[builders.hpack.overflow]
\stoplinecorrection

The red bar indicates a potential problem. We can also get an underflow, as demonstrated here:

\startbuffer
\ruledvbox {
    \setupalign[verytolerant,stretch]
    \hsize 3cm test test test test test test test test
}
\stopbuffer

\typebuffer

Now we get a blue bar that indicates that we have a bit more stretch than is considered optimal:

\startlinecorrection
\enabletrackers[builders.hpack.overflow]
\switchtobodyfont[dejavu,12pt]
\getbuffer
\disabletrackers[builders.hpack.overflow]
\stoplinecorrection

Especially in automated flows it makes sense to increase the tolerance and permit stretch. Only when the strict attempt fails will these kick in.

\stopsectionlevel

\startsectionlevel[title=Side floats]

Some mechanisms are way more complex than a user might expect from the result. An example is the placement of floats, and especially side floats.

\enabletrackers[floats.anchoring]
\startplacefigure[location=left]
    \setupexternalfigures[location={global,default}]
    \externalfigure[cow.pdf][width=3cm]
\stopplacefigure
\disabletrackers[floats.anchoring]
Not only do we have to make sure that the spacing before such a float is as good and consistent as possible, we also need the progression to work out well, that is: the number of lines that we need to indent.
\par

For that we need to estimate the space needed, look at the amount of space before and after the float, check if it will fit, and move to the next page if needed. That all involves dealing with interline spacing, interparagraph spacing, spacing at the top of a page, permitted slack at the bottom of a page, the depth of the preceding lines, and so on. The tracer shows some of the corrections involved but leaves it to the user to imagine what they relate to; the previous sentence gives some clues. This tracker is enabled with:

\starttyping
\enabletrackers[floats.anchoring]
\stoptyping

\stopsectionlevel

\startsectionlevel[title=Struts]

We now come to one of the most important trackers, \typ {\showstruts}, and a few examples show why:

\startlinecorrection
\startcombination[nx=4,ny=1,width=.2tw]
    {\showstruts\framed[width=.2tw]               {test}} {\type{width=.2tw}}
    {\showstruts\framed[width=.2tw,height=1cm]    {test}} {\type{height=1cm}}
    {\showstruts\framed[width=.2tw,offset=0pt]    {test}} {\type{offset=0pt}}
    {\showstruts\framed[width=.2tw,offset=overlay]{test}} {\type{offset=overlay}}
\stopcombination
\stoplinecorrection

Here in all cases we've set the width to 20 percent of the text width (\type {tw} is an example of a plugged in dimension). In many places \CONTEXT\ adds struts in order to enforce proper spacing, so when spacing is not what you expect, enabling this tracker can help you figure out why.

\stopsectionlevel

\startsectionlevel[title=Features]

Compared to the time when \TEX\ showed up, current fonts are more complicated, especially because features go beyond only ligaturing and kerning. But even ligaturing can be different, because some fonts use kerning and replacement instead of a new character. Pagella uses a multiple to single replacement:

\blank

\showotfcomposition
    {file:texgyrepagella-regular.otf*default @ 12pt}
    {0}
    {\nl effe fietsen}

Not all features listed here are provided by the font (only the ones with four character names), because we're using \TEX, and \TEX\ being \TEX, we have plenty more ways to mess around with additional features: it's all about detailed control. But what you see here are the steps taken: the font handler loops over the list of glyphs and here we see the intermediate results when something has changed. There can be way more loops than in this simple case. With Cambria we get a single replacement combined with kerning:

\blank

\showotfcomposition
    {file:cambria.ttc*default @ 12pt}
    {0}
    {\nl effe fietsen}

One complication is that hyphenation kicks in, which means that whatever we do has to take the pre, post and replacement bits into account, combined with what comes before and after. Especially for complex scripts this tracker can be illustrative, but even then only for those who like to see what fonts do and|/|or when additional features are added at runtime.

\stopsectionlevel

\startsectionlevel[title=Profiling]

There are some features in \CONTEXT\ that are nice but only useful in some situations. An example is profiling. It is something you will not turn on by default, if only because of the overhead it brings. The next two paragraphs (using Pagella) show the effect.

\startbuffer
The command \tex {binom} is the standard notation for binomial coefficients and is preferred over \tex {choose}, which is an older macro that has limited compatibility with newer packages and font encodings: \im{|A|=\binom{N}{k}^2}. Additionally, \tex {binom} uses proper spacing and size for the binomial symbol.
In conclusion, it is recommended to use \tex {binom} instead of \tex {choose} in \TEX\ for typesetting binomial coefficients for better compatibility and uniform appearance.\par
\stopbuffer

\bgroup
    \hsize 130mm
    \switchtobodyfont[pagella,11pt]
    \showmakeup[line]
    \getbuffer
\egroup

The previous paragraph is what comes out by default, while the next one uses these settings plus an additional \typ {\enabletrackers [profiling.lines.show]}.

\startbuffer[demo]
\setuplineprofile[factor=0.1,step=0.5\emwidth]
\setupalign[profile]
\stopbuffer

\bgroup
    \hsize 130mm
    \switchtobodyfont[pagella,11pt]
    \showmakeup[line]
    \getbuffer[demo]
    \enabletrackers[profiling.lines.show]
    \getbuffer
    \disabletrackers[profiling.lines.show]
\egroup

This feature will bring lines together when there is no clash and is mostly of use when a lot of inline math is used. However, when this variant of profiling (we have an older one too) was enabled on a 300 page math book with thousands of formulas, only in a few places did it demonstrate an effect; it was hardly needed anyway. So, sometimes tracing shows what makes sense or not.

\stopsectionlevel

\startsectionlevel[title=Par builder]

Here is a sample paragraph from Knuth's \quotation {Digital Typography}:

\startshowbreakpoints%[#1]
    \samplefile{math-knuth-dt}
\stopshowbreakpoints

There are indicators with tiny numbers that show the possible breakpoints, and we can see what the verdict is:

\showbreakpoints% [#1]%

The last lines in the last column show the route that the result takes. Without going into details, here is what we did:

\starttyping
\startshowbreakpoints
    \samplefile{math-knuth-dt}
\stopshowbreakpoints

\showbreakpoints
\stoptyping

This kind of tracing is part of a mechanism that makes it possible to influence the choice by choosing a specific preferred breakpoint, but that is something the average user is unlikely to do. The main reason why we have this kind of tracker is that when developing the new multi|-|step par builder feature we wanted to see what exactly it influenced. That mechanism uses a \LUAMETATEX\ feature where we can plug in additional passes using the \type {\parpasses} primitive; these passes can use different strategies and are tried until criteria for over- and underfull thresholds and|/|or badness are met. Each step can set the relevant parameters differently, including expansion, which actually makes for more efficient output and better runtime when that feature is not needed to get better results.

\stopsectionlevel

\startsectionlevel[title=More]

There are many more visual trackers, for instance \typ {layout.vz} for when you have enabled vertical expansion, or \typ {typesetters.suspects} for identifying possible issues in the input, like invisible spaces. Trackers like \typ {nodes.destinations} and \typ {nodes.references} will show the areas used by these mechanisms. There are also trackers for positions, (CJK and other) script handling, rubies, tagging, italic correction, breakpoints and so on. The examples in the previous sections illustrate what to expect, and knowing that a specific mechanism is being used might trigger you to check if a tracker exists for it. Often the test suite has examples of usage.

\stopsectionlevel

\stopdocument