% language=us runpath=texruns:manuals/luametatex

% todo: move some to elsewhere (e.g. builders / paragraphs

\environment luametatex-style

\startcomponent luametatex-enhancements

\startchapter[reference=enhancements,title={Enhancements}]

\startsection[title={Introduction}]

\startsubsection[title={Primitive behaviour}]

From day one, \LUATEX\ has offered extra features compared to the superset of
\PDFTEX\ (which includes \ETEX) and \ALEPH. This has not been limited to the
possibility to execute \LUA\ code via \prm {directlua}, but \LUATEX\ also adds
functionality via new \TEX|-|side primitives or extensions to existing ones. The
same is true for \LUAMETATEX. Some primitives have \type {luatex} in their name
and there will be no \type {luametatex} variants. This is because we consider
\LUAMETATEX\ to be \LUATEX 2\high{+}.

Contrary to the \LUATEX\ engine, \LUAMETATEX\ enables all its primitives. You can
clone (a selection of) primitives with a different prefix, like this:

\starttyping
\directlua { tex.enableprimitives('normal',tex.extraprimitives()) }
\stoptyping

The \type {extraprimitives} function returns the whole list or a subset,
specified by one or more keywords \type {tex}, \type {etex} or \type {luatex}.
When you clone all primitives you can also do this:

\starttyping
\directlua { tex.enableprimitives('normal',true) }
\stoptyping

But be aware that the curly braces may not have the proper \prm {catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
you may need to put these assignments before the above line:

\starttyping
\catcode `\{ = 1
\catcode `\} = 2
\stoptyping

More fine|-|grained primitive control is possible and you can look up the
details in \in {section} [luaprimitives]. There are only three kinds of
primitives: \type {tex}, \type {etex} and \type {luatex} but a future version
might drop this and no longer make that distinction as it no longer serves a
purpose apart from the fact that it reveals some history.

\stopsubsection

\startsubsection[title={Rationale}]

One can argue that \TEX\ should stay as it is but over the decades usage of this
program has evolved and resulted in large macro packages that often need to rely
on what the \TEX\ book calls \quote {dirty tricks}. When you look deep down in
the code of \CONTEXT\ \MKII, \MKIV\ and \MKXL\ (aka \LMTX) you will see plenty of
differences but quite a bit of the functionality in the most recent versions is
also available in \MKII. Of course more has been added over time, and some
mechanisms could be made more efficient and reliable but plenty was possible.

So, when you see something done in \CONTEXT\ \LMTX\ using new \LUAMETATEX\
primitives you can assume that somehow the same is done in \CONTEXT\ \MKIV. We
don't really need \LUAMETATEX\ instead of \LUATEX. Among the main reasons for
still going for this new engine are:

\startitemize[packed]
\startitem
    some new primitives make for less tracing and tracing has become rather
    verbose over the years (just try \type {tracingall}); examples are the new macro
    argument handling and some new hooks
\stopitem
\startitem
    some new primitives permit more efficient coding and have a positive impact
    on performance (this sort of compensates a performance hit due to delegating
    work to \LUA)
\stopitem
\startitem
    other primitives are there because they make the code look better; good
    examples are the extensions to conditionals; they remove the necessity for
    all kinds of (somewhat unnatural) middle layers; take local control as an example
\stopitem
\startitem
    a few primitives make complex and demanding mechanisms a bit easier to grasp
    and explain; think of alignments, inserts and marks
\stopitem
\startitem
    more access from the \LUA\ end to \TEX\ internals: a few more callbacks, more
    options, more robust interfaces, etc
\stopitem
\startitem
    some mechanisms are very specific but can be made more generic (and powerful),
    like inserts, marks, adjusts and local boxes
\stopitem
\stopitemize

I realize that new primitives also can make some \TEX\ code look less threatening
to new users. It removes a bit of hackery and limits the level of guru that comes
with showing off the mastery of expansion and lookahead. So be it. I wonder if
those objecting to some of the extensions (with the argument that they are not
needed, and \CONTEXT\ \MKIV\ is proof of that) can resist using them. I admit
that it sometimes hurts to throw away working but cumbersome code that took a
while to evolve, but I also admit that I favor long distance traveling by bike or
car over riding horseback.

It took a few years for \LUAMETATEX\ to evolve to what it is now and most
extensions are not there \quotation {because they were easy} or \quotation {could
be done}. If that were the case, there would be plenty more. In many aspects it
has been a balancing act and much also relates to looking at the \CONTEXT\ source
code (\TEX\ as well as \LUA) and wondering why it looks that way. It is also
driven by the fact that I want to be able to explain to users why things are done
in a certain way. In fact, I want users to be able to look at the code and
understand it (apart from maybe a few real dirty low level helpers that are also
dirty because of performance reasons). Just take this into account when reading on.

And yes, there are still a few possibilities I want to explore \unknown\ some might
show up temporarily so don't be surprised. I'm also aware that some new features can
have bugs or side effects that didn't show up in \CONTEXT, which after all is the
benchmark and environment in which all this evolves.

Over time, the other \TEX\ engines might have an occasional feature (primitive)
added and it is very unlikely that \LUAMETATEX\ will follow up on that. First of
all we have different internals, but most of all plenty of time went into
considering what got added and what not, apart from the fact that we have
callbacks. Decades of \TEX\ development have never really led to an extensive
wish list, so there is no real reason to expect a demand for anything other
than what we offer here. If \TEX\ worked well for ages, it can as well do for more, so
there is no need to cripple the code base simply in order to be compatible with
other engines; \LUAMETATEX\ is already quite different anyway.

\stopsubsection

\startsubsection[title={Version information}]

\topicindex{version}
\topicindex{banner}

There are three primitives to test the version of \LUATEX\ (and \LUAMETATEX):

\unexpanded\def\VersionHack#1% otherwise different luatex and luajittex runs
  {\ctxlua{%
     local banner = "\luatexbanner"
     local banner = string.match(banner,"(.+)\letterpercent(") or banner
     context(string.gsub(banner ,"jit",""))%
  }}

\starttabulate[|l|l|pl|]
\DB primitive             \BC value
                          \BC explanation \NC \NR
\TB
\NC \prm {luatexbanner}   \NC \VersionHack{\luatexbanner}
                          \NC the banner reported on the console \NC \NR
\NC \prm {luatexversion}  \NC \the\luatexversion
                          \NC major and minor number combined \NC \NR
\NC \prm {luatexrevision} \NC \the\luatexrevision
                          \NC the revision number \NC \NR
\LL
\stoptabulate

A version is defined as follows:

\startitemize
\startitem
    The major version is the integer result of \prm {luatexversion} divided by
    100. The primitive is an \quote {internal variable}, so you may need to prefix
    its use with \prm {the} or \prm {number} depending on the context.
\stopitem
\startitem
    The minor version is a number running from 0 up to 99.
\stopitem
\startitem
    The revision is reported by \prm {luatexrevision}. Contrary to other engines,
    in \LUAMETATEX\ it is also a number so one needs to prefix it with \prm {the} or
    \prm {number}. \footnote {In the past it always was good to prefix the
    revision with \prm {number} anyway, just to play safe, although there have
    for instance been times that \PDFTEX\ had funny revision indicators that at
    some point ended up as letters due to the internal conversions.}
\stopitem
\startitem
    The full version number consists of the major version (\type {X}), minor
    version (\type {YY}) and revision (\type {ZZ}), separated by dots, so \type
    {X.YY.ZZ}.
\stopitem
\stopitemize

The \LUATEX\ binary has companions like \LUAJITTEX\ and a version that has a font
rendering library on board. Both introduce dependencies that don't fit into the
\LUAMETATEX\ agenda: compilation should be easy and future proof and not depend
on code outside the source tree. It means that for instance the \CONTEXT\ runners
don't really need to check much more than the basic name. It also means that the
\type {context} and \type {mtxrun} stubs can be symbolic links to the main
program that itself is about 3MB, so we can keep the binary footprint small. For
normal \CONTEXT\ \LMTX\ processing no other binaries are needed because whatever
support we need is done in \LUA.

The \LUAMETATEX\ version number starts at~2 in order to prevent a clash with
\LUATEX, and the version commands are the same. This is a way to indicate that
these projects are related.
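
As a minimal sketch of how a macro package could use this numbering (an
illustration only, not something the engine prescribes), the following test
assumes that any engine reporting a version below~200 is \LUATEX\ and anything
else is \LUAMETATEX:

\starttyping
\ifnum\luatexversion<200
    % major version 1: LuaTeX
\else
    % major version 2 or later: LuaMetaTeX
\fi
\stoptyping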

The \type {status} library also provides some information including what we get
with the three mentioned primitives:

\starttabulate[|l|l|]
\DB field                   \BC value                               \NC \NR
\TB
\NC \type {filename}        \NC \cldcontext{status.filename}        \NC \NR
\NC \type {banner}          \NC \cldcontext{status.banner}          \NC \NR
\NC \type {luatex_engine}   \NC \cldcontext{status.luatex_engine}   \NC \NR
\NC \type {luatex_version}  \NC \cldcontext{status.luatex_version}  \NC \NR
\NC \type {luatex_revision} \NC \cldcontext{status.luatex_revision} \NC \NR
\NC \type {luatex_verbose}  \NC \cldcontext{status.luatex_verbose}  \NC \NR
\NC \type {copyright}       \NC \cldcontext{status.copyright}       \NC \NR
\NC \type {development_id}  \NC \cldcontext{status.development_id}  \NC \NR
\NC \type {format_id}       \NC \cldcontext{status.format_id}       \NC \NR
\NC \type {used_compiler}   \NC \cldcontext{status.used_compiler}   \NC \NR
\LL
\stoptabulate
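
The fields can be combined into the \type {X.YY.ZZ} form described earlier. The
following is a sketch that assumes \type {luatex_version} and \type
{luatex_revision} are reported as integers; it uses \LUA's integer division and
does not pad the minor version with a leading zero:

\starttyping
\directlua {
    % assumption: status.luatex_version is an integer such as 2xx
    local version = status.luatex_version
    local major   = version // 100
    local minor   = version - 100 * major
    tex.print(major .. "." .. minor .. "." .. status.luatex_revision)
}
\stoptyping

The comment uses the \TEX\ comment character because the argument is read by
\TEX\ before it is handed to \LUA.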

\stopsubsection

\stopsection

\startsection[title={\UNICODE\ text support}]

\startsubsection[title={Extended ranges}]

\topicindex{\UNICODE}

Text input and output is now considered to be \UNICODE\ text, so input characters
can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$). Later
chapters will talk of characters and glyphs. Although these are not
interchangeable, they are closely related. During typesetting, a character is
always converted to a suitable graphic representation of that character in a
specific font. However, while processing a list of to|-|be|-|typeset nodes, its
contents may still be seen as a character. Inside the engine there is no clear
separation between the two concepts. Because the subtype of a glyph node can be
changed in \LUA\ it is up to the user. Subtypes larger than 255 indicate that
font processing has happened.

A few primitives are affected by this, all in a similar fashion: each of them has
to accommodate a larger range of acceptable numbers. For instance, \prm
{char} now accepts values between~0 and $1{,}114{,}111$. This should not be a
problem for well|-|behaved input files, but it could create incompatibilities for
input that would have generated an error when processed by older \TEX|-|based
engines. The affected commands with an altered initial (left of the equal sign)
or secondary (right of the equal sign) value are: \prm {char}, \prm {lccode},
\prm {uccode}, \prm {hjcode}, \prm {catcode}, \prm {sfcode}, \prm {efcode}, \prm
{cfcode}, \prm {lpcode}, \prm {rpcode}, \prm {chardef}.

As far as the core engine is concerned, all input and output to text files is
\UTF-8 encoded. Input files can be pre|-|processed using the \type {reader}
callback. This will be explained in \in {section} [iocallback]. Normalization of
the \UNICODE\ input is on purpose not built|-|in and can be handled by a macro
package during callback processing. We have made some practical choices and the
user has to live with those.

% Output in byte|-|sized chunks can be achieved by using characters just outside of
% the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
% the time comes to print a character $c>=1{,}114{,}112$, \LUATEX\ will actually
% print the single byte corresponding to $c$ minus 1{,}114{,}112. This feature has
% been dropped.

Contrary to other \TEX\ engines, the output to the terminal is as|-|is so there
is no escaping with \type {^^}. We operate in a \UTF\ universe. Because we
operate in a \CCODE\ universe, zero characters are special but because we also
live in a \UNICODE\ galaxy that is no real problem.

\stopsubsection

\startsubsection[title={\prm {tocharacter}}]

\topicindex{\UNICODE}

The expandable command \prm {tocharacter} reads a number between~0 and $1{,}114{,}111$
and expands to the associated \UNICODE\ character.
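
Because the command is expandable it can be used inside an \prm {edef}. In the
following sketch the macro name \type {\MyEuro} is made up for the occasion; the
space after the hexadecimal number ends the number:

\starttyping
\edef\MyEuro{\tocharacter"20AC }

\meaning\MyEuro
\stoptyping

After the definition the body of \type {\MyEuro} holds the character itself
instead of the \prm {tocharacter} call.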

\stopsubsection

\startsubsection[title={Extended tables}]

For all traditional \TEX\ and \ETEX\ registers the register number can be a
16-bit number. The affected
commands are:

\startfourcolumns
\startlines
\prm {count}
\prm {dimen}
\prm {skip}
\prm {muskip}
\prm {marks}
\prm {toks}
\prm {countdef}
\prm {dimendef}
\prm {skipdef}
\prm {muskipdef}
\prm {toksdef}
\prm {insert}
\prm {box}
\prm {unhbox}
\prm {unvbox}
\prm {copy}
\prm {unhcopy}
\prm {unvcopy}
\prm {wd}
\prm {ht}
\prm {dp}
\prm {setbox}
\prm {vsplit}
\stoplines
\stopfourcolumns

Fonts are loaded via \LUA\ and a minimal amount of information is kept at the
\TEX\ end. Sharing resources is up to the loaders. The engine doesn't really care
about what a character (or glyph) number represents (a \UNICODE\ or index) as it
only is interested in dimensions.

In \TEX\ the number of registers is 256 and \ETEX\ bumped that to 32K. One reason
for a fixed number is that these registers are fast ways to store data and
therefore are part of the main lookup table (used for data and pointers to data
as well as save and restore housekeeping). In \LUATEX\ the number was bumped to
64K but one can argue that less would also do. In order to keep the default
memory footprint reasonable, in \LUAMETATEX\ the number of languages, fonts and
marks is limited. The size of some tables can be limited by configuration
settings, so they can start out small and grow until they reach the configured
maximum, which is smaller than the absolute maximum.

% % We show this later on so not here.
%
% The following table shows all kind of defaults as reported by \typ
% {status.getconstants()}.
%
% \startluacode
%     context.starttabulate { "|T|r|" }
%     for k, v in table.sortedhash(status.getconstants()) do
%         context.NC() context(k) context.NC() context(v) context.NC() context.NR()
%     end
%     context.stoptabulate()
% \stopluacode

Because we have additional ways to store integers, dimensions and glue, we might
actually decide to decrease the maximum of the registers: if 64K is not enough,
and you work around it, then likely 32K might do as well. Also, we have \LUA\ to
store massive amounts of data. One can argue that saving some 1.5MB memory (when
we go halfway) is not worth the effort in a time when you have to close a browser
in order to free the gigabytes it consumes, but there is no reason not to be lean
and mean: a more conservative approach to start with creates headroom for going
wild later.

\stopsubsection

\stopsection

\startsection[title={Attributes}]

\startsubsection[title={Nodes}]

\topicindex {nodes}

When \TEX\ reads input it will interpret the stream according to the properties
of the characters. Some signal a macro name and trigger expansion, others open
and close groups, trigger math mode, etc. What's left over becomes the typeset
text. Internally we get a linked list of nodes. Characters become \nod {glyph}
nodes that have for instance a \type {font} and \type {char} property and \typ
{\kern 10pt} becomes a \nod {kern} node with a \type {width} property. Spaces are
alien to \TEX\ as they are turned into \nod {glue} nodes. So, a simple paragraph
is mostly a mix of sequences of \nod {glyph} nodes (words) and \nod {glue} nodes
(spaces). A node can have a subtype so that it can be recognized as for instance
a space related glue.

The sequences of characters at some point are extended with \nod {disc} nodes
that relate to hyphenation. After that font logic can be applied and we get a
list where some characters can be replaced, for instance multiple characters can
become one ligature, and font kerns can be injected. This is driven by the
font properties.

Boxes (like \prm {hbox} and \prm {vbox}) become \nod {hlist} or \nod {vlist}
nodes with \type {width}, \type {height}, \type {depth} and \type {shift}
properties and a pointer \type {list} to its actual content. Boxes can be
constructed explicitly or can be the result of subprocesses. For instance, when
paragraphs are broken into lines, the lines are a linked list of \nod {hlist}
nodes, possibly with glue and penalties in between.

Internally nodes have a number. This number is actually an index in the memory
used to store nodes.

So, to summarize: all that you enter as content eventually becomes a node, often
as part of a (nested) list structure. They have a relatively small memory footprint
and carry only the minimal amount of information needed. In traditional \TEX\ a
character node only held the font and slot number, in \LUATEX\ we also store some
language related information, the expansion factor, etc. Now that we have access
to these nodes from \LUA\ it makes sense to be able to carry more information
with a node and this is where attributes kick in.

It is important to keep in mind that there are situations where nodes get created
in the current context. For instance, when \TEX\ builds a paragraph or page or
constructs math formulas, it does add nodes and giving these the current
attributes makes no sense and can even give weird side effects. In these cases,
the attributes are inherited from neighbouring nodes.

\stopsubsection

\startsubsection[title={Attribute registers}]

\topicindex {attributes}

Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
lot like counters: attributes obey \TEX's nesting stack and can be used after
\prm {the} etc.\ just like the normal \prm {count} registers.

\startsyntax
\attribute <16-bit number> <optional equals> <32-bit number>!crlf
\attributedef <csname> <optional equals> <16-bit number>
\stopsyntax

Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset, that
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX.
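
The following sketch shows the set and unset states at the \TEX\ end. The name
\type {\MyMarker} and the number 4711 are arbitrary choices made for this
example; a macro package normally allocates attribute numbers itself, so a
hardcoded number may clash with one that is already in use.

\starttyping
\attributedef\MyMarker = 4711

\MyMarker = 1            % set: the attribute now has a value
\hbox{marked}
\MyMarker = -"7FFFFFFF   % unset again
\hbox{unmarked}
\stoptyping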

Attributes can be used as extra counter values, but their usefulness comes mostly
from the fact that the numbers and values of all \quote {set} attributes are
attached to all nodes created in their scope. These can then be queried from any
\LUA\ code that deals with node processing. Further information about how to use
attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].

Attributes are stored in sorted (sparse) linked lists that are shared when
possible. This permits efficient testing and updating. You can define many
thousands of attributes but normally such a large number makes no sense and is
also not that efficient because each node carries a (possibly shared) link to a
list of currently set attributes. But they are a convenient extension and one of
the first extensions we implemented in \LUATEX.

In \LUAMETATEX\ we try to minimize the memory footprint and the creation of these
attribute lists by sharing them more aggressively. This feature is still somewhat
experimental.

\stopsubsection

\startsubsection[title={Box attributes}]

\topicindex {attributes}
\topicindex {boxes}
\topicindex {vcentering}

Nodes typically receive the list of attributes that is in effect when they are
created. This moment can be quite asynchronous. For example: in paragraph
building, the individual line boxes are created after the \prm {par} command has
been processed, so they will receive the list of attributes that is in effect
then, not the attributes that were in effect in, say, the first or third line of
the paragraph.

Similar situations happen in \LUATEX\ regularly. A few of the more obvious
problematic cases are dealt with: the attributes for nodes that are created
during hyphenation, kerning and ligaturing borrow their attributes from their
surrounding glyphs, and it is possible to influence box attributes directly.

When you assemble a box in a register, the attributes of the nodes contained in
the box are unchanged when such a box is placed, unboxed, or copied. In this
respect attributes act the same as characters that have been converted to
references to glyphs in fonts. For instance, when you use attributes to implement
color support, each node carries information about its eventual color. In that
case, unless you implement mechanisms that deal with it, applying a color to
already boxed material will have no effect. Keep in mind that this
incompatibility is mostly due to the fact that separate specials and literals are
a more unnatural approach to colors than attributes.

It is possible to fine|-|tune the list of attributes that are applied to a \type
{hbox}, \type {vbox} or \type {vtop} by the use of the keyword \type {attr}. The
\type {attr} keyword(s) should come before a \type {to} or \type {spread}, if
that is also specified. An example is:

\startbuffer[tex]
\attribute997=123
\attribute998=456
\setbox0=\hbox {Hello}
\setbox2=\hbox attr 999 = 789 attr 998 = -"7FFFFFFF{Hello}
\stopbuffer

\startbuffer[lua]
  for b=0,2,2 do
    for a=997, 999 do
      tex.sprint("box ", b, " : attr ",a," : ",tostring(tex.box[b]     [a]))
      tex.sprint("\\quad\\quad")
      tex.sprint("list ",b, " : attr ",a," : ",tostring(tex.box[b].list[a]))
      tex.sprint("\\par")
    end
  end
\stopbuffer

\typebuffer[tex]

Box 0 now has attributes 997 and 998 set, whereas box 2 has attributes 997 and 999
set; the nodes inside that box will all have attributes 997 and 998 set.
Assigning the maximum negative value causes an attribute to be ignored.

To give you an idea of what this means at the \LUA\ end, take the following
code:

\typebuffer[lua]

Later we will see that you can access properties of a node. The boxes here are so
called \nod {hlist} nodes that have a field \type {list} that points to the
content. Because the attributes are a list themselves you can access them by
indexing the node (here we do that with \type {[a]}). Running this snippet gives:

\start
    \getbuffer[tex]
    \startpacked \tt
        \ctxluabuffer[lua]
    \stoppacked
\stop

Because some values are not set we need to apply the \type {tostring} function
here so that we get the word \type {nil}.

A special kind of box is \prm {vcenter}. This one also can have attributes. When
one or more are set these plus the currently set attributes are bound to the
resulting box. In regular \TEX\ these centered boxes are only permitted in math
mode, but in \LUAMETATEX\ there is no error message and the height and
depth of the box are equally divided. Of course in text mode there is no math axis related
offset applied.

It is possible to change or add to the attributes assigned to a box with \prm
{boxattribute}:

\starttyping
\boxattribute 0 123 456
\stoptyping

You can set attributes of the current paragraph specification node with \prm
{parattribute}:

\starttyping
\parattribute 123 456
\stoptyping

\stopsubsection

\stopsection

\startsection[title={\LUA\ related primitives}]

\startsubsection[title={\prm {directlua}}]

\topicindex{scripting}
\topicindex{lua+direct}

In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
The primitive \prm {directlua} is used to execute \LUA\ code immediately. The
syntax is

\startsyntax
\directlua <general text>
\stopsyntax

The \syntax {<general text>} is expanded fully, and then fed into the \LUA\
interpreter. After reading and expansion has been applied to the \syntax
{<general text>}, the resulting token list is converted to a string as if it was
displayed using \type {\the\toks}. On the \LUA\ side, each \prm {directlua} block
is treated as a separate chunk. In such a chunk you can use the \type {local}
directive to keep your variables from interfering with those used by the macro
package.

The conversion to and from a token list means that you normally cannot use \LUA\
line comments (starting with \type {--}) within the argument. As there typically
will be only one \quote {line} the first line comment will run on until the end
of the input. You will either need to use \TEX|-|style line comments (starting
with \%), or change the \TEX\ category codes locally. Another possibility is to
say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping

Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
with spaces. Of course such an approach depends on the macro package that you
use.
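
A minimal sketch of the first option, \TEX|-|style comments inside the argument,
could look as follows. It assumes that the percent sign still has its usual
comment category code at that point, so the comment is dropped by \TEX\ and
never reaches \LUA:

\starttyping
\directlua {
    % sum the numbers 1 up to 10; this line is removed by TeX
    local total = 0
    for i = 1, 10 do
        total = total + i
    end
    tex.print(total)
}
\stoptyping

This typesets 55. Had the comment started with \type {--} instead, everything
after it would have ended up in one long \LUA\ comment.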

The \prm {directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes} to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \prm {directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands. So formally speaking its expansion is
null, but it collects material in a new level on the input stack to be
immediately read by \TEX\ after the \LUA\ call has finished. It is a bit like
\ETEX's \prm {scantokens}, which now uses the same mechanism. For a description
of print functions look at \in {section} [sec:luaprint].

Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
is triggered if there is a problem in the included code. The \LUA\ error messages
should be clear enough, but the contextual information is often suboptimal
because it can come from deep down, and \TEX\ has no knowledge about what you do
in \LUA. Often, you will only see the line number of the right brace at the end
of the code.

While on the subject of errors: some of the things you can do inside \LUA\ code
can break up \LUAMETATEX\ pretty badly. If you are not careful while working with
the node list interface, you may even end up with errors or even crashes from
within the \TEX\ portion of the executable.

\stopsubsection

\startsubsection[title={\prm {luaescapestring}}]

\topicindex {escaping}

This primitive converts a \TEX\ token sequence so that it can be safely used as
the contents of a \LUA\ string: embedded backslashes, double and single quotes,
and newlines and carriage returns are escaped. This is done by prepending an
extra token consisting of a backslash with category code~12, and for the line
endings, converting them to \type {n} and \type {r} respectively. The token
sequence is fully expanded.

\startsyntax
\luaescapestring <general text>
\stopsyntax
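
As an illustration, consider a macro whose text contains double quotes. The name
\type {\MyQuote} is made up for this sketch, and we assume that the double quote
is an ordinary character in the macro package that is used:

\starttyping
\def\MyQuote{a "quoted" word}

\directlua {
    local s = "\luaescapestring{\MyQuote}"
    tex.print(s)
}
\stoptyping

Without the escaping the embedded quotes would end the \LUA\ string too early
and the chunk would not compile.
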

Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\ code it is often
not needed, and for longer stretches of \LUA\ code it is easier to keep the code
in a separate file and load it using \LUA's \type {dofile}:

\starttyping
\directlua { dofile("mysetups.lua") }
\stoptyping

\stopsubsection

\startsubsection[title={\prm {luafunction}, \prm {luafunctioncall} and \prm {luadef}}]

\topicindex{functions}
\topicindex{lua+functions}

The \prm {directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The tokenlist is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you have millions of calls it can
have some impact. For this reason there is a variant call available: \prm
{luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

Of course the functions can also be defined in a separate file. There is no limit
on the number of functions apart from normal \LUA\ limitations. Of course there
is the limitation of no arguments but that would involve parsing and thereby give
no gain. The function, when called, in fact gets one argument, being the index, so
in the following example the number \type {8} gets typeset.

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}
\stoptyping

The \prm {luafunctioncall} primitive does the same but is unexpandable, for
instance in an \prm {edef}. In addition \LUATEX\ provides a definer:

\starttyping
                 \luadef\MyFunctionA 1
          \global\luadef\MyFunctionB 2
\protected\global\luadef\MyFunctionC 3
\stoptyping
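
Put together, a definition and its use could look like the following sketch. It
reuses slot~1 from the earlier example; in a real macro package the slots are
normally handed out by the package, so picking a fixed number yourself may
overwrite a function that is already registered.

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
}

\luadef\MyFunctionA 1

\MyFunctionA   % has the same effect as \luafunction 1
\stoptyping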

You should really use these commands with care. Some references get stored in
tokens and assume that the function is available when that token expands. On the
other hand, as we have tested this functionality in relatively complex situations,
normal usage should not give problems.

{\em It makes sense to delegate the implementation of the primitives to \LUA.}

\stopsubsection

\startsubsection[title={\prm {luabytecode} and \prm {luabytecodecall}}]

\topicindex{lua+bytecode}
\topicindex{bytecode}

Analogous to the function callers discussed in the previous section we have byte
code callers. Again the call variant is unexpandable.

\starttyping
\directlua {
    lua.bytecode[9998] = function(s)
        tex.sprint(s*token.scan_int())
    end
    lua.bytecode[5555] = function(s)
        tex.sprint(s*token.scan_dimen())
    end
}
\stoptyping

This works with:

\starttyping
\luabytecode    9998 5  \luabytecode    5555 5sp
\luabytecodecall9998 5  \luabytecodecall5555 5sp
\stoptyping

The variable \type {s} in the code is the number of the byte code register that
can be used for diagnostic purposes. The advantage of bytecode registers over
function calls is that they are stored in the format (but without upvalues).

{\em It makes sense to delegate the implementation of the primitives to \LUA.}

\stopsubsection

\stopsection

\startsection[title={Catcode tables}]

\startsubsection[title={Catcodes}]

\topicindex {catcodes}

Catcode tables are a new feature that allows you to switch to a predefined
catcode regime in a single statement. You can have lots of different tables, but
if you need a dozen you might wonder what you're doing. This subsystem is
backward compatible: if you never use the following commands, your document will
not notice any difference in behaviour compared to traditional \TEX. The contents
of each catcode table are independent from any other catcode table, and they
are stored in and retrieved from the format file.

\stopsubsection

\startsubsection[title={\prm {catcodetable}}]

The primitive \prm {catcodetable} switches to a different catcode table. Such a
table has to be previously created using one of the two primitives below, or it
has to be zero. Table zero is initialized by \INITEX.

\startsyntax
\catcodetable <15-bit number>
\stopsyntax

\stopsubsection

\startsubsection[title={\prm {initcatcodetable}}]

\startsyntax
\initcatcodetable <15-bit number>
\stopsyntax

The primitive \prm {initcatcodetable} creates a new table with catcodes
identical to those defined by \INITEX. The new catcode table is allocated
globally: it will not go away after the current group has ended. If the supplied
number is identical to the currently active table, an error is raised. The
initial values are:

\starttabulate[|c|c|l|l|]
\DB catcode \BC character               \BC equivalent \BC category          \NC \NR
\TB
\NC  0 \NC \tttf \letterbackslash       \NC         \NC \type {escape}       \NC \NR
\NC  5 \NC \tttf \letterhat\letterhat M \NC return  \NC \type {car_ret}      \NC \NR
\NC  9 \NC \tttf \letterhat\letterhat @ \NC null    \NC \type {ignore}       \NC \NR
\NC 10 \NC \tttf <space>                \NC space   \NC \type {spacer}       \NC \NR
\NC 11 \NC {\tttf a} \endash\ {\tttf z} \NC         \NC \type {letter}       \NC \NR
\NC 11 \NC {\tttf A} \endash\ {\tttf Z} \NC         \NC \type {letter}       \NC \NR
\NC 12 \NC everything else              \NC         \NC \type {other}        \NC \NR
\NC 14 \NC \tttf \letterpercent         \NC         \NC \type {comment}      \NC \NR
\NC 15 \NC \tttf \letterhat\letterhat ? \NC delete  \NC \type {invalid_char} \NC \NR
\LL
\stoptabulate

\stopsubsection

\startsubsection[title={\prm {savecatcodetable}}]

\startsyntax
\savecatcodetable <15-bit number>
\stopsyntax

\prm {savecatcodetable} copies the current set of catcodes to a new table with
the requested number. The definitions in this new table are all treated as if
they were made in the outermost level. Again, the new table is allocated globally:
it will not go away after the current group has ended. If the supplied number is
the currently active table, an error is raised.
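
As a minimal sketch of how the three primitives can work together: the table
numbers 42 and 43 are arbitrary and assumed to be free (a macro package normally
allocates them for you), \type {\MyMacro} is just an illustrative name, and we
assume that switching tables is undone at the end of a group, as with other
local assignments.

\starttyping
\initcatcodetable 42     % table 42 gets the INITEX defaults listed before

\begingroup
    \catcode`\@ = 11     % a local change: the at sign becomes a letter
    \savecatcodetable 43 % snapshot of the current regime, allocated globally
\endgroup                % the local change is gone but table 43 survives

\begingroup
    \catcodetable 43     % switch regime: the at sign is a letter here
    \def\My@Macro{okay}\My@Macro
\endgroup                % the previous regime is active again
\stoptyping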

\stopsubsection

\startsubsection[title={\prm {letcharcode}}]

This primitive can be used to assign a meaning to an active character, as in:

\starttyping
\def\foo{bar} \letcharcode123=\foo
\stoptyping

This can be a bit nicer than using the uppercase tricks (using the property of
\prm {uppercase} that it treats active characters specially).
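
For comparison, here is a sketch of such a trick next to the primitive. It
assumes that \type {~} is an active character, as it is in most macro packages.
Both variants give the active character with code 123 the meaning of \type
{\foo}; the character only behaves that way once its catcode is set to active,
which is not shown here.

\starttyping
\def\foo{bar}

% the traditional trick: \uppercase turns ~ into active character 123
\begingroup
    \uccode`\~=123
    \uppercase{\endgroup\let~}\foo

% the primitive: no group and no \uccode juggling
\letcharcode123=\foo
\stoptyping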

\stopsubsection

\stopsection

\startsection[title={Tokens and expansion}]

\startsubsection[title={\prm {scantextokens}, \prm {tokenized} and \prm {retokenized}}]

\topicindex {tokens+scanning}

The syntax of \prm {scantextokens} is identical to \prm {scantokens}. This
primitive is a slightly adapted version of \ETEX's \prm {scantokens}. The
differences are:

\startitemize
\startitem
    The last (and usually only) line does not have a \prm {endlinechar} appended.
\stopitem
\startitem
    \prm {scantextokens} never raises an EOF error, and it does not execute
    \prm {everyeof} tokens.
\stopitem
\startitem
    There are no \quote {\unknown\ while end of file \unknown} error tests
    executed. This allows the expansion to end on a different grouping level or
    while a conditional is still incomplete.
\stopitem
\stopitemize
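
The first difference is probably the one that shows up most in practice. In the
following sketch, which assumes a default \prm {endlinechar}, the \prm
{scantokens} variant sees an end of line after its content, which normally
becomes a space, while \prm {scantextokens} does not:

\starttyping
[\scantokens    {foo}] % an end of line is seen, so expect a trailing space
[\scantextokens {foo}] % no end of line is appended
\stoptyping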

The implementation in \LUAMETATEX\ is different in the sense that it uses the same
methods as printing from \LUA\ to \TEX\ does. Therefore, in addition to the two
commands we also have this expandable command:

\startsyntax
\tokenized {...}
\tokenized catcodetable <number> {...}
\stopsyntax

The \prm {retokenized} variant differs in that it doesn't check for a keyword and
just uses the current catcode regime.

The \ETEX\ command \type {\tracingscantokens} has been dropped in the process as
that was interwoven with the old code.

\stopsubsection

\startsubsection[title={\prm {toksapp}, \prm {tokspre}, \prm {etoksapp}, \prm {etokspre},
\prm {gtoksapp}, \prm {gtokspre}, \prm {xtoksapp},  \prm {xtokspre}}]

Instead of:

\starttyping
\toks0\expandafter{\the\toks0 foo}
\stoptyping

you can use:

\starttyping
\etoksapp0{foo}
\stoptyping

The \type {pre} variants prepend instead of append, and the \type {e} variants
expand the passed general text. The \type {g} and \type {x} variants are global.
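
A small sketch of how the variants differ; the macro \type {\b} and the use of
register zero are just for illustration, and the comments show what the register
is expected to hold after each step:

\starttyping
\def\b{b}

\toks0 {a}
\toksapp  0 {\b} % a\b     : the \b is stored unexpanded
\etoksapp 0 {\b} % a\b b   : the second \b was expanded first
\tokspre  0 {x}  % xa\b b  : prepended instead of appended

\begingroup
    \gtoksapp 0 {!} % global, so this survives the \endgroup
\endgroup
\stoptyping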

\stopsubsection

\startsubsection[title={\prm {etoks} and \prm {xtoks}}]

A mix between the previously discussed append and prepend primitives and simple
toks register assignments are these two. They act like \prm {toks} but expand
their content first. The \type {x} variant does a global assignment.
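
As an illustration, assuming that these primitives take a register number and a
general text just like the append and prepend variants do (\type {\name} is a
made|-|up macro):

\starttyping
\def\name{world}

\toks0  {hello \name} % stores: hello \name
\etoks0 {hello \name} % stores: hello world

\begingroup
    \xtoks0 {bye \name} % expanded and assigned globally
\endgroup               % so here the register still holds: bye world
\stoptyping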

\stopsubsection

\startsubsection[title={\prm {expanded}, \prm {expandedafter}, \prm {localcontrol}, \prm
{localcontrolled}, \prm {beginlocalcontrol} and \prm {endlocalcontrol}}]

\topicindex {expansion}

The \prm {expanded} primitive takes a token list and expands its content which
can come in handy: it avoids a tricky mix of \prm {expandafter} and \prm
{noexpand}. You can compare it with what happens inside the body of an \prm
{edef}. The \tex {immediateassignment} and \tex {immediateassigned} commands are
gone because we have the more powerful local control commands. They are a tad
slower but this mechanism isn't used that much anyway.

\starttyping
\let\immediateassigned\localcontrolled % sort of what \LUATEX provides
\stoptyping

Say that we define:

\startbuffer
\edef\TestA
  {\advance\scratchcounter\plusone}
\edef\TestB
  {\localcontrol\TestA
   \the\scratchcounter}
\edef\TestC
  {\localcontrolled{\advance\scratchcounter\plusone}%
   \the\scratchcounter}
\edef\TestD
  {\beginlocalcontrol\advance\scratchcounter\plusone\endlocalcontrol
   \the\scratchcounter}
\stopbuffer

\typebuffer \getbuffer

With this example:

\startbuffer
\scratchcounter 10 \meaningasis\TestA
\scratchcounter 20 \meaningasis\TestB
\scratchcounter 30 \meaningasis\TestC
\scratchcounter 40 \meaningasis\TestD
\stopbuffer

\typebuffer

We get this:

\startlines
\tttf \getbuffer
\stoplines

These local control primitives are a bit tricky and error messages can be
confusing. Future versions might have a bit better recovery but in practice it
works as expected.

An \prm {expandedafter} primitive is also provided as a variant of \prm
{expandafter} that takes a token list instead of a single token.
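
To illustrate the remark about \prm {expanded} avoiding a mix of \prm
{expandafter} and \prm {noexpand}, here is a minimal sketch. The macros \type
{\a}, \type {\b} and \type {\process} are made up for the occasion, and in
\CONTEXT\ you might need \type {\normalexpanded} to get the primitive.

\starttyping
\def\a{A} \def\b{B}
\def\process#1{[#1]}

\expandafter\process\expandafter{\a\b} % only \a gets expanded: \process{A\b}
\expanded{\noexpand\process{\a\b}}     % as in an \edef body:   \process{AB}
\stoptyping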

\stopsubsection

\startsubsection[title={\prm {semiprotected}, \prm {semiexpanded}, \prm {expand},
\prm {semiexpand} and \prm {expandactive}}]

These primitives can best be explained with a few examples. The \type {semi}
variants boil down to a bit more controlled usage of \prm {protected} macros.

\startbuffer
               \def\Test {test}
               \def\TestA{\Test}
\protected     \def\TestB{\Test}
\semiprotected \def\TestC{\Test}
              \edef\TestD{\Test}
              \edef\TestE{\TestA}
              \edef\TestF{\TestB}
              \edef\TestG{\TestC}
              \edef\TestH{\normalexpanded{\TestB\TestC}} % ctx has \expanded defined
              \edef\TestI{\semiexpanded{\TestB\TestC}}
              \edef\TestJ{\expand\TestB\expand\TestC}
              \edef\TestK{\semiexpand\TestB\semiexpand\TestC}
\stopbuffer

\typebuffer \getbuffer

The effective meanings are given next (we use \prm {meaningasis} for this):

\startlines \tttf
\meaningasis\Test
\meaningasis\TestA
\meaningasis\TestB
\meaningasis\TestC
\meaningasis\TestD
\meaningasis\TestE
\meaningasis\TestF
\meaningasis\TestG
\meaningasis\TestH
\meaningasis\TestI
\meaningasis\TestJ
\meaningasis\TestK
\stoplines

I admit that this is not yet applied much in \CONTEXT\ as we have no real need for
it and I implemented it more for nostalgic reasons: the kind of selective
protect mechanism we have in \MKII.

Assuming that \type {~} is made active:

\starttyping
\protected\def~{!}

\edef\xxxx{~}
\edef\xxxx{\expandactive~}
\stoptyping

In both cases the meaning will show \type {~} so it's kind of subtle because in reality
they have the following internal representation:

\starttyping
active    char 126
protected call ~
\stoptyping

\stopsubsection

\startsubsection[title={Going ahead with \prm {expandafterpars} and \prm {expandafterspaces}}]

\topicindex{expansion+after}

Here are again some convenience primitives that simplify coding, remove the need
to show off with multi|-|step macros and are nicely expandable. They fit in the
repertoire of additional primitives that make macro code look somewhat easier.
Here are a few examples:

\startbuffer
\def\foo{!!} [\expandafterpars  \foo \par test]
\def\foo{!!} [\expandafterspaces\foo      test]

\def\foo{!!} \def\oof{\foo}                   [{\oof} test]
\def\foo{!!} \def\oof{\expandafterspaces\foo} [{\oof}test]
\stopbuffer

\typebuffer

These are typically used when building high level interfaces so not many users
will see them in document sources.

\startlines
\getbuffer
\stoplines

\stopsubsection

\startsubsection[title={\prm {afterassigned}}]

\topicindex{assignments+after}

This primitive is a multiple token variant of \prm {afterassignment} and it takes
a token list. It might look better in some cases than multiple single token
\quote {calls}.
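
A sketch of the difference, where \type {\report} is a hypothetical helper and
\type {\scratchcounter} is the scratch register used elsewhere in this chapter:

\starttyping
% one token: a helper macro is needed
\def\report{\message{value: \the\scratchcounter}}
\afterassignment\report \scratchcounter=10

% a token list: the helper can go
\afterassigned{\message{value: \the\scratchcounter}}\scratchcounter=20
\stoptyping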

\stopsubsection

\startsubsection[title={\prm {detokenized}}]

\topicindex{serializing}

The \prm {string} primitive serializes what comes next: a control sequence, the
string representation of something more primitive, or just the (\UTF) character.
So it does look at what it sees next in some detail, which can give confusing
results when the next token is for instance a new line. The \prm {detokenized}
primitive is less picky and just serializes the token, so in the next examples an
empty line becomes what we normally expect: a serialized par token.

\def\oof{s\expandafter\foo\string}
\def\ofo{d\expandafter\foo\detokenized}
\def\foo#1{:[#1]}

\startbuffer
\oof test
\ofo test
\oof \relax
\ofo \relax
\oof \par
\ofo \par

\oof

\ofo

done
\stopbuffer

\typebuffer

We need the empty lines and \quote {done} to make sure we see the effect:

{\tttf \getbuffer}

\stopsubsection

\startsubsection[title={\prm {expandtoken} and \prm {expandcstoken}}]

\topicindex{expansion+tokens}

These two are not really needed but can make code look less weird (and
impressive) because there are no catcode changes involved. The next example
illustrates what they do:

\startbuffer
\edef\foo{\expandtoken 12 123 }              \meaning\foo
\edef\oof{\bgroup \egroup}                              \meaning\oof
\edef\oof{\expandcstoken \bgroup\expandcstoken \egroup} \meaning\oof
\edef\oof{\expandcstoken \foo   }                       \meaning\oof
\stopbuffer

\typebuffer

So \prm {expandtoken} expects two arguments: a catcode and a character number.
The \prm {expandcstoken} will only look at control sequences representing a
character.

\startlines
\getbuffer
\stoplines

\stopsubsection

\stopsection

\startsection[title=Grouping]

\startsubsection[title={\prm {endsimplegroup}}]

\topicindex{grouping+ending}

This feature might look somewhat weird so just ignore that it is there. It is one
of these features that might never make it into an engine when discussed in a
committee but it comes in handy in \CONTEXT, so:

\startbuffer
\def\foo{\beginsimplegroup\bf\let\next}

\foo{test}
\foo{test\endgroup
\foo{test\endsimplegroup
\foo{test\egroup
\stopbuffer

\typebuffer

These lines typeset as:

\startlines \getbuffer \stoplines

The \prm {beginsimplegroup} primitive signals that any end group command, except
\prm {endmathgroup}, will wrap up the current group. The \prm {endsimplegroup} is
sort of redundant but fits in anyway.

The \prm {beginmathgroup} and \prm {endmathgroup} commands are also specific to
\LUAMETATEX. They are like \prm {begingroup} and \prm {endgroup} but restore the
mathstyle when it has been changed in the group.

\stopsubsection

\startsubsection[title={\prm {aftergrouped}}]

\topicindex{grouping+after}

There is a new experimental feature that can inject multiple tokens after the
group ends. An example demonstrates its use:

\startbuffer
{
    \aftergroup A \aftergroup B \aftergroup C
test 1 : }

{
    \aftergrouped{What comes next 1}
    \aftergrouped{What comes next 2}
    \aftergrouped{What comes next 3}
test 2 : }


{
    \aftergroup A \aftergrouped{What comes next 1}
    \aftergroup B \aftergrouped{What comes next 2}
    \aftergroup C \aftergrouped{What comes next 3}
test 3 : }

{
    \aftergrouped{What comes next 1} \aftergroup A
    \aftergrouped{What comes next 2} \aftergroup B
    \aftergrouped{What comes next 3} \aftergroup C
test 4 : }
\stopbuffer

\typebuffer

This gives:

\startpacked\getbuffer\stoppacked

\stopsubsection

\startsubsection[title={\prm {atendofgroup} and \prm {atendofgrouped}}]

\topicindex{grouping+ending}

These are variants of \prm {aftergroup} and \prm {aftergrouped} but they happen
{\em before} the group is closed. It is one of these primitives that is not
really needed but that can make code (and tracing) cleaner, which is one of the
objectives (at least for \CONTEXT).
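
The difference in timing can be sketched as follows, with \type {\MyState} and
\type {\MyReport} being hypothetical macros:

\starttyping
\def\MyState {outer}
\def\MyReport{\message{[\MyState]}}

\begingroup
    \def\MyState{inner}
    \aftergroup  \MyReport % runs after the group is closed: sees outer
    \atendofgroup\MyReport % runs before the group is closed: sees inner
\endgroup
\stoptyping

Because the local definition is still in place when the \prm {atendofgroup}
token is injected, one would expect the log to show \type {[inner]} first and
\type {[outer]} second.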

\stopsubsection

\stopsection

\startsection[title=Conditions]

\startsubsection[title={\prm{ifabsnum} and \prm {ifabsdim}}]

\topicindex{conditions}

There are two tests that we took from \PDFTEX:

\startbuffer
\ifabsnum -10 = 10
    the same number
\fi
\ifabsdim -10pt = 10pt
    the same dimension
\fi
\stopbuffer

\typebuffer

This gives

\blank {\tt \getbuffer} \blank

\stopsubsection

\startsubsection[title={Comparing}]

When comparing (for instance) two numbers a \type {=}, \type {<} or \type {>}
is used. In \LUAMETATEX\ you can negate such a comparison by \type {!}, as in
\type {!=}, \type {!<} or \type {!>}. Multiple \type {!} flip that state.

In addition to these \ASCII\ combinations, we also support some \UNICODE\
variants. The extra comparison options are:

\starttabulate[|l|c|c|l|]
\DB character      \BC                     \BC            \BC operation         \NC \NR
\TB
\NC \type {0x2208} \NC $\tocharacter"2208$ \NC            \NC element of        \NC \NR
\NC \type {0x2209} \NC $\tocharacter"2209$ \NC            \NC not element of    \NC \NR
\NC \type {0x2260} \NC $\tocharacter"2260$ \NC \type {!=} \NC not equal         \NC \NR
\NC \type {0x2264} \NC $\tocharacter"2264$ \NC \type {!>} \NC less equal        \NC \NR
\NC \type {0x2265} \NC $\tocharacter"2265$ \NC \type {!<} \NC greater equal     \NC \NR
\NC \type {0x2270} \NC $\tocharacter"2270$ \NC            \NC not less equal    \NC \NR
\NC \type {0x2271} \NC $\tocharacter"2271$ \NC            \NC not greater equal \NC \NR
\LL
\stoptabulate
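
A few \ASCII\ cases as a sketch, using the scratch counter that is also used
elsewhere in this chapter; the texts in the branches describe when a branch is
taken:

\starttyping
\scratchcounter 10

\ifnum\scratchcounter !=  5 not five      \fi
\ifnum\scratchcounter !<  10 at least ten \fi % not less, so greater or equal
\ifnum\scratchcounter !>  10 at most ten  \fi % not greater, so less or equal
\ifnum\scratchcounter !!= 10 exactly ten  \fi % two negations cancel out
\stoptyping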

\stopsubsection

\startsubsection[title={\prm{ifzeronum}, \prm {ifzerodim}}]

\topicindex {conditions+numbers}
\topicindex {conditions+dimensions}

Their name tells what they test for: zero (point) values.
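
A minimal sketch, assuming the usual scratch registers:

\starttyping
\scratchcounter 0
\scratchdimen   0pt

\ifzeronum\scratchcounter zero \else not zero \fi % like \ifnum\scratchcounter=0
\ifzerodim\scratchdimen   zero \else not zero \fi % like \ifdim\scratchdimen=0pt
\stoptyping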

\stopsubsection

\startsubsection[title={\prm{ifcmpnum}, \prm {ifcmpdim}, \prm {ifnumval}, \prm
{ifdimval}, \prm {ifchknum} and \prm {ifchkdim}}]

\topicindex {conditions+numbers}
\topicindex {conditions+dimensions}
\topicindex {numbers}
\topicindex {dimensions}

New are the ones that compare two numbers or dimensions:

\startbuffer
\ifcmpnum 5 8 less \or equal \else more \fi
\ifcmpnum 5 5 less \or equal \else more \fi
\ifcmpnum 8 5 less \or equal \else more \fi
\stopbuffer

\typebuffer \blank {\tt \getbuffer} \blank

and

\startbuffer
\ifcmpdim 5pt 8pt less \or equal \else more \fi
\ifcmpdim 5pt 5pt less \or equal \else more \fi
\ifcmpdim 8pt 5pt less \or equal \else more \fi
\stopbuffer

\typebuffer \blank {\tt \getbuffer} \blank

There are also some number and dimension tests. All four expose the \type {\else}
branch when there is an error, but two also report if the number is less, equal
or more than zero.

\startbuffer
\ifnumval  -123  \or < \or = \or > \or ! \else ? \fi
\ifnumval     0  \or < \or = \or > \or ! \else ? \fi
\ifnumval   123  \or < \or = \or > \or ! \else ? \fi
\ifnumval   abc  \or < \or = \or > \or ! \else ? \fi

\ifdimval -123pt \or < \or = \or > \or ! \else ? \fi
\ifdimval    0pt \or < \or = \or > \or ! \else ? \fi
\ifdimval  123pt \or < \or = \or > \or ! \else ? \fi
\ifdimval  abcpt \or < \or = \or > \or ! \else ? \fi
\stopbuffer

\typebuffer \blank {\tt \getbuffer} \blank

\startbuffer
\ifchknum  -123  \or okay \else bad \fi
\ifchknum     0  \or okay \else bad \fi
\ifchknum   123  \or okay \else bad \fi
\ifchknum   abc  \or okay \else bad \fi

\ifchkdim -123pt \or okay \else bad \fi
\ifchkdim    0pt \or okay \else bad \fi
\ifchkdim  123pt \or okay \else bad \fi
\ifchkdim  abcpt \or okay \else bad \fi
\stopbuffer

\typebuffer \blank {\tt \getbuffer} \blank

The last checked values are available in \prm {lastchknum} and \prm {lastchkdim}.
These don't obey grouping.

The two primitives \prm {ifchkdimension} and \prm {ifchknumber} are like \prm
{ifchkdim} and \prm {ifchknum} but are more rigorous: the short ones quit
scanning at a match, after which anything can follow, while the long
variants don't accept trailing garbage.

\stopsubsection

\startsubsection[title={\prm {ifmathstyle} and \prm {ifmathparameter}}]

These two are variants on \prm {ifcase} where the first one operates on values
ranging from zero (display style) to seven (cramped script script style) and
the second one can have three values: a parameter is zero, has a value or is
unset. The \type {\ifmathparameter} primitive takes a proper parameter name and a
valid style identifier (a primitive identifier or number). The \type
{\ifmathstyle} primitive is equivalent to \type {\ifcase \mathstyle}.
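
As a sketch of both tests: \type {\MyStyleName} is a hypothetical macro, the
order of the branches follows the numbering described above, and \type
{\Umathquad} is merely picked as an example of a math parameter.

\starttyping
\def\MyStyleName
  {\ifmathstyle
     display\or
     cramped display\or
     text\or
     cramped text\or
     script\or
     cramped script\or
     script script\or
     cramped script script\else
     unknown\fi}

$\MyStyleName ^{\MyStyleName}$

\ifmathparameter\Umathquad\displaystyle
    zero \or has a value \or unset
\fi
\stoptyping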

\stopsubsection

\startsubsection[title={\prm {ifempty}}]

This primitive tests for the following token (control sequence) having no
content. Assuming that \type {\empty} is indeed empty, the following two are
equivalent:

\starttyping
\ifempty\whatever
\ifx\whatever\empty
\stoptyping

There is no real performance gain here, it's more one of these extensions that
lead to less clutter in tracing.

\stopsubsection

\startsubsection[title={\prm {ifrelax}}]

This primitive complements \type {\ifdefined}, \type {\ifempty} and \type
{\ifcsname} so that we have all reasonable tests directly available.
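
In its simplest form, with \type {\MyMacro} as a made|-|up name, the test is
comparable to \type {\ifx\MyMacro\relax}:

\starttyping
\let\MyMacro\relax

\ifrelax\MyMacro
    relaxed
\else
    something else
\fi
\stoptyping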

\stopsubsection

\startsubsection[title={\prm {ifboolean}}]

This primitive tests for non|-|zero, so the next variants are similar:

\starttyping
       \ifcase   <integer>.F.\else .T.\fi
\unless\ifcase   <integer>.T.\else .F.\fi
       \ifboolean<integer>.T.\else .F.\fi
\stoptyping

\stopsubsection

\startsubsection[title={\prm {iftok} and \prm {ifcstok}}]

\topicindex {conditions+tokens}
\topicindex {tokens}

Comparing tokens and macros can be done with \type {\ifx}. Two extra tests are
provided in \LUAMETATEX:

\startbuffer
\def\ABC{abc} \def\DEF{def} \def\PQR{abc} \newtoks\XYZ \XYZ {abc}

\iftok{abc}{def}\relax  (same) \else [different] \fi
\iftok{abc}{abc}\relax  [same] \else (different) \fi
\iftok\XYZ {abc}\relax  [same] \else (different) \fi

\ifcstok\ABC \DEF\relax (same) \else [different] \fi
\ifcstok\ABC \PQR\relax [same] \else (different) \fi
\ifcstok{abc}\ABC\relax [same] \else (different) \fi
\stopbuffer

\typebuffer \startpacked[blank] {\tt\nospacing\getbuffer} \stoppacked

You can check if a macro is defined as protected with \type {\ifprotected} while
frozen macros can be tested with \type {\iffrozen}. A provisional \type
{\ifusercmd} test will check if a command is defined at the user level (and this
one might evolve).

\stopsubsection

\startsubsection[title={\prm {ifhastok}, \prm {ifhastoks}, \prm {ifhasxtoks} and \prm {ifhaschar}}]

\topicindex {conditions+tokens}
\topicindex {tokens}

The first three test primitives run over a token list in order to encounter a
single token or a sequence. The \type {x} variant applies expansion.

\startbuffer
\def\ab {ab}
\def\abc{abc}
\ifhastok  1    {12} Y\else N\fi
\ifhastoks {ab} {abc}Y\else N\fi
\ifhastoks {ab} {\abc}Y\else N\fi
\ifhastoks {\ab}{\abc}Y\else N\fi
\ifhasxtoks{ab} {\abc}Y\else N\fi
\ifhastok  3    {12} Y\else N\fi
\ifhastoks {de} {abc}Y\else N\fi
\stopbuffer

\typebuffer \startpacked[blank] {\tt\nospacing\getbuffer} \stoppacked

The \prm {ifhaschar} primitive differs from \prm {ifhastok} in that it handles
nested balanced \quote {lists}, as in:

\startbuffer
\ifhastok  a  {abc}Y\else N\fi
\ifhaschar a  {abc}Y\else N\fi
\ifhastok  a{{a}bc}Y\else N\fi
\ifhaschar a{{a}bc}Y\else N\fi
\stopbuffer

\typebuffer \startpacked[blank] {\tt\nospacing\getbuffer} \stoppacked

\stopsubsection

\startsubsection[title={\prm {ifarguments}, \prm {ifparameters} and \prm {ifparameter}}]

These are part of the extended macro argument parsing features. The \prm
{ifarguments} condition is like an \prm {ifcase} where the number is the
picked up number of arguments. The number reflects the {\em last} count, so
successive macro expansions will adapt the value. The \prm {ifparameters}
counts till the first empty parameter and the \prm {ifparameter} (singular)
takes a parameter reference (like \type {#2}) and again is an \prm {ifcase}
where zero means a bad reference, one a non|-|empty argument and two an empty
one. A typical usage is:

\starttyping
\def\foo#1#2%
  {\ifparameter#1\or one\fi
   \ifparameter#2\or two\fi}
\stoptyping

No expansion of arguments takes place here but you can use a test like this:

\starttyping
\def\foo#1#2%
  {\iftok{#1}{}\else one\fi
   \iftok{#2}{}\else two\fi}
\stoptyping


\stopsubsection

\startsubsection[title={\prm {ifcondition}}]

\topicindex {conditions}

This is a somewhat special one. When you write macros, conditions need to be
properly balanced in order to let \TEX's fast branch skipping work well. This new
primitive is basically a no|-|op flagged as a condition so that the scanner can
recognize it as an if|-|test. However, when a real test takes place the work is
done by what follows, in the next example \tex {something}.

\starttyping
\unexpanded\def\something#1#2%
  {\edef\tempa{#1}%
   \edef\tempb{#2}%
   \ifx\tempa\tempb}

\ifcondition\something{a}{b}%
    \ifcondition\something{a}{a}%
        true 1
    \else
        false 1
    \fi
\else
    \ifcondition\something{a}{a}%
        true 2
    \else
        false 2
    \fi
\fi
\stoptyping

If you are familiar with \METAPOST, this is a bit like \type {vardef} where the macro
has a return value. Here the return value is a test.

Experiments with something like \type {\ifdef} actually worked ok but were
rejected because in the end it gave no advantage so this generic one has to do.
The \type {\ifcondition} test is basically a no|-|op except when branches are
skipped. However, when a test is expected, the scanner gobbles it and the next
test result is used. Here is another example:

\startbuffer
\def\mytest#1%
  {\ifabsdim#1>0pt\else
     \expandafter \unless
   \fi
   \iftrue}

\ifcondition\mytest{10pt}\relax non-zero \else zero \fi
\ifcondition\mytest {0pt}\relax non-zero \else zero \fi
\stopbuffer

\typebuffer \blank {\tt \getbuffer} \blank

The last expansion in a macro like \type {\mytest} has to be a condition and here
we use \type {\unless} to negate the result.

\stopsubsection

\startsubsection[title={\prm {orelse} and \prm {orunless}}]

Sometimes you have successive tests that, when laid out in the source, lead to
deep trees. The \type {\ifcase} test is an exception. Experiments with \type
{\ifcasex} worked out fine but eventually were rejected because we have many
tests so it would add a lot. As \LUAMETATEX\ permitted more experiments,
eventually an alternative was cooked up, one that has some restrictions but is
relatively lightweight. It goes like this:

\starttyping
\ifnum\count0<10
    less
\orelse\ifnum\count0=10
    equal
\else
    more
\fi
\stoptyping

The \type {\orelse} has to be followed by one of the if test commands, except
\type {\ifcondition}, and there can be an \type {\unless} in front of such a
command. These restrictions make it possible to stay in the current condition
(read: at the same level). If you need something more complex, using \type
{\orelse} is probably unwise anyway. In case you wonder about performance, there
is a little more checking needed when skipping branches but that can be
neglected. There is some gain due to staying at the same level but that is only
measurable when you run tens of millions of complex tests and in that case it is
very likely to drown in the real action. It's a convenience mechanism, in the
sense that it can make your code look a bit easier to follow.

There is a nice side effect of this mechanism. When you define:

\starttyping
\def\quitcondition{\orelse\iffalse}
\stoptyping

you can do this:

\starttyping
\ifnum\count0<10
    less
\orelse\ifnum\count0=10
    equal
    \quitcondition
    indeed
\else
    more
\fi
\stoptyping

Of course it is only useful at the right level, so you might end up with cases like

\starttyping
\ifnum\count0<10
    less
\orelse\ifnum\count0=10
    equal
    \ifnum\count2=30
        \expandafter\quitcondition
    \fi
    indeed
\else
    more
\fi
\stoptyping

The \prm {orunless} variant negates the next test, just like \prm {unless}. In
some cases these commands look at the next token to see if it is an if|-|test so
a following negation will not work (read: making that work would complicate the
code and hurt efficiency too). Side note: interesting is that in \CONTEXT\ we
hardly use this kind of negation.
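
As a sketch, the first example of this section can also be written with a
negated second test. When the first test fails the counter is at least ten, so
\quote {not larger than ten} then means equal:

\starttyping
\ifnum\count0<10
    less
\orunless\ifnum\count0>10
    equal
\else
    more
\fi
\stoptyping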

\stopsubsection

\startsubsection[title={\prm {ifflags}}]

This checker deals with control sequences. You can check if a command is a
protected one, that is, defined with the \type {\protected} prefix. A command is
frozen when it has been defined with the \type {\frozen} prefix. Beware: only
macros can be frozen. A user command is a command that is not part of the
predefined set of commands. This is an experimental command. The flag values can
be queried with \typ {tex.getflagvalues}.

\stopsubsection

\stopsection

\startsection[title={Control and debugging}]

\startsubsection[title={Tracing}]

\topicindex {tracing}

If \prm {tracingonline} is larger than~2, the node list display will also print
the node number of the nodes as well as set attributes (these can be made verbose
by a callback). We have only a generic whatsit but again a callback can be used
to provide detail. So, when a box is shown in \CONTEXT\ you will see quite a lot
more than in other engines. Because nodes have more fields, more is shown anyway,
and for nodes that have sublists (like discretionaries) these are also shown. All
that could have been delegated to \LUA\ but it felt wrong not to make that a core
engine feature.

The \prm {tracingpenalties} parameter triggers the line break routine to report
the applied interline penalties to the output.

When \prm {tracingcommands} is larger than 3 the mode switch will not be
prefixed to the \type {{command}} but get its own \type {[line]}.

When \prm {tracinghyphenation} is set to 1 duplicate patterns are reported (in
\CONTEXT\ we default to that) and higher values will also show details about the
\LUA\ hyphenation (exception) feedback loop discussed elsewhere.

When set to 1 the \prm {tracingmath} variable triggers the reporting of the mode
(inline or display) in which an mlist is processed. Other new tracing commands
are discussed where the mechanisms that they relate to are introduced.

The \prm {tracingnodes} variable makes sure that when a node list is reported the
node numbers are also shown. This is only useful when you have callbacks that
access nodes.

\starttabulate[|l|p|]
\DB value \BC effect \NC \NR
\TB
\NC 1 \NC show node numbers in lists \NC \NR
\NC 2 \NC also show numbers of attribute nodes \NC \NR
\NC 3 \NC also show glue spec node numbers \NC \NR
\LL
\stoptabulate

When the \prm {shownodedetails} variable is set to a value larger than zero and a
node is shown (in a list) then more details will be revealed. This can be rather
verbose because in \LUAMETATEX\ nodes carry more properties than in traditional
\TEX\ and \LUATEX. A value larger than one will also show details of attributes
that are bound to nodes.

The \prm {tracinglevels} variable is a bitset and offers the following features:

\starttabulate[|l|p|]
\DB value \BC effect \NC \NR
\TB
\NC 1 \NC show group level \NC \NR
\NC 2 \NC show input level \NC \NR
\NC 4 \NC show catcode regime \NC \NR
\LL
\stoptabulate

So a value of~7 shows them all. In \CONTEXT\ we set this variable to~3 which
gives a rather verbose log when tracing is on but in the end it's not that bad
because using some of the newer programming related primitives can save tracing.
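
Because the values are bits they can be added, so the following settings are
possible (a sketch, the choice of combinations is arbitrary):

\starttyping
\tracinglevels \numexpr 1 + 2 \relax % group and input level: the value 3
\tracinglevels \numexpr 1 + 4 \relax % group level and catcode regime
\tracinglevels 7                     % all three
\stoptyping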

The \prm {tracinglists} variable will show some of the (intermediate) lists that
get processed. It is there mainly for development but might evolve.

Because in \LUATEX\ the saving and restoring of locally redefined macros and set
variables is optimized a bit in order to prevent redundant stack usage, there
will be less tracing visible.

Also, because we have a more extensive macro argument parser, a fast path (and
less storage demands) for macros with no arguments, and flags that can be set for
macros, the way macros are traced can be different in details (we therefore have
for instance \prm {meaningfull} (double l's indeed) and \prm {meaningless} as
variants of \prm {meaning} as well as \prm {meaningasis} for a more literal
alternative). The \prm {meaningfull} and \prm {meaningless} variants show no body
but do show the preamble when we have arguments.

\stopsubsection

% \startsubsection[title={\prm {lastnodetype}, \prm {lastnodesubtype}, \prm
% {currentiftype} and \prm {internalcodesmode}.}]
%
% The \ETEX\ command \type {\lastnodetype} is limited to some nodes. When the
% parameter \type {\internalcodesmode} is set to a non|-|zero value the normal
% (internally used) numbers are reported. The same is true for \type
% {\currentiftype}, as we have more conditionals and also use a different order.
% The \type {\lastnodesubtype} is a bonus.
%
% \stopsubsection

\startsubsection[title={\prm {lastnodetype}, \prm {lastnodesubtype}, \prm
{currentiftype}}]

The \ETEX\ command \prm {lastnodetype} returns the node codes as used in the
engine. You can query the numbers at the \LUA\ end if you need the actual values.
The parameter \type {\internalcodesmode} is no longer provided as a compatibility
switch because \LUATEX\ has more, and some different, nodes and it makes no sense
to be incompatible with the \LUA\ end of the engine. The same is true for \prm
{currentiftype}, as we have more conditionals and also use a different order.
The \prm {lastnodesubtype} is a bonus and again reports the codes used
internally. During development these might occasionally change, but eventually
they will be stable.

\stopsubsection

\startsubsection[title={\prm {lastboundary} and \prm {unboundary}}]

There are \prm {lastpenalty}, \prm {lastskip}, \prm {lastkern} and \prm {lastbox}
primitives and \LUAMETATEX\ also offers \prm {lastboundary} which gives the value
assigned to a user boundary node. This means that we also have a \prm
{unboundary} to complement the other \tex {un...} primitives.
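
A minimal sketch of how this could be used, assuming that \type {\boundary}
injects a user boundary node with the given value; the value 123 is arbitrary:

\starttyping
\hbox \bgroup
    text\boundary 123
    \ifnum\lastboundary=123
        \unboundary % remove the boundary node again
    \fi
\egroup
\stoptyping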

\stopsubsection

\startsubsection[title=Nodes]

\topicindex {nodes}

The \ETEX\ primitive \prm {lastnodetype} is not honest in reporting the
internal numbers as it uses its own values. In \LUATEX\ you can set \type
{\internalcodesmode} to a non|-|zero value to get the real id's instead but, as
mentioned before, in \LUAMETATEX\ that switch is gone and the internal codes are
reported. In addition there is \prm {lastnodesubtype}.

Another last one is \prm {lastnamedcs} which holds the last match but this one
should be used with care because one never knows if in the meantime something
else \quote {last} has been seen.

\stopsubsection

\stopsection

\startsection[title=Kerns and penalties]

\startsubsection[title=\prm {hkern} and \prm {vkern}]

\topicindex {kerns}

These two primitives complement \prm {hskip} and \prm {vskip} and force the right
mode when issued. Contrary to the skips, internally we still have a common kern
command code but that is not something the user has to worry about.
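
The mode forcing can be sketched as follows; compare this with \type {\kern},
which stays in the current mode:

\starttyping
\hkern 10pt % in vertical mode this starts a paragraph, like \hskip would
\vkern 10pt % in horizontal mode this ends the paragraph, like \vskip would
\stoptyping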

\stopsubsection

\startsubsection[title=\prm {hpenalty} and \prm {vpenalty}]

\topicindex {penalties}

Like the kern and skip related primitives mentioned in the previous section,
these two primitives are there for consistency: they force the right (related)
mode. (Sometimes being a bit more explicit is cleaner.)

\stopsubsection

\stopsection

\startsection[title=Scanning]

\startsubsection[title=Keywords]

\topicindex {keywords}
\topicindex {scanning+keywords}

Some primitives accept one or more keywords and \LUAMETATEX\ adds some more. In
order to deal with this efficiently the keyword scanner has been optimized, where
even the context was taken into account. As a result the scanner became quite a
bit faster. This kind of optimization was a gradual process that eventually ended
up in what we have now. In traditional \TEX\ (and also \LUATEX) the order of
keywords is sometimes mixed and sometimes prescribed. In most cases only one
occurrence is permitted. So, for instance, this is valid in \LUATEX:

\starttyping
\hbox attr 123 456 attr 123 456 spread 10cm { }
\hrule width 10cm depth 3mm
\hskip 3pt plus 2pt minus 1pt
\stoptyping

The \type {attr} comes before the \type {spread}, rules can have multiple mixed
dimension specifiers, and in glue the optional \type {minus} part always comes
last. The last two commands are famous for look ahead side effects which is why
macro packages will end them with something that is not a keyword, like \type {\relax},
when needed.
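
The look ahead side effect can be illustrated as follows. The macro names are
made up for this example; the behaviour is that of any \TEX\ engine:

\starttyping
\def\BadSkip {\hskip 10pt}       % keyword scanning continues after the macro
\def\GoodSkip{\hskip 10pt\relax} % \relax ends the glue specification

\BadSkip  plus some text % "plus" is seen as keyword, so a dimension is expected
\GoodSkip plus some text % the word "plus" is just typeset
\stoptyping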

In \LUAMETATEX\ the following is okay. Watch the few more keywords in box and
rule specifications.

\starttyping
\hbox reverse to 10cm attr 123 456 orientation 4 xoffset 10pt spread 10cm { }
\hrule xoffset 10pt width 10cm depth 3mm
\hskip 3pt minus 1pt plus 2pt
\stoptyping

Here the order is not prescribed and, as demonstrated with the box specifier, for
instance dimensions (specified by \type {to} or \type {spread}) can be overloaded
by later settings. In case you wonder if that breaks compatibility: in some way
it does but bad or sloppy keyword usage breaks a run anyway. For instance \type
{minuscule} results in \type {minus} with no dimension being seen. So, in the end
the user should not notice it and when a user does, the macro package already
had an issue that had to be fixed.

\stopsubsection

\startsubsection[title={\prm {norelax}}]

\topicindex{relaxing}

There are a few cases where the \TEX\ scanner skips over spaces and \prm {relax} as
well as quits on a \prm {relax}, in which case it gets pushed back. An example is
given below:

\startbuffer
\edef\TestA{\ifnum1=1\relax   Y\else N\fi} \meaningasis\TestA
\edef\TestB{\ifnum1=1\norelax Y\else N\fi} \meaningasis\TestB
\stopbuffer

\typebuffer

The second line also contains a sentinel but this time we use \prm {norelax}
which will not be pushed back. So, this feature is just a trick to get rid of (in
itself reasonable) side effects.

\startlines\getbuffer \stoplines

\stopsubsection

\startsubsection[title={\prm {ignorepars}}]

This primitive is like \prm {ignorespaces} but also skips paragraph ending
commands (normally \prm {par} and empty lines).
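
A possible application, with hypothetical macro names, is an environment that
should start typesetting right away even when the user leaves an empty line
after the opening command:

\starttyping
\def\StartNote{\begingroup\it\ignorepars}
\def\StopNote {\endgroup}

\StartNote

This text follows an empty line that is skipped.
\StopNote
\stoptyping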

\stopsubsection

\startsubsection[title={\prm {futureexpand}, \prm {futureexpandis}, \prm {futureexpandisap}}]

\topicindex{expansion+future}

These commands are used as:

\starttyping
\futureexpand\sometoken\whenfound\whennotfound
\stoptyping

When there is no match and a space was gobbled a space will be put back. The
\type {is} variant doesn't do that while the \type {isap} variant even skips
\type {\par} tokens. These suffixes stand for \quote {ignorespaces} and \quote
{ignorespacesandpars}.
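
A sketch with two made up macros: the upcoming token is compared with \type {<}
and, depending on the outcome, the first or the second macro gets expanded. We
assume here that the checked token stays in the input so that the chosen macro
can pick it up as delimiter:

\starttyping
\def\VariantOne<#1>{(#1)}
\def\VariantTwo#1{[#1]}

\futureexpand<\VariantOne\VariantTwo<one>
\futureexpand<\VariantOne\VariantTwo{two}
\stoptyping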

\stopsubsection

\stopsection

\startsection[title=Macros]

\startsubsection[title={\prm {lettonothing} and \prm {glettonothing}}]

This primitive is equivalent to:

\starttyping
\protected\def\lettonothing#1{\def#1{}}
\stoptyping

and although it might feel faster (only measurable with millions of calls) it's
mostly there because it is easier on tracing (less clutter). An advantage over
letting to an empty predefined macro is also that in tracing we keep seeing the
name (relaxing would show the relax equivalent).
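
Usage is simple; the macro name is arbitrary and we assume that the \type {g}
variant is the global one, as with \type {\gdef}:

\starttyping
\lettonothing \MyBuffer % same effect as \def\MyBuffer{}
\glettonothing\MyBuffer % the global variant
\stoptyping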

\stopsubsection

\startsubsection[title={\prm {glet}}]

This primitive is similar to:

\starttyping
\protected\def\glet{\global\let}
\stoptyping

but faster (only measurable with millions of calls) and probably more convenient
(after all we also have \type {\gdef}).
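
A short illustration with made up names; the assignment survives the group
because it is global:

\starttyping
\def\MyValue{outer}
\bgroup
    \def\MyLocal{inner}
    \glet\MyValue\MyLocal
\egroup
\MyValue % inner
\stoptyping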

\stopsubsection

\startsubsection[title={\prm {defcsname}, \prm {edefcsname}, \prm {gdefcsname} and \prm {xdefcsname}}]

Although we can implement these primitives easily using macros it makes sense,
given the popularity of \prm {csname} to have these as primitives. It also saves
some \prm {expandafter} usage and it looks a bit better in the source.

\starttyping
\gdefcsname foo\endcsname{oof}
\stoptyping

\stopsubsection

\startsubsection[title={\prm {letcsname} and \prm {gletcsname}}]

These can also be implemented using macros but again they are natively provided
by the engine for the same reasons: less code and less tracing clutter.

\starttyping
\gletcsname foo\endcsname \relax
\stoptyping

\stopsubsection

\startsubsection[title={\prm{cdef}, \prm {cdefcsname} and \prm {constant}}]

These primitives are like \prm {edef} and \prm {edefcsname} but tag the macro as
being kind of simple, which means that in some scenarios they are serialized in a
fast way, not going through the expansion machinery. This is actually an
experiment but it will stay. The \prm {constant} prefix can be used with other
definition primitives instead.
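
The following sketch shows the three forms next to each other. The names are
hypothetical and the comments reflect our reading of the description above:

\starttyping
\def\MyName{world}

\cdef\MyGreeting{hello \MyName}                    % expanded, like \edef
\cdefcsname MyOtherGreeting\endcsname{hi \MyName}  % the \csname variant
\constant\def\MyFixed{fixed}                       % prefix with another definer
\stoptyping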

\stopsubsection

\startsubsection[title={\prm {csstring}, \prm {begincsname} and \prm {lastnamedcs}}]

These are somewhat special. The \prm {csstring} primitive is like
\prm {string} but it omits the leading escape character. This can be
somewhat more efficient than stripping it afterwards.
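
With the default escape character the difference looks like this:

\starttyping
\string  \relax % \relax
\csstring\relax % relax
\stoptyping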

The \prm {begincsname} primitive is like \prm {csname} but doesn't create
a relaxed equivalent when there is no such name. It is equivalent to

\starttyping
\ifcsname foo\endcsname
  \csname foo\endcsname
\fi
\stoptyping

The advantage is that it saves a lookup (don't expect much speedup) but more
important is that it avoids using the \prm {if} test. The \prm {lastnamedcs}
is one that should be used with care. The above example could be written as:

\starttyping
\ifcsname foo\endcsname
  \lastnamedcs
\fi
\stoptyping

This is slightly more efficient than constructing the string twice (deep down in
\LUATEX\ this also involves some \UTF8 juggling), but probably more relevant is
that it saves a few tokens and can make code a bit more readable.

Active characters are stored in the hash with a special prefix sequence prepended
to the character: \prm {csactive} or the never used \UTF\ representation of \type
{U+FFFF}.

\stopsubsection

\startsubsection[title={\prm {futuredef} and \prm {futurecsname}}]

This is just the definition variant of \prm {futurelet} and a simple example
shows the difference:

\startbuffer
\def\whatever{[\next:\meaning\next]}
\futurelet\next\whatever A
\futuredef\next\whatever B
\stopbuffer

\typebuffer

\getbuffer

The next one was more an experiment that then stayed around, just to see what
surprising abuse of this primitive will happen:

\startbuffer
\def\whateveryes{[YES]}
\def\whatevernop{[NOP]}
\let\whatever\undefined
\futurecsname\whatevernop whatever\endcsname
\futurecsname\whatevernop whateveryes\endcsname
\stopbuffer

\typebuffer

When the assembled control sequence is undefined the given one will be expanded,
a weird one, right? I will probably apply it some day in cases where I want less
tracing and a more direct expansion of an assembled name.

\getbuffer

Here is a usage example:

\starttyping
\xdef\Whatever{\futurecsname\whatevernop    whatever\endcsname}
\xdef\Whatever{\futurecsname\whateveryes whateveryes\endcsname}
\xdef\Whatever{\ifcsname    whatever\endcsname\lastnamedcs\else\whatevernop\fi}
\xdef\Whatever{\ifcsname whateveryes\endcsname\lastnamedcs\else\whatevernop\fi}
\xdef\Whatever{\ifcsname    whatever\endcsname\csname    whatever\endcsname\else\whatevernop\fi}
\xdef\Whatever{\ifcsname whateveryes\endcsname\csname whateveryes\endcsname\else\whatevernop\fi}
\stoptyping

The timings for one million times defining each of these definitions are 0.277,
0.313, 0.310, 0.359, 0.352 and 0.573 seconds (on a 2018 Dell 7250 Precision
laptop with mobile E3-1505M v6 processor), so there is a little gain here, but of
course in practice no one will notice that because not that many such macros are
defined (or used).

\stopsubsection

\startsubsection[title=Prefixes]

Quite some primitive usage can be preceded by a prefix. For instance assignments
can be \prm {global} and macros can be defined \prm {protected}. In \LUAMETATEX\
we have more prefixes, like \prm {tolerant} that signals a different way of
interpreting macro arguments and \type {permanent} that flags a definition in
a way that, when overload protection is enabled, will prevent redefinition.
Prefixes like \prm {immediate} and its counterpart \prm {deferred} are backend
related and, as we don't have one in the engine, are not doing much. They are
just intercepted and passed to e.g.\ some \LUA\ function called so that one can
use them to construct additional (\LUA\ based) pseudo primitives. Various
prefixes are discussed elsewhere.

\stopsubsection

\startsubsection[title=Arguments]

\topicindex {macros+arguments}

Again this is experimental and (used and) discussed in documents that come with the
\CONTEXT\ distribution. When defining a macro you can do this:

\starttyping
\def\foo(#1)#2{...}
\stoptyping

Here the first argument between parentheses is mandatory. But the magic
prefix \prm {tolerant} makes that limitation go away:

\starttyping
\tolerant\def\foo(#1)#2{...}
\stoptyping

A variant is this:

\starttyping
\tolerant\def\foo(#1)#*(#2){...}
\stoptyping

Here we have two optional arguments, possibly separated by spaces. There are
more parsing options:

\starttabulate[|T|i2l|]
\FL
\NC +   \NC keep the braces \NC \NR
\NC -   \NC discard and don't count the argument \NC \NR
\NC /   \NC remove leading and trailing spaces and pars \NC \NR
\NC =   \NC braces are mandatory \NC \NR
\NC _   \NC braces are mandatory and kept \NC \NR
\NC ^   \NC keep leading spaces \NC \NR
\ML
\NC 1-9 \NC an argument \NC \NR
\NC 0   \NC discard but count the argument \NC \NR
\ML
\NC *   \NC ignore spaces \NC \NR
\NC .   \NC ignore pars and spaces \NC \NR
\NC ,   \NC push back space when no match \NC \NR
\ML
\NC :   \NC pick up scanning here  \NC \NR
\NC ;   \NC quit scanning \NC \NR
\LL
\stoptabulate

For the moment we leave it to your imagination what these options do. Most of
them probably only make sense when you write more complex macros. Just try to imagine
what this does:

\starttyping
\permanent\tolerant\global\protected\def\foo(#1)#*#;[#2]#:#3{...}
\stoptyping

Of course complex combinations can be confusing because after all \TEX\ is
parsing for (multi|-|token) delimiters and will happily gobble the whole file if
you are not careful. You can quit scanning with \prm {ignorearguments} if you
want:

\starttyping
\mymacro 123\ignorearguments
\stoptyping

which of course only makes sense when used in a nested call where already
picked up arguments are processed further. A not (yet) discussed feature of the
parser is that it will happily skip tokens that have the (probably seldom used)
ignored characters property.

When you use tracing or see error messages, arguments defined using for instance
\type {#=} will have their usual number in the macro body, so you need to keep
track of the numbers.
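
The numbering can be sketched as follows. The macro names are made up and the
comments follow from our reading of the table above, so take them as
expectations rather than guarantees:

\starttyping
\def\SkipMiddle #1#-#2{(#1)(#2)} % the discarded argument gets no number
\def\CountMiddle#1#0#3{(#1)(#3)} % the discarded argument still counts as 2
\def\WithBraces #1#={(#1)(#2)}   % the argument picked up by #= is simply #2

\SkipMiddle {a}{b}{c} % (a)(c)
\CountMiddle{a}{b}{c} % (a)(c)
\WithBraces {a}{b}    % (a)(b)
\stoptyping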

All this is rather easy on the engine and although it might have a little impact
on performance this has been compensated by some more efficiency in the macro
parser and engine in general and of course you can gain back some by using these
features.

\stopsubsection

\startsubsection[title={\prm {parametermark}}]

The meaning of primitive \prm {parametermark} is equivalent to \type {#} in a macro
definition, just like \prm {alignmark} is in an alignment. It can be used to circumvent
catcode issues. The normal \quotation {duplicate them when nesting} rules apply.

\startbuffer
\def\foo\parametermark1%
  {\def\oof\parametermark\parametermark1%
     {[\parametermark1:\parametermark\parametermark1]}}
\stopbuffer

\typebuffer \getbuffer

Here \type {\foo{X}\oof{Y}} gives: \foo{X}\oof{Y}.

\stopsubsection

\startsubsection[title={\prm {lastarguments} and \prm {parametercount}}]

\topicindex{arguments+numberof}

There are two state variables that refer to the number of read arguments. An
example can show the difference:

\startbuffer
\tolerant\def\foo[#1]#*[#2]{[\the\lastarguments,\the\parametercount]}

\foo[1][2]
\foo[1]
\foo

x: \foo[1][2]
x: \foo[1]
x: \foo
\stopbuffer

\typebuffer

What you get actually depends on the macro package. When for instance \prm
{everypar} has some value that results in a macro being expanded, the numbers
reported can refer to the most recent macro because serializing the number can
result in entering horizontal mode.

\startlines
\getbuffer
\stoplines

The \prm {lastarguments} returns the most recent global state variable, as with
any \type {\last...} primitive. Because it actually looks at the parameter stack
of the currently expanded macro \prm {parametercount} is more reliable but also
less efficient.

\stopsubsection

\startsubsection[title=Overload protection]

\topicindex {macros+overloading}

There is an experimental overload protection mechanism that we will test for a
while before declaring it stable. The reason for that is that we need to adapt
the \CONTEXT\ code base in order to test its usefulness. Protection is achieved
via prefixes. Depending on the value of the \prm {overloadmode} variable
warnings or errors will be triggered. Examples of usage can be found in some
documents that come with \CONTEXT, so here we just stick to the basics.

\starttyping
\mutable  \def\foo{...}
\immutable\def\foo{...}
\permanent\def\foo{...}
\frozen   \def\foo{...}
\aliased  \def\foo{...}
\stoptyping

A \prm {mutable} macro can always be changed contrary to an \prm {immutable} one.
For instance a macro that acts as a variable is normally \prm {mutable}, while a
constant can best be immutable. It makes sense to define a public core macro as
\prm {permanent}. Primitives start out as \prm {permanent} ones but with a primitive
property instead.

\startbuffer
          \let\relaxone  \relax 1: \meaningfull\relaxone
\aliased  \let\relaxtwo  \relax 2: \meaningfull\relaxtwo
\permanent\let\relaxthree\relax 3: \meaningfull\relaxthree
\stopbuffer

\typebuffer

The \prm {meaningfull} primitive is like \prm {meaning} but reports the
properties too. The \prm {meaningless} companion reports the body of a macro.
Anyway, this typesets:

\startlines \tttf \getbuffer \stoplines

So, the \prm {aliased} prefix copies the properties. Keep in mind that a macro
package can redefine primitives, but \prm {relax} is an unlikely candidate.

There is an extra prefix \prm {noaligned} that flags a macro as being valid
for \prm {noalign} compatible usage (which means that the body must contain that
one). The idea is that we then can do this:

\starttyping
\permanent\protected\noaligned\def\foo{\noalign{...}} % \foo is unexpandable
\stoptyping

that is: we can have protected macros that don't trigger an error in the parser
where there is a look ahead for \prm {noalign} which is why normally protection
doesn't work well. So: we have a macro flagged as permanent (overload protection),
being protected (that is, not expandable by default) and a valid equivalent of
the noalign primitive. Of course we can also apply the \prm {global} and \prm
{tolerant} prefixes here. The complete repertoire of extra prefixes is:

\starttabulate
\HL
\NC \type {frozen}     \NC a macro that has to be redefined in a managed way \NC \NR
\NC \type {permanent}  \NC a macro that had better not be redefined \NC \NR
\NC \type {primitive}  \NC a primitive that normally will not be adapted \NC \NR
\NC \type {immutable}  \NC a macro or quantity that cannot be changed, it is a constant \NC \NR
\NC \type {mutable}    \NC a macro that can be changed no matter how well protected it is \NC \NR
\HL
\NC \type {instance}   \NC a macro marked as (for instance) being generated by an interface \NC \NR
\HL
\NC \type {noaligned}  \NC the macro becomes acceptable as \type {\noalign} alias \NC \NR
\HL
\NC \type {overloaded} \NC when permitted the flags will be adapted \NC \NR
\NC \type {enforced}   \NC all is permitted (but only in zero mode or ini mode) \NC \NR
\NC \type {aliased}    \NC the macro gets the same flags as the original \NC \NR
\HL
\NC \type {untraced}   \NC the macro gets a different treatment in tracing \NC \NR
\HL
\stoptabulate

The not yet discussed \prm {instance} is just a flag with no special meaning
which can be used as classifier. The \prm {frozen} also protects against overload
which brings the number of blockers to four.

To what extent the engine will complain when a property is changed in a way that
violates the flags depends on the parameter \prm {overloadmode}. When this
parameter is set to zero no checking takes place. More interesting are values
larger than zero. If that is the case, when a control sequence is flagged as
mutable, it is always permitted to change. When it is set to immutable one can
never change it. The other flags determine the kind of checking done. Currently
the following overload values are used:

\starttabulate[|l|l|c|c|c|c|c|]
    \NC   \NC         \BC immutable \BC permanent \BC primitive \BC frozen \BC instance \NC \NR
    \NC 1 \NC warning \NC \star     \NC \star     \NC \star     \NC        \NC          \NC \NR
    \NC 2 \NC error   \NC \star     \NC \star     \NC \star     \NC        \NC          \NC \NR
    \NC 3 \NC warning \NC \star     \NC \star     \NC \star     \NC \star  \NC          \NC \NR
    \NC 4 \NC error   \NC \star     \NC \star     \NC \star     \NC \star  \NC          \NC \NR
    \NC 5 \NC warning \NC \star     \NC \star     \NC \star     \NC \star  \NC \star    \NC \NR
    \NC 6 \NC error   \NC \star     \NC \star     \NC \star     \NC \star  \NC \star    \NC \NR
\stoptabulate

The even values (except zero) will abort the run. A value of 255 will freeze this
parameter. At level five and above the \prm {instance} flag is also checked but
no drastic action takes place. We use this to signal to the user that a specific
instance is redefined (of course the definition macros can check for that too).

The \prm {overloaded} prefix can be used to overload a frozen macro. The \prm
{enforced} is more powerful and forces an overload but that prefix is only
effective in ini mode or when it's embedded in the body of a macro or token list
at ini time unless of course at runtime the mode is zero.
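
A sketch of how the flags and the mode interact, with made up macro names. The
comments are what we expect from the table above at level four; the exact
messages depend on the engine and on what the macro package has set up:

\starttyping
\overloadmode 4               % errors for immutable, permanent, primitive, frozen

\permanent\def\MyCore {core}
\frozen   \def\MyStyle{old}
\mutable  \def\MyState{1}

\def\MyState{2}               % fine: mutable
\overloaded\def\MyStyle{new}  % fine: explicit overload of a frozen macro
\def\MyCore{oops}             % error: permanent
\stoptyping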

So much for a short explanation. More details can be found in the \CONTEXT\
documentation where we can discuss it in a more relevant perspective. It must be
noted that this feature only makes sense in a controlled situation, that is: user
modules or macros of unpredictable origin will probably suffer from warnings and
errors when the mode is set to a non|-|zero value. In \CONTEXT\ we're okay unless
of course users redefine instances but there a warning or error is kind of welcome.

There is an extra prefix \prm {untraced} that will suppress the meaning when
tracing so that the macro looks more like a primitive. It is still somewhat
experimental so what gets displayed might change.

The \prm {letfrozen}, \prm {unletfrozen}, \prm {letprotected} and \prm
{unletprotected} primitives do as their names advertise. Of course the \prm
{overloadmode} must be set so that it is permitted.
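
A sketch of their usage, assuming that each of them takes just the control
sequence to be adapted; the macro name is hypothetical:

\starttyping
\def\MyMacro{...}

\letfrozen     \MyMacro % now flagged as frozen
\unletfrozen   \MyMacro % and the flag is removed again
\letprotected  \MyMacro % now protected
\unletprotected\MyMacro % and expandable again
\stoptyping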

\stopsubsection

\startsubsection[title={Swapping meaning}]

\topicindex {macros+swapping}

The \prm {swapcsvalues} will swap the values of two control sequences of the same
type. This is a somewhat tricky feature because it can interfere with grouping.

\startbuffer
\scratchcounterone 1 \scratchcountertwo 2
(\the\scratchcounterone,\the\scratchcountertwo)
\swapcsvalues \scratchcounterone \scratchcountertwo
(\the\scratchcounterone,\the\scratchcountertwo)
\swapcsvalues \scratchcounterone \scratchcountertwo
(\the\scratchcounterone,\the\scratchcountertwo)

\scratchcounterone 3 \scratchcountertwo 4
(\the\scratchcounterone,\the\scratchcountertwo)
\bgroup
\swapcsvalues \scratchcounterone \scratchcountertwo
(\the\scratchcounterone,\the\scratchcountertwo)
\egroup
(\the\scratchcounterone,\the\scratchcountertwo)
\stopbuffer

\typebuffer

We get similar results:

\startlines
\getbuffer
\stoplines

\stopsubsection

\stopsection

\startsection[title=Quantities]

\startsubsection[title={Constants with \prm{integerdef}, \prm {dimensiondef},
\prm {gluespecdef} and \prm {mugluespecdef}}]

It is rather common to store constant values in a register or character
definition.

\starttyping
\newcount\MyConstantA \MyConstantA 123
\newdimen\MyConstantB \MyConstantB 123pt
\chardef \MyConstantC 123
\stoptyping

But in \LUAMETATEX\ we also can do this:

\starttyping
\integerdef    \MyConstantI 456
\dimensiondef  \MyConstantD 456pt
\gluespecdef   \MyConstantG 987pt minus 654pt plus 321pt
\mugluespecdef \MyConstantM 3mu plus 2mu minus 1mu
\stoptyping

These are stored as efficiently as a register but don't occupy a register slot.
They can be set as above, need \prm {the} for serialization and are seen as a
valid number or dimension when needed. They do behave like registers so one can
use for instance \prm {advance} and assign values but keep in mind that an alias
(made by for instance \prm {let}) clones the value and that clone will not follow
a change in the original. For that, registers can be used because there we use an
indirect reference.
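
The cloning can be made visible with a small test. The name \type {\MyAlias} is
made up and the result in the comment is what we expect when the alias behaves
as described:

\starttyping
\integerdef\MyConstantI 456
\let\MyAlias\MyConstantI        % the alias gets a copy of the value
\advance\MyConstantI by 4       % changes the original only
(\the\MyConstantI,\the\MyAlias) % (460,456)
\stoptyping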

Experiments with constant strings made the engine source more complex than I
wanted so that feature was rejected. Of course we can use the prefixes mentioned
in a previous section.

\stopsubsection

\startsubsection[title={Getting internal indices with \prm {indexofcharacter} and \prm {indexofregister}}]

When you have defined a register with one of the \tex {...def} primitives but for
some reason need to know the register index you can query that:

\startbuffer
\the\indexofregister \scratchcounterone,
\the\indexofregister \scratchcountertwo,
\the\indexofregister \scratchwidth,
\the\indexofregister \scratchheight,
\the\indexofregister \scratchdepth,
\the\indexofregister \scratchbox
\stopbuffer

\typebuffer

We lie a little here because in \CONTEXT\ the box index \tex {scratchbox} is
actually defined as: \normalexpanded {\typ {\meaningasis \scratchbox}} but it
still is a number so it fits in.

\getbuffer

A similar primitive gives us the (normally \UNICODE) value of a character:

\startbuffer
\chardef\MyCharA=65
\the\indexofcharacter A
\the\indexofcharacter \MyCharA
\stopbuffer

\typebuffer

The result is equivalent to \type {\number `A} but avoids the back quote:
\inlinebuffer.

\stopsubsection

\startsubsection[title={Serialization with \prm {todimension}, \prm {toscaled}, \prm {tohexadecimal} and \prm {tointeger}}]

These serializers take a verbose or symbolic quantity:

\starttyping
\todimension   10pt   \todimension   \scratchdimen    % with unit
\toscaled      10pt   \toscaled      \scratchdimen    % without unit
\tointeger     10     \tointeger     \scratchcounter
\tohexadecimal 10     \tohexadecimal \scratchcounter
\stoptyping

This is particularly handy in cases where you don't know what you deal with, for instance
when a value is stored in a macro. Using \type {\the} could fail there while:

\starttyping
\the\dimexpr10pt\relax
\stoptyping

is often overkill and gives more noise in a trace.
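
For example, with a (hypothetical) macro that holds a dimension specification:

\starttyping
\def\MyWidth{12.5pt}

\todimension\MyWidth       % 12.5pt
\the\dimexpr\MyWidth\relax % the same result, but with more noise
\stoptyping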

\stopsubsection

\startsubsection[title={Serialization with \prm {thewithoutunit}, \prm {tosparsedimension} and \prm {tosparsescaled}}]

\topicindex {units}

By default \TEX\ lets \type {1pt} come out as \type {1.0pt} which is why we also have
two sparse variants:

\startbuffer
\todimension    10pt\quad\tosparsedimension  10pt
\todimension   1.2pt\quad\tosparsedimension 1.2pt
\toscaled       10pt\quad\tosparsescaled     10pt
\toscaled      1.2pt\quad\tosparsescaled    1.2pt
\stopbuffer

\typebuffer

This time trailing zeros (and a trailing period) will be dropped:

\startlines \getbuffer \stoplines

The \prm {thewithoutunit} primitive is like \prm {the} on a dimension but it
omits the unit.
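
A quick comparison, using a scratch register:

\starttyping
\scratchdimen 1.25pt
\the           \scratchdimen % 1.25pt
\thewithoutunit\scratchdimen % 1.25
\stoptyping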

\stopsubsection

\startsubsection[title={Units}]

The familiar \TEX\ units like \type {pt} and \type {cm} are supported but since
the 2021 \CONTEXT\ meeting we also support the Knuthian Potrzebie, cf.\ \typ
{en.wikipedia.org/wiki/Potrzebie}. The two character acronym is \type {dk}. One
\type {dk} is 6.43985pt. This unit is particularly suited for offsets in framed
examples.

In 2023 we added the Edith (\type {es}) and Tove (\type {ts}) as metric
replacements for the inch (\type {in}). As with the \type {dk} more background
information can be found in documents that come with \CONTEXT\ and user group
journals. The \type {eu} unit starts out as one \type {es} but can be scaled with
\prm {eufactor}.

\startbuffer
\localcontrolledloop -5 55 5 {
    \eufactor=\currentloopiterator
    \dontleavehmode\strut
    \vrule height .1es depth .25ts width 1dk\relax\quad
    \vrule height .1es depth .25ts width 1eu\relax\quad
    \the\currentloopiterator
    \par
}
\stopbuffer

\typebuffer

This example code shows all four new units. Watch how \prm {eufactor} is clipped
to a value in the range $1-50$. The default factor of $10$ makes the European
Unit equivalent to ten Toves or one Edith.

\startpacked
\startcolor[darkgray]
\getbuffer
\stopcolor
\stoppacked

\stopsubsection

\stopsection

\startsection[title=Registers]


\startsubsection[title={32 bit floats}]

The engine has native float registers which means that we have a similar set of
primitives as for dimensions and integers: \prm {float}, \prm {floatdef}, \prm
{floatexpr} and \prm {iffloat}. In the context of an integer a rounded value is
used, in the context of a dimension the floating point number is interpreted as
points. Internally floats are stored as so called posits, which gives more accuracy
for smaller values.
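
A tentative sketch, assuming that \prm {floatdef} aliases a float register just
like \type {\countdef} aliases a count register. The register number \type {200}
and the name \type {\MyFloat} are arbitrary choices made for this example:

\starttyping
\floatdef\MyFloat 200                    % alias for float register 200
\MyFloat 1.75                            % assign a value
\the\MyFloat                             % serialize it
\the\floatexpr \MyFloat * 2\relax        % calculate with it
\iffloat\MyFloat>1.5 large\else small\fi % compare it
\stoptyping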

\stopsubsection

\startsubsection[title={32 bit posits}]

{\em This is a playground. It will stay but might evolve.}

\stopsubsection

\startsubsection[title={Extra features}]

There are \prm {ifabsnum}, \prm {ifabsdim} and \prm {ifabsfloat} that compare
absolute values of quantities. The primitives \prm{ifzeronum}, \prm{ifzerodim},
\prm{ifzerofloat} do a fast test for zero. The \prm {ifintervalnum}, \prm
{ifintervaldim} and \prm {ifintervalfloat} primitives take a delta and two values
and check if these values overlap within the ranges dictated by the delta.
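
A sketch of the integer variants. We assume here that the absolute test uses the
same comparison syntax as \type {\ifnum} and that the interval test expects the
delta first, followed by the two values:

\starttyping
\ifabsnum -5 > 3 yes\else no\fi                % compares 5 with 3
\ifzeronum\scratchcounter zero\else nonzero\fi % fast test for zero
\ifintervalnum 2 10 11 close\else far\fi       % delta 2, values 10 and 11
\stoptyping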

\stopsubsection

\stopsection

\startsection[title=Expressions]

\startsubsection[title={Rounding and scaling}]

\topicindex {expressions+traditional}

The \type {*expr} parsers now accept \type {:} as operator for integer division
(the \type {/} operator does rounding). This can be used for division compatible
with \type {\divide}. I'm still wondering if adding a couple of bit operators
makes sense (for integers).
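
The difference between the two division operators shows up as soon as the
result is not a whole number:

\starttyping
\the\numexpr 7 / 2\relax % 4 (rounded)
\the\numexpr 7 : 2\relax % 3 (truncated, like \divide)
\stoptyping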

The \prm{numericscale} parser is kind of special (and might evolve). For now it
converts a following number into a scale value as often used in \TEX, where 1000
means scaling by~1.0. The trick is in the presence of a period (or comma): 1.234
becomes 1234 but 1234 stays 1234 and from this you can deduce that 12.34 becomes
123400. Internally \TEX\ calculates with integers, but this permits the macro
package to provide an efficient mix.

\stopsubsection

\startsubsection[title={Enhanced expressions}]

\topicindex {expressions+enhanced}

The \ETEX\ expression primitives are handy but have some limitations. Although
the parsers have been rewritten in \LUAMETATEX\ and are somewhat more efficient, the
only extension we have is support for an integer division with \type {:}. After
experimenting for a while and pondering how to make \prm {dimexpr} and \prm
{numexpr} more powerful I decided to come up with alternatives in order not to
introduce incompatibilities.

The \prm {numexpression} and \prm {dimexpression} primitives are equivalent but
offer more. The first one operates in the integer domain and the second one
assumes scaled values. Often the second one can act like the first when
serialized with \prm {number} in front. This is because when \TEX\ sees a
symbolic reference to an integer or dimension it can treat them as it likes.

The set of operators that we have to support is the following. Most have
alternatives so that we can get around catcode issues.

\starttabulate[||cT|cT|]
\DB action    \BC symbol               \BC keyword \NC \NR
\TB
\NC add       \NC +                    \NC         \NC \NR
\NC subtract  \NC -                    \NC         \NC \NR
\NC multiply  \NC *                    \NC         \NC \NR
\NC divide    \NC / :                  \NC         \NC \NR
\NC mod       \NC \letterpercent       \NC mod     \NC \NR
\NC band      \NC &                    \NC band    \NC \NR
\NC bxor      \NC ^                    \NC bxor    \NC \NR
\NC bor       \NC \letterbar \space v  \NC bor     \NC \NR
\NC and       \NC &&                   \NC and     \NC \NR
\NC or        \NC \letterbar\letterbar \NC or      \NC \NR
\NC setbit    \NC <undecided>          \NC bset    \NC \NR
\NC resetbit  \NC <undecided>          \NC breset  \NC \NR
\NC left      \NC <<                   \NC         \NC \NR
\NC right     \NC >>                   \NC         \NC \NR
\NC less      \NC <                    \NC         \NC \NR
\NC lessequal \NC <=                   \NC         \NC \NR
\NC equal     \NC = ==                 \NC         \NC \NR
\NC moreequal \NC >=                   \NC         \NC \NR
\NC more      \NC >                    \NC         \NC \NR
\NC unequal   \NC <> != \lettertilde = \NC         \NC \NR
\NC not       \NC ! \lettertilde       \NC not     \NC \NR
\LL
\stoptabulate

Here are some things that \prm {numexpr} is not suitable for:

\starttyping
\scratchcounter = \numexpression
    "00000 bor "00001 bor "00020 bor "00400 bor "08000 bor "F0000
\relax

\ifcase \numexpression
    (\scratchcounterone > 5) && (\scratchcountertwo > 5)
\relax nop\else yes\fi
\stoptyping

You can get an idea of what the engine sees by setting \prm {tracingexpressions}
to a value larger than zero. It shows the expression in rpn form.

\starttyping
\dimexpression 4pt * 2   + 6pt   \relax
\dimexpression 2   * 4pt + 6pt   \relax
\dimexpression 4pt * 2.5 + 6pt   \relax
\dimexpression 2.5 * 4pt + 6pt   \relax
\numexpression 2 * 4 + 6         \relax
\numexpression (1 + 2) * (3 + 4) \relax
\stoptyping

The \prm {relax} is mandatory simply because there are keywords involved so the
parser needs to know where to stop scanning. It made no sense to be more clever
and introduce fuzziness (so there is no room for exposing in|-|depth \TEX\
insight and expertise here). In case you wonder: the difference in performance
between the \ETEX\ expression mechanism and the more extended variant will
normally not be noticed, probably because they both use a different approach and
because the \ETEX\ variant also has been optimized. \footnote {I might add some
features in the future.}

The if|-|test shown before can be done using the new primitives \prm
{ifdimexpression} and \prm {ifnumexpression} which are boolean tests with zero
being \type {false}.
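
So, assuming that the expression is terminated by \prm {relax} in the same way as
with \prm {numexpression}, the earlier test can be written as:

\starttyping
\ifnumexpression
    (\scratchcounterone > 5) && (\scratchcountertwo > 5)
\relax yes\else nop\fi
\stoptyping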

\stopsubsection

\startsubsection[title={Calculations with \prm {advanceby}, \prm {multiplyby} and
\prm {divideby}}]

The \prm {advance}, \prm {multiply} and \prm {divide} primitives accept an
optional keyword \type {by}. In \CONTEXT\ we never use that feature and as a
consequence the scanner has to push back a scanned token after checking for the
\type {b} or \type {B}. These three new primitives avoid that and therefore
perform better, but that is (as usual with such enhancements) only noticeable in
demanding scenarios.

The original three plus these new ones also operate on the \quote {constant}
integers, dimensions etc.
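
Usage is the same as with the originals, but without the keyword. The values in
the comments assume the usual truncating integer division:

\starttyping
\scratchcounter 10
\advanceby \scratchcounter 5 % 15
\multiplyby\scratchcounter 2 % 30
\divideby  \scratchcounter 4 % 7
\stoptyping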

\stopsubsection

\stopsection

\startsection[title=Loops]

\topicindex {loops}

There is actually not that much to tell about the three loop primitives \prm
{expandedloop}, \prm {unexpandedloop} and \prm {localcontrolledloop}. They are
used like:

\startbuffer
\unexpandedloop 1 10 1 {
    [!]
}
\stopbuffer

\typebuffer

This will give 10 snippets.

\getbuffer

So what will the next give?

\startbuffer
\edef\TestA{\unexpandedloop 1 10 1 {!}}\meaning\TestA
\edef\TestB{\expandedloop   1 10 1 {!}}\meaning\TestB
\stopbuffer

\typebuffer

We see no difference in results between the two loops:

\startlines \tt
\getbuffer
\stoplines

But the next variant shows that they do differ:

\startbuffer
\edef\TestA{\unexpandedloop 1 10 1 {\the\currentloopiterator}}\meaning\TestA
\edef\TestB{\expandedloop   1 10 1 {\the\currentloopiterator}}\meaning\TestB
\stopbuffer

\typebuffer

The unexpanded variant sort of delays the expansion:

\startlines \tt
\getbuffer
\stoplines

You can nest loops and query the nesting level:

\startbuffer
\expandedloop 1 10 1 {%
    \ifodd\currentloopiterator\else
      [\expandedloop 1 \currentloopiterator 1 {%
        \the\currentloopnesting
      }]
    \fi
}
\stopbuffer

\typebuffer

Here we use the two numeric state primitives \prm {currentloopiterator} and \prm
{currentloopnesting}. This results in:

\getbuffer

The \prm {quitloop} primitive makes it possible to prematurely exit a loop (at
the next step), although of course in the next case one can just adapt the final
iterator value instead. Here we step by 2:

\startbuffer
\expandedloop 1 20 2 {%
    \ifnum\currentloopiterator>10
        \quitloop
    \else
        [!]
    \fi
}
\stopbuffer

\typebuffer

This results in:

\getbuffer

The \prm {lastloopiterator} primitive keeps the last iterator value and is a global
one, like all \type {\last...} primitives. The loops also work with negative values.
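
A sketch of a loop that counts down, assuming that a negative step is paired
with a first value that is larger than the last one:

\starttyping
\expandedloop 10 1 -1 {%
    [\the\currentloopiterator]%
}
\the\lastloopiterator
\stoptyping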

A special case is \prm {localcontrolledloop} which fits into the repertoire of
local control primitives. In that case the loop body gets expanded in a nested
main loop which can come in handy in tricky cases where full expansion is mixed
with for instance assignments, but of course users should then be aware of
out|-|of|-|order side effects when they push back something in the input. Consider
it a playground.
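
A minimal sketch of a loop body that performs assignments, using \prm
{advanceby} and the \CONTEXT\ scratch register \type {\scratchcounter} (picked
here for illustration only):

\starttyping
\scratchcounter 0
\localcontrolledloop 1 5 1 {%
    \advanceby\scratchcounter\currentloopiterator
}
\the\scratchcounter
\stoptyping

With five iterations running from 1 to 5 this should accumulate to 15.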

\stopsection

\stopchapter

\stopcomponent

% \bgroup \unprotect
%     \catcode`<= \lettercatcode
%     \ifnum 1  = 2 n \else y \fi
%     \ifnum 1  > 2 n \else y \fi
%     \ifnum 1  < 2 y \else n \fi
%     \ifnum 1 != 2 y \else n \fi
%     \ifnum 1 !> 2 y \else n \fi
%     \ifnum 1 !< 2 n \else y \fi
%     \ifnum 1 ≠  2 y \else n \fi
%     \ifnum 1 ≥  2 n \else y \fi
%     \ifnum 1 ≤  2 y \else n \fi
%     \ifnum 1 ≱  2 n \else y \fi
%     \ifnum 1 ≰  2 y \else n \fi
%     \ifnum 1 ∈  3 y \else n \fi
%     \ifnum 1 ∉  3 n \else y \fi
%     \ifdim 10pt ∈ 10.5pt y \else n \fi
%     \ifdim 10pt ∉ 10.5pt n \else y \fi
%     \ifdim 11pt ∈ 10.5pt n \else y \fi
%     \ifdim 11pt ∉ 10.5pt y \else n \fi
% \protect \egroup