% language=us runpath=texruns:manuals/lowlevel

% Extending the macro argument parser happened stepwise and at each step a bit of
% \CONTEXT\ code was adapted for testing. At the beginning of October the 20201010
% version of \LUAMETATEX\ was more of less complete, and I decided to adapt some
% more and more intrusive too. Of course that resulted in some more files than I
% had intended so mid October about 100 files were adapted. When this works out
% well, I'll do some more. In the process many macros got the frozen property so
% that was also a test and we'll see how that works out (as it can backfire). As
% usual, here is a musical timestamp: working on this happened when Pineapple Thief
% released \quotation {Versions of the Truth} which again a magnificent drumming by
% Gavin Harrison.

% \permanent\tolerant\protected\def\xx[#1]#*#;[#2]#:#3% loops .. todo

\usemodule[system-tokens]
\usemodule[system-syntax]

\environment lowlevel-style

\startdocument
  [title=macros,
   color=middleorange]

\startsectionlevel[title=Preamble]

This chapter overlaps with other chapters but brings together some extensions to
the macro definition and expansion parts. As these mechanisms were stepwise
extended, the other chapters describe intermediate steps in the development.

Now, in spite of the extensions discussed here the main idea is still that we
have \TEX\ act like before. We keep the charm of the macro language; these
additions make for easier definitions but (at least initially) offer nothing
that could not be done before using more code.

\stopsectionlevel

\startsectionlevel[title=Definitions]

A macro definition normally looks like this: \footnote {The \type
{\dontleavehmode} command makes the examples stay on one line.}

\startbuffer[definition]
\def\macro#1#2%
  {\dontleavehmode\hbox to 6em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

Such a macro can be used as:

\startbuffer[example]
\macro {1}{2}
\macro {1} {2}  middle space gobbled
\macro 1 {2}    middle space gobbled
\macro {1} 2    middle space gobbled
\macro 1 2      middle space gobbled
\stopbuffer

\typebuffer[example][option=TEX]

We show the result with some comments about how spaces are handled:

\startlines \getbuffer[example] \stoplines

A definition with delimited parameters looks like this:

\startbuffer[definition]
\def\macro[#1]%
  {\dontleavehmode\hbox to 6em{\vl\type{#1}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

When we use this we get:

\startbuffer[example]
\macro [1]
\macro [ 1]    leading space kept
\macro [1 ]    trailing space kept
\macro [ 1 ]   both spaces kept
\stopbuffer

\typebuffer[example][option=TEX]

Again, watch the handling of spaces:

\startlines \getbuffer[example] \stoplines

Just for the record we show a combination:

\startbuffer[definition]
\def\macro[#1]#2%
  {\dontleavehmode\hbox to 6em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

With this:

\startbuffer[example]
\macro [1]{2}
\macro [1] {2}
\macro [1] 2
\stopbuffer

\typebuffer[example][option=TEX]

we can again see the spaces go away:

\startlines \getbuffer[example] \stoplines

A definition with two separately delimited parameters is given next:

\startbuffer[definition]
\def\macro[#1#2]%
  {\dontleavehmode\hbox to 6em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

When used:

\startbuffer[example]
\macro [12]
\macro [ 12]     leading space gobbled
\macro [12 ]     trailing space kept
\macro [ 12 ]    leading space gobbled, trailing space kept
\macro [1 2]     middle space kept
\macro [ 1 2 ]   leading space gobbled, middle and trailing space kept
\stopbuffer

\typebuffer[example][option=TEX]

We get ourselves:

\startlines \getbuffer[example] \stoplines

These examples demonstrate that the engine does some magic with spaces before
(and therefore also between multiple) parameters.

We will now go a bit beyond what traditional \TEX\ engines do and enter the
domain of \LUAMETATEX\ specific parameter specifiers. We start with one that
deals with this hard coded space behavior:

\startbuffer[definition]
\def\macro[#^#^]%
  {\dontleavehmode\hbox to 6em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

The \type {#^} specifier will count the parameter, so here we expect again two
arguments but the space is kept when parsing for them.

\startbuffer[example]
\macro [12]
\macro [ 12]
\macro [12 ]
\macro [ 12 ]
\macro [1 2]
\macro [ 1 2 ]
\stopbuffer

\typebuffer[example][option=TEX]

Now keep in mind that we could deal well with all kinds of parameter handling in
\CONTEXT\ for decades, so this is not really something we missed, but it
complements the other specifiers that are discussed below and it makes sense to
have that level of control. Also, availability triggers usage. Nevertheless, some
day the \type {#^} specifier will come in handy.

\startlines \getbuffer[example] \stoplines

We now come back to an earlier example:

\startbuffer[definition]
\def\macro[#1]%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

When we use this we see that the braces in the second call are removed:

\startbuffer[example]
\macro [1]
\macro [{1}]
\stopbuffer

\typebuffer[example][option=TEX] \getbuffer[example]

This can be prohibited by the \type {#+} specifier, as in:

\startbuffer[definition]
\def\macro[#+]%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

As we see, the braces are kept:

\startbuffer[example]
\macro [1]
\macro [{1}]
\stopbuffer

\typebuffer[example][option=TEX]

Again, we could easily get around that (for sure intended) side effect but it just makes nicer
code when we have a feature like this.

\getbuffer[example]

Sometimes you want to grab an argument but are not interested in the results. For this we have
two specifiers: one that just ignores the argument, and another one that keeps counting but
discards it, i.e.\ the related parameter is empty.

\startbuffer[definition]
\def\macro[#1][#0][#3][#-][#4]%
  {\dontleavehmode\hbox spread 1em
     {\vl\type{#1}\vl\type{#2}\vl\type{#3}\vl\type{#4}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

The second argument is empty and the fourth argument is simply ignored which is why we need
\type {#4} for the fifth entry.

\startbuffer[example]
\macro [1][2][3][4][5]
\stopbuffer

\typebuffer[example][option=TEX]

Here is proof that it works:

\getbuffer[example]

The reasoning behind dropping arguments is that for some cases we get around the
nine argument limitation, but more important is that we don't construct token
lists that are not used, which is more memory (and maybe even \CPU\ cache)
friendly.
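
As a small illustration of the first point, the following sketch (the macro name
\type {\FirstAndLast} is made up for this example) scans five bracketed arguments
but stores only two of them. Because \type {#-} does not count, the fifth
argument is referenced as \type {#2}:

\starttyping[option=TEX]
% hypothetical macro: three of the five arguments are dropped while scanning
\def\FirstAndLast[#1][#-][#-][#-][#2]{(#1,#2)}

\FirstAndLast[a][b][c][d][e] % typesets (a,e)
\stoptyping

In the same way a macro can step over more arguments than it can store, as long
as the number of stored ones stays within the limit.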

Spaces are always kind of special in \TEX, so it will be no surprise that we have
another specifier that relates to spaces.

\startbuffer[definition]
\def\macro[#1]#*[#2]%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

This permits usage like the following:

\startbuffer[example]
\macro [1][2]
\macro [1] [2]
\stopbuffer

\typebuffer[example][option=TEX] \getbuffer[example]

Without the optional \quote {grab spaces} specifier the second line would
possibly throw an error. This is because \TEX\ then tries to match \type{][} so
the \type {] [} in the input is simply added to the first argument and the next
occurrence of \type {][} will be used. That one can be someplace further in your
source and if there is none \TEX\ complains about a premature end of file. But,
with the \type {#*} option it works out okay (unless of course you don't have
that second argument \type {[2]}).
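
The following sketch shows this matching with a throwaway macro (\type {\Demo} is
a name made up here) that lacks the \type {#*} specifier. The comments describe
what traditional delimiter matching does with this input:

\starttyping[option=TEX]
\def\Demo[#1][#2]{(#1)(#2)}

\Demo[1][2]     % #1 is "1"     and #2 is "2"
\Demo[1] [2][3] % the first "][" comes after the 2, so
                % #1 is "1] [2" and #2 is "3"
\stoptyping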

Now, you might wonder if there is a way to deal with that second delimited
argument being optional and of course that can be programmed quite well in
traditional macro code. In fact, \CONTEXT\ does that a lot because it is set up
as a parameter driven system with optional arguments. That subsystem has been
optimized to the max over years and it works quite well and performance wise
there is very little to gain. However, as soon as you enable tracing you end up
in an avalanche of expansions and that is no fun.
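
For comparison, here is a minimal sketch of such a traditional approach. It is
not the actual \CONTEXT\ implementation and all names are made up for this
example; it only shows the kind of indirectness involved: a first macro stores
its argument, looks ahead with \type {\futurelet}, and then branches to one of
two follow up macros.

\starttyping[option=TEX]
% hypothetical names, traditional TeX only
\def\Demo[#1]%
  {\def\DemoFirst{#1}%            remember the first argument
   \futurelet\DemoNext\DemoCheck} % peek at the token that follows

\def\DemoCheck
  {\ifx\DemoNext[%                is there a second bracketed argument?
     \expandafter\DemoTwo
   \else
     \expandafter\DemoOne
   \fi}

\def\DemoOne    {(\DemoFirst/)}
\def\DemoTwo[#1]{(\DemoFirst/#1)}

\Demo[1][2] \Demo[1]
\stoptyping

Each call passes through two or three macros, and when tracing is enabled every
one of these steps shows up in the log.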

This time the solution is not in some special specifier but in the way a macro
gets defined.

\startbuffer[definition]
\tolerant\def\macro[#1]#*[#2]%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

The magic \type {\tolerant} prefix works with delimited arguments and just quits
when there is no match. So, this is acceptable:

\startbuffer[example]
\macro [1][2]
\macro [1] [2]
\macro [1]
\macro
\stopbuffer

\typebuffer[example][option=TEX] \getbuffer[example]

We can check how many arguments have been processed with a dedicated conditional:

\startbuffer[definition]
\tolerant\def\macro[#1]#*[#2]%
  {\ifarguments 0\or 1\or 2\or ?\fi: \vl\type{#1}\vl\type{#2}\vl}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

We use this test:

\startbuffer[example]
\macro [1][2] \macro [1] [2] \macro [1] \macro
\stopbuffer

\typebuffer[example][option=TEX]

The result is: \inlinebuffer[example]\ which is what we expect because we flush
inline and there is no change of mode. When the following definition is used in
display mode, the leading \type {n=} can for instance start a new paragraph and
when there is code in \type {\everypar} you can lose the right number when macros
get expanded before the \type {n} gets injected.

\starttyping[option=TEX]
\tolerant\def\macro[#1]#*[#2]%
  {n=\ifarguments 0\or 1\or 2\or ?\fi: \vl\type{#1}\vl\type{#2}\vl}
\stoptyping

In addition to the \type {\ifarguments} test primitive there is also a related
internal counter \type {\lastarguments} that you can consult, so \type
{\ifarguments} is actually just a shortcut for \typ {\ifcase \lastarguments}.
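
One way to avoid losing the right number is to save the counter as the first
action in the body. The following is a minimal sketch with made up names; it
assumes that \type {\lastarguments} can be serialized with \type {\the} like
other internal integers and that a later macro call overwrites it.

\starttyping[option=TEX]
% hypothetical macro: keep the count before anything else gets expanded
\tolerant\def\Demo[#1]#*[#2]%
  {\edef\DemoCount{\the\lastarguments}%
   n=\ifcase\DemoCount\space
     0%
   \or
     1: #1%
   \or
     2: #1 and #2%
   \fi}
\stoptyping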

We now continue with the argument specifiers and the next two relate to this optional
grabbing. Consider the next definition:

\startbuffer[definition]
\tolerant\def\macro#1#*#2%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

With this test:

\startbuffer[example]
\macro {1} {2}
\macro {1}
\macro
\stopbuffer

\typebuffer[example][option=TEX]

We get:

\getbuffer[example]

This is okay because the last \type {\macro} is a valid (single token) argument. But, we
can make the braces mandatory:

\startbuffer[definition]
\tolerant\def\macro#=#*#=%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

Here the \type {#=} forces a check for braces, so:

\startbuffer[example]
\macro {1} {2}
\macro {1}
\macro
\stopbuffer

\typebuffer[example][option=TEX]

gives this:

\getbuffer[example]

However, we do lose these braces and sometimes you don't want that. Of course when you pass the
results downstream to another macro you can always add them, but it was cheap to add a related
specifier:

\startbuffer[definition]
\tolerant\def\macro#_#*#_%
  {\dontleavehmode\hbox spread 1em{\vl\type{#1}\vl\type{#2}\vl\hss}}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

Again, the magic \type {\tolerant} prefix will quit scanning when there is
no match. So:

\startbuffer[example]
\macro {1} {2}
\macro {1}
\macro
\stopbuffer

\typebuffer[example][option=TEX]

leads to:

\getbuffer[example]

When you're tolerant it can be that you still want to pick up some argument
later on. This is why we have a continuation option.

\startbuffer[definition]
\tolerant\def\foo      [#1]#*[#2]#:#3{!#1!#2!#3!}
\tolerant\def\oof[#1]#*[#2]#:(#3)#:#4{!#1!#2!#3!#4!}
\tolerant\def\ofo      [#1]#:(#2)#:#3{!#1!#2!#3!}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

Hopefully the next example demonstrates how it works:

\startbuffer[example]
\foo{3} \foo[1]{3} \foo[1][2]{3}
\oof{4} \oof[1]{4} \oof[1][2]{4}
\oof[1][2](3){4} \oof[1](3){4} \oof(3){4}
\ofo{3} \ofo[1]{3}
\ofo[1](2){3} \ofo(2){3}
\stopbuffer

\typebuffer[example][option=TEX]

As you can see we can have multiple continuations using the \type {#:} directive:

\startlines \getbuffer[example] \stoplines

The last specifier doesn't work well with the \type {\ifarguments} state because
we no longer know what arguments were skipped. This is why we have another test
for arguments. A zero value means that the next token is not a parameter
reference, a value of one means that a parameter has been set and a value of two
signals an empty parameter. So, it reports the state of the given parameter as
a kind of \type {\ifcase}.

\startbuffer[definition]
\def\foo#1#2{ [\ifparameter#1\or(ONE)\fi\ifparameter#2\or(TWO)\fi] }
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

\startbuffer[example]
\foo{1}{2} \foo{1}{} \foo{}{2} \foo{}{}
\stopbuffer

Of course the test has to be followed by a valid parameter specifier:

\typebuffer[example][option=TEX]

The previous code gives this:

\getbuffer[example]

A combination check \type {\ifparameters}, again a case, matches the first
parameter that has a value set.

We could add plenty of specifiers but we should realize that we're not
talking of an expression scanner. We need to keep performance in mind, so nesting
and backtracking are no option. We also have a limited set of usable single
characters, but here's one that uses a symbol that we had left:

\startbuffer[definition]
\def\startfoo[#/]#/\stopfoo{ [#1](#2) }
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

\startbuffer[example]
\startfoo [x ] x \stopfoo
\startfoo [ x ] x \stopfoo
\startfoo [ x] x \stopfoo
\startfoo [ x] \par x \par \par \stopfoo
\stopbuffer

The slash directive removes leading and trailing so called spacers as well as tokens
that represent a paragraph end:

\typebuffer[example][option=TEX]

So we get this:

\getbuffer[example]

The next directive, the quitter \type {#;}, is demonstrated with an example. When
no match has occurred, scanning picks up after this signal, otherwise we just
quit.

\startbuffer[example]
\tolerant\def\foo[#1]#;(#2){/#1/#2/}

\foo[1]\quad\foo[2]\quad\foo[3]\par
\foo(1)\quad\foo(2)\quad\foo(3)\par

\tolerant\def\foo[#1]#;#={/#1/#2/}

\foo[1]\quad\foo[2]\quad\foo[3]\par
\foo{1}\quad\foo{2}\quad\foo{3}\par

\tolerant\def\foo[#1]#;#2{/#1/#2/}

\foo[1]\quad\foo[2]\quad\foo[3]\par
\foo{1}\quad\foo{2}\quad\foo{3}\par

\tolerant\def\foo[#1]#;(#2)#;#={/#1/#2/#3/}

\foo[1]\quad\foo[2]\quad\foo[3]\par
\foo(1)\quad\foo(2)\quad\foo(3)\par
\foo{1}\quad\foo{2}\quad\foo{3}\par
\stopbuffer

\typebuffer[example][option=TEX] \startpacked \getbuffer[example] \stoppacked

I have to admit that I don't really need it but it made some macros that I was
redefining behave better, so there is some self|-|interest here. Anyway, I
considered some other features, like picking up a detokenized argument but I
don't expect that to be of much use. In the meantime we ran out of reasonable
characters, but some day \type {#?} and \type {#!} might show up, or maybe I find
a use for \type {#<} and \type {#>}. A summary of all this is given here:

\starttabulate[|T|i2l|]
\FL
\NC +   \NC keep the braces \NC \NR
\NC -   \NC discard and don't count the argument \NC \NR
\NC /   \NC remove leading and trailing spaces and pars \NC \NR
\NC =   \NC braces are mandatory \NC \NR
\NC _   \NC braces are mandatory and kept \NC \NR
\NC ^   \NC keep leading spaces \NC \NR
\ML
\NC 1-9 \NC an argument \NC \NR
\NC 0   \NC discard but count the argument \NC \NR
\ML
\NC *   \NC ignore spaces \NC \NR
\NC :   \NC pick up scanning here  \NC \NR
\NC ;   \NC quit scanning \NC \NR
\ML
\NC .   \NC ignore pars and spaces \NC \NR
\NC ,   \NC push back space when quit \NC \NR
\LL
\stoptabulate

The last two have not been discussed and were added later. The period
directive gobbles space and par tokens and discards them in the
process. The comma directive is like \type {*} but it pushes back a space
when the matching quits.

\startbuffer[example]
\tolerant\def\FooA[#1]#*[#2]{(#1/#2)} % remove spaces
\tolerant\def\FooB[#1]#,[#2]{(#1/#2)} % push back space

/\FooA/ /\FooA / /\FooA[1]/ /\FooA[!] / /\FooA[1] [2]/ /\FooA[1] [2] /\par
/\FooB/ /\FooB / /\FooB[1]/ /\FooB[!] / /\FooB[1] [2]/ /\FooB[1] [2] /\par
\stopbuffer

\typebuffer[example][option=TEX] \startpacked \getbuffer[example] \stoppacked

Gobbling spaces versus pushing back is an interface design decision because it
has to do with consistency.

\stopsectionlevel

\startsectionlevel[title=Runaway arguments]

There is a particularly troublesome case left: a runaway argument. The solution is
not pretty but it's the only way: we need to tell the parser that it can quit.

\startbuffer[definition]
\tolerant\def\foo[#1=#2]%
  {\ifarguments 0\or 1\or 2\or 3\or 4\fi:\vl\type{#1}\vl\type{#2}\vl}
\stopbuffer

\typebuffer[definition][option=TEX] \getbuffer[definition]

\startbuffer[example]
\dontleavehmode \foo[a=1]
\dontleavehmode \foo[b=]
\dontleavehmode \foo[=]
\dontleavehmode \foo[x]\ignorearguments
\stopbuffer

The outcome demonstrates that one still has to do some additional checking for sane
results and there are alternative ways to (ab)use this mechanism. It all boils down
to a clever combination of delimiters and \type {\ignorearguments}.

\typebuffer[example][option=TEX]

All calls are accepted:

\startlines \getbuffer[example] \stoplines

Just in case you wonder about performance: don't expect miracles here. There is
some extra overhead in the engine (when defining macros as well as when
collecting arguments during a macro call) but using these new features can sort
of compensate for that. As mentioned: the gain is mostly in cleaner macro code
and less clutter in tracing. And I just want the \CONTEXT\ code to look nice:
that way users can look in the source to see what happens and not drown in all
these show|-|off tricks and special characters like underscores, at signs,
question marks and exclamation marks.

For the record: I normally run tests to see if there are performance side effects
and as long as processing the test suite that has thousands of files of all kinds
doesn't take more time it's okay. Actually, there is a little gain in \CONTEXT,
which is to be expected, but I bet users won't notice it because it's easily
offset by some inefficient styling. Of course another gain of losing some
indirectness is that error messages point to the macro that the user called
and not to some follow up.

\stopsectionlevel

\startsectionlevel[title=Introspection]

A macro has a meaning. You can serialize that meaning as follows:

\startbuffer[definition]
\tolerant\protected\def\foo#1[#2]#*[#3]%
  {(1=#1) (2=#2) (3=#3)}

\meaning\foo
\stopbuffer

\typebuffer[definition][option=TEX]

The meaning of \type {\foo} comes out as:

\startnarrower \getbuffer[definition] \stopnarrower

When you load the module \type {system-tokens} you can also say:

\startbuffer[example]
\luatokentable\foo
\stopbuffer

\typebuffer[example][option=TEX]

This produces a table of token specifications:

{\getbuffer[definition]\getbuffer[example]}

A token list is a linked list of tokens. The magic numbers in the first column
are the token memory pointers, and because macros (and token lists) get recycled
at some point the available tokens get scattered, which is reflected in the order
of these numbers. Normally macros defined in the macro package are more sequential
because they stay around from the start. The second and third columns show the so
called command code and the specifier. The command code groups primitives in
categories, the specifier is an indicator of what specific action will follow, a
register number, a reference, etc. Users don't need to know these details. This
macro is a special version of the online variant:

\starttyping[option=TEX]
\showluatokens\foo
\stoptyping

That one is always available and shows a similar list on the console. Again, users
normally don't want to know such details.

\stopsectionlevel

\startsectionlevel[title=Nesting]

You can nest macros, as in:

\startbuffer
\def\foo#1#2{\def\oof##1{<#1>##1<#2>}}
\stopbuffer

\typebuffer[option=TEX] \getbuffer

At first sight the duplication of \type {#} looks strange but this is what
happens. When \TEX\ scans the definition of \type {\foo} it sees two arguments.
Their specification ends up in the preamble that defines the matching. When the
body is scanned, the \type {#1} and \type {#2} are turned into a parameter
reference. In order to make nested macros with arguments possible a \type {#}
followed by another \type {#} becomes just one \type {#}. Keep in mind that the
definition of \type {\oof} is delayed till the macro \type {\foo} gets expanded.
That definition is just stored and the only thing that gets replaced are the two
references to a macro parameter.

\luatokentable\foo
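
The effect can be checked by calling \type {\foo} and then inspecting the macro
that it defined. In this sketch the comments describe what standard \TEX\
semantics give for the definition above:

\starttyping[option=TEX]
\foo{a}{b}    % defines \oof with <a> and <b> filled in
\meaning\oof  % the stored body is now <a>#1<b>
\oof{x}       % typesets <a>x<b>
\stoptyping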

Now, when we look at these details, it might become clear why for instance we
have \quote {variable} names like \type {#4} and not \type {#whatever} (with or
without hash). Macros are essentially token lists and token lists can be seen as
a sequence of numbers. This is not that different from other programming
environments. When you run into buzzwords like \quote {bytecode} and \quote
{virtual machines} there is actually nothing special about it: some high level
programming (using whatever concept, and in the case of \TEX\ it's macros)
eventually ends up as a sequence of instructions, say bytecodes. Then you need
some machinery to run over that and act upon those numbers. It's something you
arrive at naturally when you play with interpreting languages. \footnote {I
actually did when I wrote an interpreter for some computer assisted learning
system, think of a kind of interpreted \PASCAL, but later realized that it was a
bytecode plus virtual machine thing. I'd just applied what I learned when playing
with eight bit processors that took bytes, and interpreted opcodes and such.
There's nothing spectacular about all this and I only realized decades later that
the buzzwords describe old natural concepts.}

So, internally a \type {#4} is just one token, an operator|-|operand combination
where the operator is \quotation {grab a parameter} and the operand tells
\quotation {where to store} it. Using names is of course an option but then one
has to do more parsing and turn the name into a number \footnote {This is kind of
what \METAPOST\ does with parameters to macros. The side effect is that in
reporting you get \type {text0}, \type {expr2} and such reported which doesn't
make things more clear.}, add additional checking in the macro body, figure out
some way to retain the name for the purpose of reporting (which then uses more
token memory or strings). It is simply not worth the trouble, let alone the fact
that we lose performance, and when \TEX\ showed up those things really mattered.

It is also important to realize that a \type {#} becomes either a preamble token
(grab an argument) or a reference token (inject the passed tokens into a new
input level). Therefore the duplication of hash tokens \type {##} that you see in
nested macro bodies also makes sense: it makes it possible for the parser to
distinguish between levels. Take:

\starttyping[option=TEX]
\def\foo#1{\def\oof##1{#1##1#1}}
\stoptyping

Of course one can think of this:

\starttyping[option=TEX]
\def\foo#fence{\def\oof#text{#fence#text#fence}}
\stoptyping

But such names really have to be unique then! Actually \CONTEXT\ does have an
input method that supports such names, but discussing it here is a bit out of
scope. Now, imagine that in the above case we use this:

\starttyping[option=TEX]
\def\foo[#1][#2]{\def\oof##1{#1##1#2}}
\stoptyping

If you're a bit familiar with the fact that \TEX\ has a model of category codes
you can imagine that a predictable \quotation {hash followed by a number} is way
more robust than enforcing the user to ensure that catcodes of \quote {names} are
in the right category (read: is a bracket part of the name or not). So, say that
we go for completely arbitrary names, we then suddenly need some escaping, like:

\starttyping[option=TEX]
\def\foo[#{left}][#{right}]{\def\oof#{text}{#{left}#{text}#{right}}}
\stoptyping

And, if you ever looked into macro packages, you will notice that they differ in
the way they assign category codes. Asking users to take that into account when
defining macros makes not that much sense.

So, before one complains about \TEX\ being obscure (the hash thing), think twice.
Your demand for simplicity in your own coding will make coding more
cumbersome for the complex cases that macro packages have to deal with. It's
comparable to using \TEX\ for input or using (say) markdown. For simple documents
the latter is fine, but when things become complex, you end up with similar
complexity (or even worse because you lost the enforced detailed structure). So,
just accept the unavoidable: any language has its peculiar properties (and for
sure I do know why I dislike some languages for it). The \TEX\ system is not the
only one where dollars, percent signs, ampersands and hashes have special
meaning.

\stopsectionlevel

\startsectionlevel[title=Prefixes]

Traditional \TEX\ has three prefixes that can be used with macros: \type {\global},
\type {\outer} and \type {\long}. The last two are no|-|op's in \LUAMETATEX\ and
if you want to know what they do (did) you can look it up in the \TEX book. The
\ETEX\ extension gave us \type {\protected}.

In \LUAMETATEX\ we have \type {\global}, \type {\protected}, \type {\tolerant}
and overload related prefixes like \type {\frozen}. A protected macro is one that
doesn't expand in an expandable context, so for instance inside an \type {\edef}.
You can force expansion by using the \type {\expand} primitive in front, which is
also something specific to \LUAMETATEX.
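
A minimal sketch of the difference, using made up macro names and assuming the
behavior described above:

\starttyping[option=TEX]
\protected\def\Demo{demo}

\edef\DemoA{[\Demo]}        % \Demo is protected, so it stays as it is
\edef\DemoB{[\expand\Demo]} % \expand forces the expansion to demo

\meaning\DemoA \par
\meaning\DemoB \par
\stoptyping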

% A protected macro can be made expandable by \typ {\unletprotected} and can be
% protected with \typ {\letprotected}.
%
% \startbuffer[example]
%                \def\foo{foo} \edef\oof{oof\foo} 1: \meaning\oof
%      \protected\def\foo{foo} \edef\oof{oof\foo} 2: \meaning\oof
% \unletprotected    \foo      \edef\oof{oof\foo} 3: \meaning\oof
% \stopbuffer
%
% \typebuffer[example][option=TEX]
%
% \startlines \getbuffer[example] \stoplines

Frozen macros cannot be redefined without some effort. This feature can to some
extent be used to prevent a user from overloading, but it also makes it harder
for the macro package itself to redefine on the fly. You can remove the lock with
\typ {\unletfrozen} and add a lock with \typ {\letfrozen} so in the end users
still have all the freedoms that \TEX\ normally provides.

\startbuffer[example]
                 \def\foo{foo} 1: \meaning\foo
          \frozen\def\foo{foo} 2: \meaning\foo
     \unletfrozen    \foo      3: \meaning\foo
\protected\frozen\def\foo{foo} 4: \meaning\foo
     \unletfrozen    \foo      5: \meaning\foo
\stopbuffer

\typebuffer[example][option=TEX]

\startlines \overloadmode0 \getbuffer[example] \stoplines

This actually only works when you have set \type {\overloadmode} to a value that
permits redefining a frozen macro, so for the purpose of this example we set it
to zero.

A \type {\tolerant} macro is one that will quit scanning arguments when a
delimiter cannot be matched. We saw examples of that in a previous section.

These prefixes can be chained (in arbitrary order):

\starttyping[option=TEX]
\frozen\tolerant\protected\global\def\foo[#1]#*[#2]{...}
\stoptyping

There is actually an additional prefix, \type {\immediate}, but that one is there
as a signal for a macro that is defined in and handled by \LUA. This prefix can
then perform the same function as the one in traditional \TEX, where it is used
for backend related tasks like \type {\write}.

Now, the question is of course, to what extent will \CONTEXT\ use these new
features. One important argument in favor of using \type {\tolerant} is that it
gives (hopefully) better error messages. It also needs less code due to lack of
indirectness. Using \type {\frozen} adds some safeguards although in some places
where \CONTEXT\ itself overloads commands, we need to defrost. Adapting the code
is a tedious process and it can introduce errors due to mistypings, although
these can easily be fixed. So, it will be used but it will take a while to adapt
the code base.

One problem with frozen macros is that they don't play nice with for instance
\typ {\futurelet}. Also, there are places in \CONTEXT\ where we actually do
redefine some core macro that we also want to protect from redefinition by a
user. One can of course \typ {\unletfrozen} such a command first but as a bonus
we have the \typ {\overloaded} prefix for this. So, one can easily
redefine a frozen macro but it takes a little effort. After all, this feature is
mainly meant to protect a user from side effects of definitions, and not as a final
blocker. \footnote {As usual adding features like this takes some experimenting
and we're now at the third variant of the implementation, so we're getting there.
The fact that we can apply such features in a large macro package like \CONTEXT\
helps figuring out the needs and best approaches.}
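
A minimal sketch of both routes, with a made up macro name; what the engine
accepts or reports here depends on the value of \type {\overloadmode}:

\starttyping[option=TEX]
\frozen\def\Demo{first}

% route one: remove the lock, redefine, and lock again
\unletfrozen\Demo
\def\Demo{second}
\letfrozen\Demo

% route two: announce the intended overload with a prefix
\overloaded\frozen\def\Demo{third}
\stoptyping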

A frozen macro can still be overloaded, so what if we want to prevent that? For
this we have the \typ {\permanent} prefix. Internally we also create primitives
but we don't have a prefix for that. But we do have one for a very special case
which we demonstrate with an example:

\startbuffer[example]
\def\FOO % trickery needed to pick up an optional argument
  {\noalign{\vskip10pt}}

\noaligned\protected\tolerant\def\OOF[#1]%
  {\noalign{\vskip\iftok{#1}\emptytoks10pt\else#1\fi}}

\starttabulate[|l|l|]
    \NC test \NC test \NC \NR
    \NC test \NC test \NC \NR
    \FOO
    \NC test \NC test \NC \NR
    \OOF[30pt]
    \NC test \NC test \NC \NR
    \OOF
    \NC test \NC test \NC \NR
\stoptabulate
\stopbuffer

\typebuffer[example][option=TEX]

When \TEX\ scans input (from a file or token list) and starts an alignment, it
will pick up rows. When a row is finished it will look ahead for a \type
{\noalign} and it expands the next token. However, when that token is protected,
the scanner will not see a \type {\noalign} in that macro so it will likely start
complaining when that next macro does get expanded and produces a \type
{\noalign} when a cell is built. The \type {\noaligned} prefix flags a macro as
being one that will do some \type {\noalign} as part of its expansion. This trick
permits clean macros that pick up arguments. Of course it can be done with
traditional means but this whole exercise is about making the code look nice.

The table comes out as:

\getbuffer[example]

One can check the flags with \type {\ifflags} which takes a control sequence and
a number, where valid numbers are:

\starttabulate[|r|lw(8em)|r|lw(8em)|r|lw(8em)|r|lw(8em)|]
\NC \the\frozenflagcode    \NC frozen
\NC \the\permanentflagcode \NC permanent
\NC \the\immutableflagcode \NC immutable
\NC \the\primitiveflagcode \NC primitive  \NC \NR
\NC \the\mutableflagcode   \NC mutable
\NC \the\noalignedflagcode \NC noaligned
\NC \the\instanceflagcode  \NC instance
\NC                        \NC            \NC \NR
\stoptabulate
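
As a sketch of such a check (the macro name is made up, and the exact outcome may
depend on the engine version), the test takes the control sequence first and one
of the codes from the table second:

\starttyping[option=TEX]
\frozen\def\Demo{demo}

\ifflags\Demo\frozenflagcode
  frozen
\else
  not frozen
\fi
\stoptyping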

The level of checking is controlled with the \type {\overloadmode} but I'm still
not sure about how many levels we need there. A zero value disables checking,
the values 1 and 3 give warnings and the values 2 and 4 trigger an error.

\stopsectionlevel

\startsectionlevel[title=Arguments]

The number of arguments that a macro takes is traditionally limited to nine (or
ten if one takes the trailing \type {#} into account). That this is enough for
most cases is demonstrated by the fact that \CONTEXT\ has only a handful of
macros that use \type {#9}. The reason for this limitation is in part a side
effect of the way the macro preamble and arguments are parsed. However, because
in \LUAMETATEX\ we use a different implementation, it was not that hard to permit
a few more arguments, which is why we support up to 15 arguments, as in:

\starttyping[option=TEX]
\def\foo#1#2#3#4#5#6#7#8#9#A#B#C#D#E#F{...}
\stoptyping

We can support the whole alphabet without much trouble but somehow sticking to
the hexadecimal numbers makes sense. It is unlikely that the core of \CONTEXT\
will use this option but sometimes at the user level it can be handy. The penalty
in terms of performance can be neglected.

\starttyping[option=TEX]
\tolerant\def\foo#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=%
  {(#1)(#2)(#3)(#4)(#5)(#6)(#7)(#8)(#9)(#A)(#B)(#C)(#D)(#E)(#F)}

\foo{1}{2}
\stoptyping

In the previous example we have 15 optional arguments where braces are mandatory
(otherwise the scanner happily scoops up what follows, which for sure gives some
error).
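
The hexadecimal references are used in the body just like the numeric ones. A
small sketch, with a hypothetical macro \type {\MyTen} that takes ten undelimited
arguments and only uses the first and the last one:

\starttyping[option=TEX]
% \MyTen is a made up name; #A refers to the tenth argument
\def\MyTen#1#2#3#4#5#6#7#8#9#A{(#1)(#A)}

\MyTen abcdefghij % should give: (a)(j)
\stoptyping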

\stopsectionlevel

\startsectionlevel[title=Constants]

The \LUAMETATEX\ engine has lots of efficiency tricks in the macro parsing and
expansion code that make it not only fast but also let it use less memory.
However, every time that the body of a macro is to be injected the expansion
machinery kicks in. This often means that a copy is made (pushed in the input and
used afterwards). There are however cases where the body is just a list of
character tokens (with category letter or other) and no expansion run over the
list is needed.

It is tempting to introduce a string data type that just stores strings and
although that might happen at some point it has the disadvantage that one needs to
tokenize that string in order to be able to use it, which then defeats the gain.
An alternative has been found in constant macros, that is: a macro without
parameters and a body that is considered to be expanded and never freed by
redefinition. There are two variants:

\starttyping[option=TEX]
\cdef      \foo          {whatever}
\cdefcsname foo\endcsname{whatever}
\stoptyping

These are actually just equivalents to

\starttyping[option=TEX]
\edef      \foo          {whatever}
\edefcsname foo\endcsname{whatever}
\stoptyping

just to make sure that the body gets expanded at definition time but they are
also marked as being constant which in some cases might give some gain, for
instance when used in csname construction. The gain is less than one expects
although there are a few cases in \CONTEXT\ where extreme usage of parameters
benefits from it. Users are unlikely to use these two primitives.
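
Because the body is expanded at definition time, a later change to a macro that
was used in that body has no effect. The following sketch illustrates this. The
names \type {\MyName} and \type {\MyConstant} are made up and the outcome follows
from the \type {\edef} equivalence mentioned above.

\starttyping[option=TEX]
\def \MyName    {first}
\cdef\MyConstant{\MyName} % the body is expanded now, as with \edef
\def \MyName    {second}

\MyConstant % still gives: first
\stoptyping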

Another example of a constant usage is this:

\starttyping[option=TEX]
\lettonothing\foo
\stoptyping

which gives \type {\foo} an empty body. That one is used in the core, if only because
it gives a bit smaller code. Performance is not that different from

\starttyping[option=TEX]
\let\foo\empty
\stoptyping

but it saves one token (8 bytes) when used in a macro. The assignment itself is
not that different because \type {\foo} is made an alias to \type {\empty} which
in turn only needs incrementing a reference counter.

\stopsectionlevel

\startsectionlevel[title=Passing parameters]

When you define a macro, the \type {#1} and more parameters are embedded as a
reference to a parameter that is passed. When we have four parameters, the
parameter stack has four entries and when an entry is eventually accessed a new
input level is pushed and tokens are fetched from that list. This has some side
effects when we check a parameter. This can happen multiple times, depending on
how often we access a parameter. Take the following:

\startbuffer
\def\oof#1{#1}

\tolerant\def\foo[#1]#*[#2]%
  {1:\ifparameter#1\or Y\else N\fi\quad
   2:\ifparameter#2\or Y\else N\fi\quad
   \oof{3:\ifparameter #1\or Y\else N\fi\quad
        4:\ifparameter #2\or Y\else N\fi\quad}%
   \par}

\foo \foo[] \foo[][] \foo[A] \foo[A][B]
\stopbuffer

\typebuffer

This gives:

\startpacked \tttf
\inlinebuffer
\stoppacked

as you probably expect. However the first two checks are different from the
embedded checks because they can check against the parameter reference. When we
expand \type {\oof} its argument gets passed to the macro as a list and when the
scanner collects the next token it will then push the parameter content on the
input stack. So, then, instead of a reference we get the referenced parameter
list. Internally that means that in 3 and 4 we check for a token and not for the
length of the list (as in case 1 & 2). This means that

\starttyping
\iftok{#1}\emptytoks Y\else N\fi
\ifparameter#1\or    Y\else N\fi
\stoptyping

are different. In the first case we have a proper token list and nested
conditionals in that list are okay. In the second case we just look ahead to see
if there is an \type {\or}, \type {\else} or other condition related command and
if so we decide that there is no parameter. So, whether \type {\ifparameter} is a
suitable check for empty depends on the need for expansion.

When you define macros that themselves call macros that should operate on the
arguments of their parent you can easily pass these:

\startbuffer[test-1]
\def\foo#1#2%
  {\oof{#1}{#2}{P}%
   \oof{#1}{#2}{Q}%
   \oof{#1}{#2}{R}}

\def\oof#1#2#3%
  {[#1][#1]%
   #3%
   [#2][#2]}
\stopbuffer

\typebuffer[test-1]

Here the nested call to \type {\oof} involves three passed parameters. You can
avoid that as follows:

\startbuffer[test-2]
\def\foo#1#2%
  {\def\MyIndexOne{#1}%
   \def\MyIndexTwo{#2}%
   \oof{P}\oof{Q}\oof{R}}

\def\oof#1%
  {(\MyIndexOne)(\MyIndexOne)%
   #1%
   (\MyIndexTwo)(\MyIndexTwo)}
\stopbuffer

\typebuffer[test-2]

You can also do this:

\startbuffer[test-3]
\def\foo#1#2%
  {\def\oof##1%
     {/#1/#2/%
     ##1%
     /#1//#2/}%
   \oof{P}\oof{Q}\oof{R}}
\stopbuffer

\typebuffer[test-3]

These parameters indicated by \type {#} in the macro body are in fact references.
When we call for instance \type {\foo {1}{2}} the two parameters get pushed on a
parameter stack and the embodied references point to these stack entries. By the
time that body gets expanded \TEX\ bumps the input level and pushes the parameter
list onto the input stack. It then continues expansion. The parameter is not
copied, because it can't be changed anyway. The only penalty in terms of
performance and memory usage is the pushing and popping of the input. So how does
that work out for these three cases?

When in the first case the \type {\oof{#1}{#2}{P}} is seen, \TEX\ starts expanding
the \type {\oof} macro. That one expects three arguments. The \type {#1} reference is
seen and in this case a copy of that parameter is passed. The same is true for the
other two. Then, inside \type {\oof} expansion happens on the parameters on the stack
and no copies have to be made there.

The second case defines two macros so again two copies are made that make the bodies
of these macros. This comes at the cost of some runtime and memory. However, this
time with \type {\oof{P}} only one argument gets passed and instead expansion of the
macros happens in there.

Normally macro arguments are not that large but there can be situations where we
really want to avoid useless copying. This not only saves memory but also can give a
bit better performance. In the examples above the second variant is some 10\percent
faster than the first one. We can gain another 10\percent with the following trick:

\startbuffer[test-4]
\def\foo#1#2%
  {\parameterdef\MyIndexOne\plusone % 1
   \parameterdef\MyIndexTwo\plustwo % 2
   \oof{P}\oof{Q}\oof{R}\norelax}

\def\oof#1%
  {<\MyIndexOne><\MyIndexOne>%
   #1%
   <\MyIndexTwo><\MyIndexTwo>}
\stopbuffer

\typebuffer[test-4]

Here we define an explicit parameter reference that we access later on. There is
the overhead of a definition but it can be neglected. We use that reference
(abstraction) in \type {\oof}. Actually you can use that reference in any call
down the chain.

When applied to \type {\foo{1}{2}} the four variants above give us:

\startpacked
\startlines \tt
\getbuffer[test-1]\foo{1}{2}
\getbuffer[test-2]\foo{1}{2}
\getbuffer[test-3]\foo{1}{2}
\getbuffer[test-4]\foo{1}{2}
\stoplines
\stoppacked

Before we had \type {\parameterdef} we had this:

\startbuffer
\def\foo#1#2%
  {\integerdef\MyIndexOne\parameterindex\plusone % 1
   \integerdef\MyIndexTwo\parameterindex\plustwo % 2
   \oof{P}\oof{Q}\oof{R}\norelax}

\def\oof#1%
  {<\expandparameter\MyIndexOne><\expandparameter\MyIndexOne>%
   #1%
   <\expandparameter\MyIndexTwo><\expandparameter\MyIndexTwo>}
\stopbuffer

\typebuffer

It involves more tokens, is a bit less abstract, but as it is a cheap extension
we kept it. It actually demonstrates that one can access parameters in the stack
by index, but one then needs to keep track of where access takes place. In
principle one can debug the call chain this way.

To come back to performance and memory usage, when the arguments become larger
the fourth variant with the \type {\parameterdef} quickly gains over the others.
But it only shows in exceptional usage. This mechanism is more about abstraction:
it permits us to efficiently turn arguments into local variables without the
overhead involved in creating macros. You can test if a parameter is set:

\startbuffer
\tolerant\protected\def\MyMacro[#1]#:#2%
  {\parameterdef\MyArgumentOne\plusone
   \parameterdef\MyArgumentTwo\plustwo
   \ifparameter\MyArgumentOne\or
     (\MyArgumentOne)
   \fi
   /\MyArgumentTwo/}

\MyMacro[one]{two}
\MyMacro{two}
\stopbuffer

\typebuffer

Indeed we get:

\getbuffer

Of course \type {\ifparameter#1\or...} is more efficient but once you use named
parameters like this it's probably not something you'll worry too much about.

\stopsectionlevel

\startsectionlevel[title=Nesting]

We also have a few preamble features that relate to nesting. Although we can do
without (as shown for years in \LMTX) they do have some benefits. They are
discussed as a group here and because they are only useful for low level
programming we stick to simple examples. The \type {#L} and \type {#R} use the
following token as delimiters. Here we use \type {[} and \type {]} but they can
be a \type {\cs} as well. Nested delimiters are handled well.

The \type {#S} grabs the argument till the next final square bracket \type {]}
but in the process will grab nested pairs when it sees a \type {[}. The \type {#P} does
the same for parentheses and \type {#X} for angle brackets. In the next examples
the \type {#*} just gobbles optional spaces but we've seen that one already.

The \type {#G} argument just registers the next token as delimiter but it will
grab multiple of them. The \type {#M} gobbles more: in addition to the delimiter
spaces are gobbled.

\startbuffer
\tolerant\def\fooA               [#1]{(#1)}
\tolerant\def\fooB          [#L[#R]#1{(#1)}
\tolerant\def\fooC               #S#1{(#1)}
\tolerant\def\fooE              #S#1,{(#1)}
\tolerant\def\fooF         #S#1#*#S#2{(#1/#2)}
\tolerant\def\fooG [#1]#S[#2]#*#S[#3]{(#1/#2/#3)}
\tolerant\def\fooH [#1][#S#2]#*[#S#3]{(#1/#2/#3)}
\tolerant\def\fooI           #1=#2#G,{(#1=#2)}
\tolerant\def\fooJ           #1=#2#M,{(#1=#2)}
\stopbuffer

\typebuffer

\getbuffer

\starttabulate[|T|T|T||]
\NC \type{\fooA [x]}            \NC \fooA [x]             \NC (x)           \NC \NR
\NC \type{\fooB [x]}            \NC \fooB [x]             \NC (x)           \NC \NR
\NC \type{\fooC [1[2]3[4]5]}    \NC \fooC [1[2]3[4]5]     \NC (1[2]3[4]5)   \NC \NR
\NC \type{\fooE X[,]X,}         \NC \fooE X[,]X,          \NC (X[,]X)       \NC \NR
\NC \type{\fooF [A] [B]}        \NC \fooF [A] [B]         \NC (A/B)         \NC \NR
\NC \type{\fooF [] []}          \NC \fooF [] []           \NC (/)           \NC \NR
\NC \type{\fooG [a][b][c]}      \NC \fooG [a][b][c]       \NC (a/b/c)       \NC \NR
\NC \type{\fooG [a][b]}         \NC \fooG [a][b]          \NC (a/b/)        \NC \NR
\NC \type{\fooG [a]}            \NC \fooG [a]             \NC (a//)         \NC \NR
\NC \type{\fooG [a][x[x]x][c]}  \NC \fooG [a][x[x]x][c]   \NC (a/x[x]x/c)   \NC \NR
\NC \type{\fooH [a][x[x]x][c]}  \NC \fooH [a][x[x]x][c]   \NC (a/x[x]x/c)   \NC \NR
\NC \type{\fooI X=X,,,}         \NC \fooI X=X,,,          \NC (X=X)         \NC \NR
\NC \type{\fooJ X=X, , ,}       \NC \fooJ X=X, , ,        \NC (X=X)         \NC \NR
\stoptabulate

These features make it possible to support nested setups more efficiently and
also make it possible to accept values that contain balanced brackets in setup
commands without additional overhead. Although it has never been an issue to let
users specify:

\starttyping
\defineoverlay[whatever][{some \command[withparameters] here}]

\setupfoo[before={\blank[big]}]
\stoptyping

it might be less confusing to permit:

\starttyping
\defineoverlay[whatever][some \command[withparameters] here]

\setupfoo[before=\blank[big]]
\stoptyping

as well, if only because occasionally users get hit by this.

\stopsectionlevel

\startsectionlevel[title=Duplicate hashes]

In \TEX\ every character has a so called category code. Most characters are
classified as \quote {letter} (they make up words) or as \quote {other}. In
\UNICODE\ we distinguish symbols, punctuation, and more, but in \TEX\ these are
all of category \quote {other}. In math however we can classify them differently
but in this perspective we ignore that. The backslash has category \quote
{escape} and it starts a control sequence. The curly braces are (internally) of
category \quote {left brace} and \quote {right brace} aka \quote {begin group}
and \quote {end group} but, no matter what they are called, they begin and end
something: a group, argument, token list, box, etc. Any character can have those
categories. Although it would look strange to a \TEX\ user, this can be made
valid:

\startbuffer
!protected !gdef !weird¶1
B
    something: ¶1
E
!weird BhereE
\stopbuffer

\typebuffer

In such a setup spaces can be of category \quote {invisible}. The paragraph
symbol takes the place of the hash as parameter identifier. The next code shows
how this is done. Here we wrap all in a macro so that we don't get catcode
interference in the document source.

\startbuffer[demo]
\def\NotSoTeX
  {\begingroup
   \catcode `B \begingroupcatcode
   \catcode `E \endgroupcatcode
   \catcode `¶ \parametercatcode
   \catcode `! \escapecatcode
   \catcode 32 \ignorecatcode
   \catcode 13 \ignorecatcode
   % this buffer has a definition:
   \getbuffer
   % which is now known globally
   \endgroup}
\NotSoTeX
\weird{there}
\stopbuffer

\typebuffer[demo]

This results in:

\startlines
\getbuffer [demo]
\stoplines

In the first line the \type {!}, \type {B} and \type {E} are used as escape and
argument delimiters, in the second one we use the normal characters. When we show
the \type {\meaningasis} we get:

\startlines \tt
\meaningasis\weird
\stoplines

or in more detail:

\start \tt
\luatokentable\weird
\stop

So, no matter how we set up the system, in the end we get some generic
representation. When we see \type {#1} in \quote {print} it can be either two
tokens, \type {#} (catcode parameter) followed by \type {1} with catcode other,
or one token referring to parameter \type {1} where the character \type {1} is
the opcode of an internal \quote {reference command}. In order to distinguish a
reference from the two token case, parameter hash tokens get shown as doubles.

\start

\catcode `¶=\parametercatcode
\catcode `§=\parametercatcode

\startbuffer
\def\test #1{x#1x##1x####1x}
\def\tset ¶1{x¶1x¶¶1x¶¶¶¶1x}
\stopbuffer

\typebuffer \getbuffer

And with \type {\meaning} we get, consistent with the input:

\startlines \tt
\meaning\test
\meaning\tset
\stoplines

These are equivalent, apart from the parameter character in the body of the
definition:

\startlines \tt
\luatokentable\test
\luatokentable\tset
\stoplines

\stop

Watch how every \quote {parameter} is just a character with the \UNICODE\ index
of the used input character as property. Let us summarize the process. When a
single parameter character is seen in the input, the next character determines how
it will be interpreted. If there is a digit then it becomes a reference to a
parameter in the preamble, and when followed by another parameter character it
will be appended to the body of the macro and that second one is dropped. So, two
parameter characters become one, and four become two. One parameter character
becomes a reference and from that you can guess what three in a row become.
However, when \TEX\ is showing the macro definition (using \type {\meaning}) the
hashes get duplicated in order to distinguish parameter references from parameter
characters that were kept (e.g.\ for nested definitions). One can make an
argument for \type {\parameterchar} as we also have \type {\escapechar} but by
now this convention is settled and it doesn't look that bad anyway.
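
The classic case where doubled hashes are needed is a definition inside a
definition. In the following sketch the names \type {\MyOuter} and \type
{\MyInner} are made up for the occasion:

\starttyping[option=TEX]
\def\MyOuter#1%
  {\def\MyInner##1% the two hashes become one when \MyOuter is defined
     {(#1)(##1)}}

\MyOuter{a} % defines \MyInner with one parameter and body (a)(#1)
\MyInner{b} % gives: (a)(b)
\stoptyping

The single hash refers to the argument of the outer macro and is resolved when
that one expands. The doubled hash survives as a single hash and only becomes a
reference when the inner macro gets defined.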

We now come to the more tricky part with respect to the doubling of hashes. When
\TEX\ was written its application landscape looked a bit different. For instance,
fonts were limited and therefore it was natural to access special characters by
name. Using \type {\#} to get a hash in the text was not that problematic, if one
needed that character at all. The same can be said for the braces, backslash and
even the dollar (after all \TEX\ is free software).

But what if we have more visualization and|/|or serialization than meanings and
tracing? When we opened up the internals in \LUATEX\ and even more in
\LUAMETATEX\ the duplicating of hashes became a bit of a problem. There we don't
need to distinguish between a parameter reference and a parameter character
because by that time these references are resolved. All hashes that we encounter
are just that: hashes. And this is why in \LUAMETATEX\ we disable the duplication
for those cases where it serves no purpose.

When the engine scans a macro definition it starts with picking up the name of
the macro. Then it starts scanning the preamble up to the left brace. In the
preamble of a macro the scanner converts hashes followed by another token into a
single match token. Then when the macro body is scanned single hashes followed by
a number become a reference, while double hashes become one hash and get
interpreted at expansion time (possibly triggering an error when not followed by
a valid specifier like a number). In traditional \TEX\ we basically had this:

\starttyping
\def\test#1{#1}
\def\test#1{##}
\def\test#1{#X}
\def\test#1{##1}
\stoptyping

There can be a trailing \type {#} in the preamble for special purposes but we
forget about that now. The first definition is valid, the second definition is
invalid when the macro is expanded and the third definition triggers an error at
definition time. The last definition will again trigger an error at expansion
time.

However, in \LUAMETATEX\ we have an extended preamble where the following
preamble parameters are handled (some only in tolerant mode):

\starttabulate[|c|||]
\NC \type{#n} \NC parameter                                   \NC index \type{1} up to \type{F} \NC \NR
\TB
\NC \type{#0} \NC throw away parameter                        \NC increment index              \NC \NR
\NC \type{#-} \NC ignore parameter                            \NC keep index                   \NC \NR
\TB
\NC \type{#*} \NC gobble white space                          \NC                              \NC \NR
\NC \type{#+} \NC keep (honor) the braces                     \NC                              \NC \NR
\NC \type{#.} \NC ignore pars and spaces                      \NC                              \NC \NR
\NC \type{#,} \NC push back space when no match               \NC                              \NC \NR
\NC \type{#/} \NC remove leading and trailing spaces and pars \NC                              \NC \NR
\NC \type{#=} \NC braces are mandatory                        \NC                              \NC \NR
\NC \type{#^} \NC keep leading spaces                         \NC                              \NC \NR
\NC \type{#_} \NC braces are mandatory and kept (obey)        \NC                              \NC \NR
\TB
\NC \type{#@} \NC par delimiter                               \NC only for internal usage      \NC \NR
\TB
\NC \type{#:} \NC pick up scanning here                       \NC                              \NC \NR
\NC \type{#;} \NC quit scanning                               \NC                              \NC \NR
\TB
\NC \type{#L} \NC left delimiter token                        \NC followed by token            \NC \NR
\NC \type{#R} \NC right delimiter token                       \NC followed by token            \NC \NR
\TB
\NC \type{#G} \NC gobble token                                \NC followed by token            \NC \NR
\NC \type{#M} \NC gobble token and spaces                     \NC followed by token            \NC \NR
\TB
\NC \type{#S} \NC nest square brackets                        \NC only inner pairs             \NC \NR
\NC \type{#X} \NC nest angle brackets                         \NC only inner pairs             \NC \NR
\NC \type{#P} \NC nest parentheses                            \NC only inner pairs             \NC \NR
\stoptabulate

As mentioned these will become so called match tokens and only when we show the
meaning the hash will show up again.

\startbuffer
\def\test[#1]#*[#S#2]{.#1.#2.}
\stopbuffer

\typebuffer \getbuffer

\startlines \tt
\luatokentable\test
\stoplines

This means that in the body of a macro you will not see \type {#*} show up. It is
just a directive that tells the macro parser that spaces are to be skipped. The
\type {#S} directive makes the parser for the second parameter handle nested
square brackets. The only hash that we can see end up in the body is the one that
we entered as double hash (then turned single) followed by (in traditional terms)
a number that, when all gets parsed, will then become a reference: the sequence
\type {##1} internally is \type {#1} and becomes \quote {reference to parameter
1} assuming that we define a macro in that body. If no number is there, an error
is issued. This opens up the possibility to add more variants because it will
only break compatibility with respect to what is seen as error. As with the
preamble extensions, old documents that have them would have crashed before they
became available.

So, this means that in the body, and actually anywhere in the document apart from
preambles, we now support the following general parameter specifiers. Keep in
mind that they expand in an expansion context which can be tricky when they
overlap with preamble entries, like for instance \type {#R} in such an expansion.
Future extensions can add more so {\em any} hashed shortcut is sensitive for
that.

\starttabulate[|l|||]
\NC \type{#I} \NC current iterator     \NC \type {\currentloopiterator}    \NC \NR
\NC \type{#P} \NC parent iterator      \NC \type {\previousloopiterator 1} \NC \NR
\NC \type{#G} \NC grandparent iterator \NC \type {\previousloopiterator 2} \NC \NR
\TB
\NC \type{#H} \NC hash escape          \NC \type {#}  \NC \NR
\NC \type{#S} \NC space escape         \NC \ruledhbox to \interwordspace{\novrule height .8\strutht} \NC \NR
\NC \type{#T} \NC tab escape           \NC \type {\t} \NC \NR
\NC \type{#L} \NC newline escape       \NC \type {\n} \NC \NR
\NC \type{#R} \NC return escape        \NC \type {\r} \NC \NR
\NC \type{#X} \NC backslash escape     \NC \tex  {}   \NC \NR
\TB
\NC \type{#N} \NC nbsp \NC \type {U+00A0} (under consideration) \NC \NR
\NC \type{#Z} \NC zws  \NC \type {U+200B} (under consideration) \NC \NR
%NC \type{#-} \NC zwnj \NC \type {U+200C} (under consideration) \NC \NR
%NC \type{#+} \NC zwj  \NC \type {U+200D} (under consideration) \NC \NR
%NC \type{#>} \NC l2r  \NC \type {U+200E} (under consideration) \NC \NR
%NC \type{#<} \NC r2l  \NC \type {U+200F} (under consideration) \NC \NR
\stoptabulate

Some will now argue that we already have \type {^^} escapes in \TEX\ and \type
{^^^^} and \type {^^^^^^} in \LUATEX\ and that is true. However, these can be
disabled, and in \CONTEXT\ they are, where we instead enable the prescript,
postscript, and index features in mathmode and there \type {^} and \type {_} are
used. Even more: in \CONTEXT\ we just let \type {^}, \type {_} and \type {&} be
what they are. Occasionally I consider \type {$} to be just that but as I don't
have dollars I will happily leave that for inline math. When users are not
defining macros or are using the alternative definitions we can consider making
the \type {#} a hash. An excellent discussion of how \TEX\ reads its input and
changes state accordingly can be found in Victor Eijkhout's \quotation {\TEX\ By
Topic}, section 2.6: when \type {^^} is followed by a character with $v < 128$
the interpreter will inject a character with code $v - 64$ (or $v + 64$ when
$v < 64$). When followed by two (!) lowercase hexadecimal characters, the
corresponding character will be injected. Anyway, it not only looks kind of ugly,
it also is somewhat weird because what follows is interpreted in a mixed way. The
substitution happens early on
(which is okay). But, how about the output? Traditional \TEX\ serializes special
characters with a similar syntax but that has become optional when eight bit mode
was added to the engines, it is configurable in \LUATEX\ and has been dropped in
\LUAMETATEX: we operate in a \UTF\ universe.
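
To make the arithmetic concrete, here are a few cases as they work out in
traditional \TEX, where a character with a code below 64 gets 64 added instead of
subtracted:

\starttyping
^^M  : M has code 77, so we get 77 - 64 =  13 (return)
^^I  : I has code 73, so we get 73 - 64 =   9 (tab)
^^?  : ? has code 63, so we get 63 + 64 = 127 (delete)
^^5e : two lowercase hexadecimal digits, so we get character 0x5E (^)
\stoptyping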

\stopsectionlevel

\stopdocument

% freezing pitfalls:
%
% - \futurelet  : \overloaded needed
% - \let        : \overloaded sometimes needed
%
% primitive protection:
%
% \newif\iffoo \footrue \foofalse : problem when we make iftrue and iffalse
% permanent ... they inherit, so we can't let them, we need a not permanent
% alias which is again tricky ... something native?
%
% immutable : still \count000 but we can consider blocking that, for instance
% by \def\count{some error}
%
% \defcsname
% \edefcsname
% \letcsname

% {
%     \scratchdimenone 10pt \the\currentstacksize\par
%     \scratchdimentwo 10pt \the\currentstacksize\par
%     \scratchdimenone 20pt \the\currentstacksize\par
%     \scratchdimentwo 20pt \the\currentstacksize\par
%     \scratchdimenone 10pt \the\currentstacksize\par
%     {
%         \scratchdimenone 10pt \the\currentstacksize\par
%         \scratchdimentwo 20pt \the\currentstacksize\par
%     }
% }

% Experiment
%
% \def\testID#i#*#d{[\the#1][\the#2]}
% \def\testII#i#*#i{[\the#1][\number#2]}
%
% \startlines
% \testID 123 123pt
% \testID {123}{123pt}
% \testID {123}{655360sp}
% \testID {123}{1ex}
% \testID {123}{2ex}
% \testID {123}{6ex}
% \testID {123}{10ex}
% \testID {123}{20ex}
% \testID {123}{123pt}
% \stoplines