% language=us runpath=texruns:manuals/musings % musical timestamp: listening to FLUX (jazz trio) in Januari 2023 \startcomponent musings-speed \environment musings-style \startchapter[title={Speeding up \TEX}] \startsection[title={Introduction}] Recently a couple of cordless phones that I use gave up as soon as I used them for a minute or so. The first time that happened I figured that after all these years the batteries had gone bad and after some testing I decided to replace them. I got some of these high end batteries that discharge slowly and store a lot of power. Within a year they were dead too. Then I went for the more regular and cheaper ones, again with a lot of capacity. And yes, these also gave up, that is: only in the phones that were hardly used. The batteries lasted longer in phones that were discharged by usage daily. When I went out for new batteries I was asked if I needed them for cordless phones and, surprise, was given special ones that actually stored less but were guaranteed to work for at least 6 years. The package explicitly mentioned use in cordless phones. So here performance doesn't come with the most high end ones, based on specifications that impress. This is also true for computers that are used to process \TEX\ documents. More cores amount to much accumulated processing power but for a single core \TEX\ process, a few fast cores are more relevant than plenty slower ones that run in parallel. More memory helps but compared to other processes \TEX\ actually doesn't use that much memory. And disk speed matters but less so when the operating system caches files. What does play a role are cpu caches because \TEX\ is very memory intense and processing is not concentrated in a few functions. But a large cache shared among many (busy) cores makes for a less impressive performance. So what matters really? In the next sections we will explore a few points of view. 
It's not some advertisement for a specific engine, but much more about putting it into perspective (as one can run into ridiculous arguments on the web). It is not only the hardware and software that matter but also how one uses them.

\stopsection

\startsection[title=The engine]

There are various ways to compare engines and each has its own characteristics.

The \PDFTEX\ engine is closest to the original. It directly produces the output which can give it an edge. It is eight bit and therefore uses small fonts and internally all that is related to fonts and characters is also small. This means that there is little overhead in typesetting a paragraph: hyphenation, ligature building and kerning are interwoven and perform well.

The \XETEX\ engine supports wide fonts and \UNICODE\ and therefore can be seen as 32 bit. I never looked into the code so I can't tell how far that goes but performance is definitely less than \PDFTEX. The rendering of text is delegated to a library (there were some changes in that along its development) which is less efficient than the built in \PDFTEX\ route. But it is also more powerful.

The \LUATEX\ engine is mostly 32 bit and delegates non standard font handling to \LUA\ which comes with a performance penalty but also adds a lot of flexibility. Also, the fact that one can call out to \LUA\ in many places means that one cannot really blame the engine for performance hits. The fact that hyphenation, ligature building and kerning are split comes at a small price too. We have larger nodes so compared to \PDFTEX\ more memory is used and accessed. Some mechanisms are actually more efficient, like font expansion and protrusion.

The \LUAMETATEX\ engine lacks a font loader (but it does have the traditional renderer on board) and it has no backend. So even more is delegated to \LUA, which in turn makes this the slowest of the lot. And, again more data travels with nodes. In some modes of operation many more calculations take place.
However, because it has an enriched macro processor, additional primitives, and plenty deep down \quote {improvements} it can perform better than \LUATEX\ (and even \LUAJITTEX, the \LUATEX\ version with a faster but limited \LUA\ virtual machine). And as with \LUATEX, there are usage patterns that make it faster than \PDFTEX.

So, in general the order of performance is \PDFTEX, \XETEX, \LUAJITTEX\ (kind of obsolete), \LUATEX, \LUAMETATEX. But then, how come that \CONTEXT\ users never complain about performance? The reason is simple: performance is quite okay and as it is relative to what one does, a user will accept a drop in performance when more has to be done.

When we moved on from \LUATEX\ to \LUAMETATEX\ there definitely was a drop in performance, simply because of the \LUA\ backend. Because upgrading happened in small (but continuous) steps, right from the start the new engine was good enough to be used in production which is why most users switched to \LMTX\ as soon as it became clear that this is where the progress is made. There were no real complaints about the up to 15\percent\ initial performance drop which indicates that for most users it doesn't matter that much. As the engine evolved we could gain some back and now \LUAMETATEX\ ends up between \PDFTEX\ and \LUATEX\ and in many modern scenarios even comes out first. The fact that in the meantime we can be much faster than \LUATEX\ did get noticed (when asked). However, as development takes years, updating a machine in the meantime puts discussions about performance in a different (causality) perspective anyway.

\stopsection

\startsection[title=The coding]

Performance can increase when native engine features are used instead of complex macros that have to work around limitations. It can also decrease when new features are used that add complex functionality. And when an engine extends existing functionality that is likely to come at a price.
So where \LUAMETATEX\ provides a richer programming environment, it also has a more complex par builder, page builder, insert, mark and adjust handling, plenty of extra character, rule and box features and all of that definitely adds some overhead.

Quite often a gain in simplicity (nicer and more efficient macros) compensates for the more complex features. That is because on average the engine doesn't do that much (tens of thousands of the same) complex macro expansion and also doesn't demand that much complex low level typesetting. A gain here is often compensated by a loss there. This is one reason why during the years \LUAMETATEX\ could sustain a decent performance. Personally I don't accept a drop in performance easily which is why in practice most mechanisms, even when extended, probably perform better but I'm not going to prove that observation.

One important reason why \CONTEXT\ \LMTX\ with \LUAMETATEX\ is faster than its ancestors is that we got rid of some intermediate programming layers. Most users have never seen the auxiliary macros or implementation details but plenty were used in \MKII\ and \MKIV. Of course we kept them because often they are nicer than many lines of primitive code, but only a few (and less in the future) are used in the core. Examples are multi step macros (that pick up arguments) that became single step and complex if tests that became inline native tests. Because \CONTEXT\ always had a high level of abstraction, consistency of the interface also means that we don't need many helpers.

When some features (like for instance box manipulation) got extended one could expect a performance hit due to more extensive optional keyword scanning in the engine but that was compensated by improved scanners. The same is true for scanning numbers and dimensions. So, more functionality doesn't always come at a price.

To summarize this: although the engine went a bit more \quote {cisc} than \quote {risc} the macro package went more \quote {risc}.
It reminds me a bit of the end of the previous century when there was much talk of fourth generation languages, something on top of the normal languages. In the end it was scripting languages that became the fashion while traditional languages like \CCODE\ remained relatively stable and unchanged for implementing them (and more). A similar observation can be made for \CONTEXT\ itself. Whenever some new feature gets added to an existing mechanism I try to not cripple performance and thanks to the way \CONTEXT\ is set up it works out okay.

Let's look at an example. In \MKII\ we can compare two \quote {strings} with the macro \type {doifelse}. Its definition is as follows:

\starttyping
\long\def\doifelse#1#2%
  {\let\donottest\dontprocesstest
   \edef\!!stringa{#1}%
   \edef\!!stringb{#2}%
   \let\donottest\doprocesstest
   \ifx\!!stringa\!!stringb
     \expandafter\firstoftwoarguments
   \else
     \expandafter\secondoftwoarguments
   \fi}
\stoptyping

This macro takes two arguments that get expanded inside two helpers that we then compare with a primitive \type {\ifx}. Depending on the outcome we expand one of the two following arguments but first we get rid of the interfering \type {\else} and \type {\fi}. The pushing and popping of \type {\donottest} takes care of protection of unwanted expansion in an \type {\edef}. Many functional macros are what we call protected: they expand in two steps depending on the embedded \type {\donottest} macro. Think of (simplified):

\starttyping
\def\realfoo{something is done here}
\def\usedfoo{\donottest\realfoo}
\stoptyping

Normally \type {\donottest} is doing nothing so \type {\realfoo} gets expanded but there are cases where we (for instance) \type {\let} it be \type {\string} which then serializes the macro. This is something that happens when writing to the multi pass data file. It can also be used for overloading, for instance in the backend or when converting something.
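
The effect of both modes can be made visible with a minimal sketch. It is not taken from the \MKII\ sources: here \type {\donottest} is defined locally so that the snippet stands on its own, and the names \type {\resultone} and \type {\resulttwo} are made up for the illustration.

\starttyping
\def\realfoo{something is done here}
\def\usedfoo{\donottest\realfoo}

% normal mode: \donottest expands to nothing, so \realfoo is expanded
\def\donottest{}
\edef\resultone{\usedfoo}

% protected mode: \string serializes the next token instead of expanding it
\let\donottest\string
\edef\resulttwo{\usedfoo}

\show\resultone % the meaning is: something is done here
\show\resulttwo % the meaning is the character sequence \realfoo
\stoptyping

In the second case the macro name survives the \type {\edef} as plain characters, which is the kind of result one wants when a macro call has to be written to a file and executed in a later run.
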
This protection against expansion has always been a \CONTEXT\ feature, which in turn made it pretty robust in multi pass scenarios, but it definitely came with a performance penalty.

When \PDFTEX\ got the \ETEX\ extensions we could use the \type {\protected} prefix to replace this trickery. That means that \MKII\ will use a different definition of \type {\doifelse} when that primitive is known:

\starttyping
\long\def\doifelse#1#2%
  {\edef\!!stringa{#1}%
   \edef\!!stringb{#2}%
   \ifx\!!stringa\!!stringb
     \expandafter\firstoftwoarguments
   \else
     \expandafter\secondoftwoarguments
   \fi}
\stoptyping

This works okay because we now do this:

\starttyping
\protected\def\usedfoo{something is done here}
\stoptyping

The \type {\doifelse} helper itself is not protected in \MKII\ (non \ETEX\ mode): it would be a performance hit. I won't bore the reader with the tricks needed to do the opposite, that is: expand a protected macro. It is seldom needed anyway.

The \MKIV\ definition used with \LUATEX\ is not much different, only the \type {\long} prefix is missing. That one is needed when one wants \type {#1} and|/|or \type {#2} to be tolerant with respect to embedded \type {\par} equivalents. In \LUAMETATEX\ we can disable that check and in \CONTEXT\ all macros are thereby \type {\long}. Users won't notice because in \CONTEXT\ most macros were always defined the long way; we also suppress \type {\outer} errors.

\starttyping
\protected\def\doifelse#1#2%
  {\edef\m_syst_string_one{#1}%
   \edef\m_syst_string_two{#2}%
   \ifx\m_syst_string_one\m_syst_string_two
     \expandafter\firstoftwoarguments
   \else
     \expandafter\secondoftwoarguments
   \fi}
\stoptyping

Implementation wise a macro, once scanned and stored, carries the long property in its command code so that has overhead. However because \LUATEX\ is compatible we cannot make all normal macros long by default when \type {\suppresslongerror} is used.
Therefore an argument running into a \type {\par} is still checked for but the message is suppressed based on the setting of the mentioned parameter. Performance wise, not using \type {\long} comes at the cost of checking a parameter which means an additional memory access and comparison. Unless we otherwise gain something in the engine it comes at a cost. In \LUAMETATEX\ the \type {\long} and \type {\outer} prefixes are ignored. Even better, protected macros are also implemented a bit more efficiently.

In the end the definition of \type {\doifelse} in \LMTX\ looks a bit different:

\starttyping
\permanent\protected\def\doifelse#1#2%
  {\iftok{#1}{#2}%
     \expandafter\firstoftwoarguments
   \else
     \expandafter\secondoftwoarguments
   \fi}
\stoptyping

The \typ {\permanent} prefix flags this macro as such. Depending on the value of \typ {\overloadmode} a redefinition is permitted, comes with a warning or results in a fatal error. Of course this comes at a price when we define macros or values of quantities but this is rather well compensated by all kinds of improvements in handling macros: defining, expansion, saving and restoring, etc.

More interesting is the use of \type {\iftok} here. It saves us defining two helper macros. Of course the content still needs to be expanded before comparison but we no longer have various macro management overhead. In scenarios where we don't need to jump over the \type {\else} or \type {\fi} we can use this test in place which saves passing two arguments and grabbing one argument later on. Actually, grabbing is also different, compare:

\starttyping
\def\firstoftwoarguments #1#2{#1} % MkII and MkIV
\permanent\def\firstoftwoarguments #1#-{#1} % MkXL aka LMTX
\def\secondoftwoarguments#1#2{#2} % MkII and MkIV
\permanent\def\secondoftwoarguments#-#1{#1} % MkXL aka LMTX
\stoptyping

In the case of \LUAMETATEX\ the \type {#-} means that we don't even bother to store the argument as it is ignored.
Where \type {#0} does the same, it also increments the argument counter, which is why here, with \type {#-}, even the second argument has number~1. Now, is this more efficient? Sure, but how often does it really happen? The engine still needs to scan (which comes at a cost) but we save on temporary token list storage. Because \TEX\ is so fast already, measuring only shows differences when one has many (and here a real lot) iterations. However, all these small bits add up which is what we've seen in 2022 in \CONTEXT: it is the reason why we are now faster than \MKIV\ with \LUATEX, even with more functionality in the engine.

I can probably write hundreds of pages in explaining what was added, changed, made more flexible and what side effects it had|/|has on performance but I bet no one is really interested in that. In fact, the previous exploration is just a side effect of a question that triggered it, so maybe future questions will trigger more explanations. It anyhow demonstrates what I meant when I said that \LUAMETATEX\ is meant to be leaner and meaner. Of course the code base and binary are smaller but that also gets compensated by more functionality. It also means that we can make the \CONTEXT\ code base nicer because for me a good looking source (which of course is subjective) is pretty important.

\stopsection

\startsection[title=Compatibility]

There are non \CONTEXT\ users who seem to love to stress that successive versions of \CONTEXT\ are incompatible. Other claims are that it is developed in a commercial setting. While it is true that there are changes and it is also true that \CONTEXT\ is used in commercial settings, it is not that different from other open source projects. The majority of the code is written without compensation and it is offered without advertisements or request for support. It is true that when we can render better, it will be done.
But the user interfaces only change when there is a reason and there are few cases where some functionality became obsolete, think of input and font encodings. Most such changes directly relate to the engine: in \PDFTEX\ and \MKII\ we emulate \UTF-8\ while in \LUATEX\ it comes natively. In \PDFTEX\ eight bit (\TYPEONE) fonts are used while \LUATEX\ adds support for \OPENTYPE. Other macro packages support that by additional packages while \CONTEXT\ has it integrated. That is why the system evolves over time. Just as users adapt to (yearly) operating system interfaces, mobile phones, all kinds of hardware, cars, clothing, media and so on, the \CONTEXT\ users have no problem adapting to an evolving \TEX\ ecosystem. I guess claims about changes (being a disadvantage) can only point to a lack of development elsewhere.

The main reason for mentioning this is that when \CONTEXT\ users move on to newer engines, the older ones are seldom used. So, few users compare an \LMTX\ run with one using \PDFTEX\ or \LUATEX. They naturally expect \LUAMETATEX\ to perform well and maybe even to perform better over time. They just don't complain. And unless one hacks (overloads) system macros compatibility is not really an issue. What can be an issue is that updates and adaptations to a newer engine come with bugs but those are solved.

So, the fact that we compare incompatible engines with likely different low level macro implementations of otherwise stable features of a macro package makes comparison hard. For instance, maybe there are speedups possible in frozen \MKII, although it is unlikely, which means that it might even perform better than reported. In a similar fashion, the fact that \OPENTYPE\ is more demanding for sure means that \LUATEX\ rendering is slower than \PDFTEX. It anyhow makes a discussion about performance within and between macro packages even more ridiculous. Just don't buy those claims and|/|or ask on the \CONTEXT\ mailing list for clarification.
\stopsection \startsection[title=The job] So, say that we now have an efficient and powerful engine and a matching macro package. Does that make all jobs faster? For sure, the ones that I use as benchmark run much smoother. The 360 page \LUAMETATEX\ manual runs in less than 8.4 seconds on a Dell Precision laptop with (mobile) Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz, 2TB fast Samsung pro SSD, and 48 GB of memory, running Windows 10. The \METAFUN\ manual with many more pages and thousands of \METAPOST\ graphics needs a bit more than 12 seconds. So you don't hear me complain. This chapter takes 7.5 seconds plus 0.5 is for the runner, not enough time to get coffee. Nowadays I tend to measure performance in terms of pages per second, because in the end that is what users experience. For me more important are the gains for my colleague who processes documents of 400 pages from hundreds of small \XML\ files with multiple graphics per page. Given different output variants a lot of processing takes place, so there a gain from 20 pages per second to 25 pages per second is welcome. Anyway, here are a few measurements of a {\em simple} test suite per January 7, 2023. We use this as test text: \starttyping \def\Knuth{%% Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large||scale user; the designer should also write the first user manual. \par The separation of any of these four components would have hurt \TeX\ significantly. If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. \par But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments. 
} \stoptyping Now keep in mind that these are simple examples. On more complex documents the \LUAMETATEX\ engine with \LMTX\ is relatively faster: think \XML, plenty \METAPOST, complex tables, advanced math, dozens of fonts in combination with the new compact font mode. The tests themselves are simple: we switch fonts (because fonts bring overhead), we add some color (because we use different methods), we process some graphics (to show what embedding \METAPOST\ brings), we do some tables (because that can be stressful). Each sample is run 50, 500 or 1000 times, and each set is run a couple of times so that we compensate for caching and fluctuating system load. The tests are more about signaling a trend than about absolute numbers. For what it's worth, I used a \LUA\ script to run the samples. When you run an experiment that measures performance, keep in mind that performance not only depends on the engine, but also on for instance logging. When I run the \CONTEXT\ test suite it takes 1250 seconds if the console takes the full screen on a 2560 by 1600 display and 30 seconds more on a 3840 by 2160 display and it even depends on how large the font is set. On the 1920 by 1200 monitor I get to 1230. Of course these times change when we add more to the test suite so it's always a momentary measurement. Similar differences can be observed when running in an editor. A good test is making a \CONTEXT\ format: 2.2 seconds goes down to below 1.8 when the output is piped to a file. On a decent 2023 desktop those times are probably half but I don't have one at hand. 
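
The same caution applies when one times a single macro inside a document instead of a whole run. A minimal sketch, assuming the \type {\testfeatureonce} and \type {\elapsedtime} helpers that come with \CONTEXT; the iteration count and the tested macro are arbitrary choices for the illustration and not part of the test suite used here:

\starttyping
\starttext

% repeat the test often: a single expansion is too fast to measure
\testfeatureonce {100000} {\doifelse {alpha} {beta} {} {}}

% typeset the time (in seconds) that the loop took
elapsed: \elapsedtime

\stoptext
\stoptyping

Such a number includes the overhead of the loop itself and fluctuates with system load, so it is best run a couple of times and only compared with other runs on the same machine.
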
\startsubsubject[title={sample 1, number of runs: 2}] \starttyping \starttext \dorecurse {%s} { \Knuth \par } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.63 \NC 0.83 \NC 1.07 \NC \NR \BC luatex \NC 0.95 \NC 1.86 \NC 2.94 \NC \NR \BC luametatex \NC 0.61 \NC 1.49 \NC 2.48 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 2, number of runs: 2}] \starttyping \starttext \dorecurse {%s} { \tf \Knuth \bf \Knuth \it \Knuth \bs \Knuth \par } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.70 \NC 1.73 \NC 2.80 \NC \NR \BC luatex \NC 1.37 \NC 5.37 \NC 9.92 \NC \NR \BC luametatex \NC 1.04 \NC 5.06 \NC 9.73 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 3, number of runs: 2}] \starttyping \starttext \dorecurse {%s} { \tf \Knuth \it knuth \bf \Knuth \bs knuth \it \Knuth \tf knuth \bs \Knuth \bf knuth \par } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.71 \NC 1.81 \NC 2.98 \NC \NR \BC luatex \NC 1.41 \NC 5.84 \NC 10.77 \NC \NR \BC luametatex \NC 1.05 \NC 5.71 \NC 10.60 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 4, number of runs: 2}] \starttyping \setupcolors[state=start] \starttext \dorecurse {%s} { {\red \tf \Knuth \green \it knuth} {\red \bf \Knuth \green \bs knuth} {\red \it \Knuth \green \tf knuth} {\red \bs \Knuth \green \bf knuth} \par } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.73 \NC 1.91 \NC 3.64 \NC \NR \BC luatex \NC 1.39 \NC 5.82 \NC 12.58 \NC \NR \BC luametatex \NC 1.07 \NC 5.57 \NC 11.85 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 5, number of runs: 2}] \starttyping \starttext \dorecurse {%s} { \null \page } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 
\BC 1000 \NC \NR \HL \BC pdftex \NC 0.62 \NC 1.12 \NC 1.68 \NC \NR \BC luatex \NC 0.90 \NC 1.39 \NC 1.98 \NC \NR \BC luametatex \NC 0.58 \NC 0.99 \NC 1.46 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 6, number of runs: 2}] \starttyping \starttext \dorecurse {%s} { %% nothing } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.55 \NC 0.54 \NC 0.56 \NC \NR \BC luatex \NC 0.79 \NC 0.81 \NC 0.82 \NC \NR \BC luametatex \NC 0.54 \NC 0.52 \NC 0.53 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 7, number of runs: 2}] \starttyping \starttext \dontleavehmode \dorecurse {%s} { \framed[width=1cm,height=1cm,offset=2mm]{x} } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.58 \NC 0.65 \NC 0.71 \NC \NR \BC luatex \NC 0.84 \NC 0.96 \NC 1.08 \NC \NR \BC luametatex \NC 0.54 \NC 0.62 \NC 0.72 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 8, number of runs: 2}] \starttyping \starttext \dontleavehmode \dorecurse {%s} { \framed [width=1cm,height=1cm,offset=2mm, foregroundstyle=bold,foregroundcolor=red, background=color,backgroundcolor=green] {x} } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC pdftex \NC 0.59 \NC 0.70 \NC 0.83 \NC \NR \BC luatex \NC 0.87 \NC 1.00 \NC 1.17 \NC \NR \BC luametatex \NC 0.55 \NC 0.66 \NC 0.78 \NC \NR \HL \stoptabulate \stopsubsubject \startsubsubject[title={sample 9, number of runs: 2}] \starttyping \starttext \ifdefined\permanent\else\def\BC{\NC\bf}\fi \dontleavehmode \dorecurse {%s} { \starttabulate[|||||] \NC test \BC test \NC test \NC test \NC \NR \NC test \BC test \NC test \NC test \NC \NR \NC test \BC test \NC test \NC test \NC \NR \NC test \BC test \NC test \NC test \NC \NR \stoptabulate } \stoptext \stoptyping \starttabulate[||r|r|r|] \HL \BC engine \BC 50 \BC 500 \BC 1000 \NC \NR \HL \BC 
pdftex \NC 0.62 \NC 1.15 \NC 1.71 \NC \NR
\BC luatex \NC 0.94 \NC 1.84 \NC 2.86 \NC \NR
\BC luametatex \NC 0.60 \NC 1.19 \NC 1.88 \NC \NR
\HL
\stoptabulate

\stopsubsubject

\startsubsubject[title={sample 10, number of runs: 2}]

\starttyping
\starttext
\dontleavehmode \dorecurse {%s} {
    \startMPcode
        fill fullcircle scaled 1cm withcolor red ;
        fill fullsquare scaled 1cm withcolor green ;
    \stopMPcode \space
}
\stoptext
\stoptyping

\starttabulate[||r|r|r|]
\HL
\BC engine \BC 50 \BC 500 \BC 1000 \NC \NR
\HL
\BC pdftex \NC 5.73 \NC 50.98 \NC 102.10 \NC \NR
\BC luatex \NC 0.93 \NC 1.07 \NC 1.30 \NC \NR
\BC luametatex \NC 0.57 \NC 0.71 \NC 0.86 \NC \NR
\HL
\stoptabulate

\stopsubsubject

\stopsection

\startsection[title=Final words]

Whenever I run into (or get sent) remarks of (especially non \CONTEXT) users suggesting that \LUATEX\ is much slower than \PDFTEX\ or that \LUAMETATEX\ seems much faster than \LUATEX, one really has to keep in mind that this is not always true. Among the questions to be asked are \quotation {What engine do you use?}, \quotation {Which macro package do you use?}, \quotation {How well is your style set up?}, \quotation {How complex is the document?}, \quotation {Is your own additional code efficient?}, \quotation {Do you use engine and macro package features the right way?} and of course \quotation {What do you compare with?}, \quotation {What do you expect and why?}, \quotation {Do you actually know what goes on deep down?}. An embarrassing one can be \quotation {Do you have an idea what is involved in fulfilling your request given that we use a flexible adaptive macro language?}. Most probably these questions do not get answered properly.

Another thing to make clear is that when someone claims for instance that \CONTEXT\ \LMTX\ is fast because of \LUAMETATEX, or that \LUAMETATEX\ is much faster than \LUATEX, a healthy suspicion should kick in: does that someone really know what happens and matters?
The previous numbers do show differences for simple cases but we're often not talking of differences that can be used as an excuse for insufficient coding. In the end it is all about the experience: does performance feel in tune with expectations. Which is not to say that I won't make \CONTEXT\ and \LUAMETATEX\ faster because after all there are usage scenarios where one has to process tens of thousands of documents within a reasonable amount of time, on regular infrastructure, and of course with as little energy consumption as possible.

If \PDFTEX\ suits your purpose, there is no need to move to \LUATEX. As with rechargeable batteries in cordless phones a higher capacity can make things worse. If \LUATEX\ fits the bill, don't dream about using \LUAMETATEX\ instead in the hope that it will halve runtime, because the adaptations needed in the macro package (like adding a backend) might actually slow it down. Moore's law doesn't apply to \TEX\ engines and macro packages and you might get disappointed. Accept that the choice you made for a macro package can come with a price.

Quite often it is rather easy to debunk complaints and claims which makes one wonder why claims about perceived or potential performance are made at all. But then, I'm accustomed to weird remarks and conclusions about \CONTEXT\ as a macro package, or for that matter \LUATEX\ (as it originates in the \CONTEXT\ community) even by people who should know better. Hopefully the above invites being more careful.

\stopsection

\stopchapter

\stopcomponent