% cld-verbatim.tex, size: 12 Kb, last modification: 2021-10-28 13:50
% language=us runpath=texruns:manuals/cld

\startcomponent cld-verbatim

\environment cld-environment

\startchapter[title=Verbatim]

\startsection[title=Introduction]

\index{verbatim}

If you are familiar with traditional \TEX, you know that some characters have
special meanings. For instance, a \type {$} starts and ends inline math mode:

\starttyping
$e=mc^2$
\stoptyping

If we want to typeset math from the \LUA\ end, we can say:

\starttyping
context.mathematics("e=mc^2")
\stoptyping

This is in fact:

\starttyping
\mathematics{e=mc^2}
\stoptyping

However, if we want to typeset a dollar and use the \type {ctxcatcodes} regime,
we need to explicitly access that character using \type {\char} or use a command
that expands into the character with catcode other.

One step further is to typeset all characters as they are, which is called
verbatim. In that mode all characters are tokens without any special meaning.
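
As a minimal sketch of the second option: assuming that \type {\letterdollar} is
available as a command that expands to a dollar with catcode other, a literal
dollar could be typeset from the \LUA\ end as follows. The surrounding text is
made up for illustration.

\starttyping
context("The price is ")
context.letterdollar() -- assumption: \letterdollar gives a plain dollar
context("10.")
\stoptyping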

\stopsection

\startsection[title=Special treatment]

The formula in the introduction can be typeset verbatim as follows:

\startbuffer
context.verbatim("$e=mc^2$")
\stopbuffer

\typebuffer

This gives:

\ctxluabuffer

You can also do things like this:

\startbuffer
context.verbatim.bold("$e=mc^2$")
\stopbuffer

\typebuffer

Which gives:

\ctxluabuffer

So, within the \type {verbatim} namespace, each command gets its arguments
verbatim.

\startbuffer
context.verbatim.inframed({ offset = "0pt" }, "$e=mc^2$")
\stopbuffer

\typebuffer

Here we get: \ctxluabuffer. So, settings and the like are processed as if the
user had used a regular \type {context.inframed} but the content comes out
verbatim.

If you wonder why verbatim is needed when we also have the \type {type} function
(macro), the answer is that it is faster, easier to key in, and sometimes the
only way to get the desired result.

\stopsection

\startsection[title=Multiple lines]

Currently we have to deal with linebreaks in a special way. This is due to the
way \TEX\ deals with linebreaks. In fact, when we print something to \TEX, the
text after a \type {\n} is simply ignored.

For this reason we have a few helpers. If you want to put something in a buffer,
you cannot use the regular buffer functions unless you make sure that the buffers
are not overwritten while you're still at the \LUA\ end.

\starttyping
context.tobuffer("temp",str)
context.getbuffer("temp")
\stoptyping

Another helper is the following. It splits the string into lines and feeds them
piecewise using the \type {context} function and in the process adds a space at
the end of each line (as this is what \TEX\ normally does).

\starttyping
context.tolines(str)
\stoptyping
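
The behaviour described above can be approximated in plain \LUA. This is only a
sketch and not the \CONTEXT\ implementation: the function \type {tolines} and the
\type {emit} callback are made up for illustration, with \type {emit} standing in
for the \type {context} function. A string that already ends with a newline
yields one extra empty line in this sketch.

\starttyping
-- hypothetical stand-in for context.tolines, not the real implementation
local function tolines(str,emit)
  -- the appended newline makes sure that the last line is matched too
  for line in string.gmatch(str .. "\n","(.-)\r?\n") do
    emit(line .. " ") -- the space that TeX normally adds at a line end
  end
end

local result = { }

tolines("first line\nsecond line",function(s) result[#result+1] = s end)

-- result is now { "first line ", "second line " }
\stoptyping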

Catcodes can get in the way when you pipe something to \TEX\ that itself changes
the catcodes. This happens for instance when you write buffers that themselves
have buffers or have code that changes the line endings as with \type
{startlines}. In that case you need to feed back the content as if it were a
file. This is done with:

\starttyping
context.viafile(str)
\stoptyping

The string can contain newlines. The string is written to a virtual file that is
input. Currently names look like \type {virtual://virtualfile.1} but future
versions might have a different name part, so best use the variable instead.
After all, you don't know the current number in advance anyway.
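
A minimal sketch of such a case, assuming content that uses \type {\startlines}
and therefore depends on the line endings being seen as in a file. The text
itself is made up for illustration.

\starttyping
-- a long string keeps the newlines and needs no escaped backslashes
local str = [[
\startlines
first line
second line
\stoplines
]]

context.viafile(str)
\stoptyping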

\stopsection

\startsection[title=Pretty printing]

In \CONTEXT\ \MKII\ there have always been pretty printing options. We needed
them for manuals and it was also handy to print sources in the same colors as the
editor uses. Most of those pretty printers work on a line|-|by|-|line basis, but
some are more complex, especially when comments or strings can span multiple
lines.

When the first versions of \LUATEX\ showed up, rewriting the \MKII\ code to use
\LUA\ was a nice exercise and the code was not that bad, but when \LPEG\ showed
up, I put it on the agenda to reimplement them again.

We only ship a few pretty printers. Users normally have their own preferences and
it's not easy to make general purpose pretty printers. This is why the new
framework is a bit more flexible and permits users to kick in their own code.

Pretty printing involves more than coloring some characters or words:

\startitemize[packed]
\startitem spaces should be honoured and can be visualized \stopitem
\startitem newlines and empty lines need to be honoured as well \stopitem
\startitem optionally lines have to be numbered but \stopitem
\startitem wrapped around lines should not be numbered \stopitem
\stopitemize
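
The regular verbatim environment already deals with some of these matters. As a
sketch, assuming that the \type {numbering} key of \type {\setuptyping} accepts
the value \type {line}, visible spaces and line numbers can be requested as
follows. The \type {space} key is also used later in this chapter and the
content is made up for illustration.

\startbuffer
\starttyping[space=on,numbering=line]
local a = 1

local b = 2
\stoptyping
\stopbuffer

\typebuffer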

It's not much fun to deal with these matters each time that you write a pretty
printer. This is why we can start with an existing one like the default pretty
printer. We show several ways of doing the same thing. We start with a simple
clone of the default parser. \footnote {In the meantime the lexer of the \SCITE\
editor that I used also provides a mechanism for using \LPEG\ based lexers.
Although in the pretty printing code we need a more liberal one, I might backport
the lexers I wrote for editing \TEX, \METAPOST, \LUA, \CLD, \XML\ and \PDF\ as a
variant for the ones we use in \MKIV\ now. That way we get similar colorschemes
which might be handy sometimes.}

\startbuffer
local P, V = lpeg.P, lpeg.V

local grammar = visualizers.newgrammar("default", {
  pattern    = V("default:pattern"),
  visualizer = V("pattern")^1
} )

local parser = P(grammar)

visualizers.register("test-0", { parser = parser })
\stopbuffer

\typebuffer \ctxluabuffer

We distinguish between grammars (tables with rules), parsers (a grammar turned
into an \LPEG\ expression), and handlers (collections of functions that can be
applied). All three are registered under a name and the verbatim commands can
refer to that name.

\startbuffer
\starttyping[option=test-0,color=]
Test 123,
test 456 and
test 789!
\stoptyping
\stopbuffer

\typebuffer

Nothing special happens here. We just get straightforward verbatim.

\getbuffer

Next we are going to color digits. We collect as many as possible in a row, so
that we minimize the calls to the colorizer.

\startbuffer
local patterns, P, V = lpeg.patterns, lpeg.P, lpeg.V

local function colorize(s)
  context.color{"darkred"}
  visualizers.writeargument(s)
end

local grammar = visualizers.newgrammar("default", {
  digit      = patterns.digit^1 / colorize,
  pattern    = V("digit") + V("default:pattern"),
  visualizer = V("pattern")^1
} )

local parser = P(grammar)

visualizers.register("test-1", { parser = parser })
\stopbuffer

\typebuffer \ctxluabuffer

Watch how we define a new rule for the digits and overload the pattern rule. We
can refer to the default rule by using a prefix. This is needed when we define a
rule with the same name.
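
The idea of matching a whole run of digits with one rule, and falling back on a
default rule otherwise, can be tried outside the visualizer framework. The
following sketch uses plain \LPEG: the rule \type {other} merely stands in for
\type {default:pattern} and the function \type {mark} for the colorizer, and both
are made up for this example. In \LUATEX\ the \type {lpeg} library is built in,
while in a stock \LUA\ interpreter it is assumed to be installed separately.

\starttyping
local lpeg = lpeg or require("lpeg")

local P, R, C, Ct, V = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct, lpeg.V

-- called once per run of digits, not once per digit
local function mark(s)
  return "<" .. s .. ">"
end

local grammar = P {
  "visualizer",
  digit      = R("09")^1 / mark,
  other      = C(P(1)), -- stand-in for the default rule: any character
  pattern    = V("digit") + V("other"),
  visualizer = Ct(V("pattern")^1),
}

local result = table.concat(lpeg.match(grammar,"Test 123!"))

-- result is now "Test <123>!"
\stoptyping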

\startbuffer
\starttyping[option=test-1,color=]
Test 123,
test 456 and
test 789!
\stoptyping
\stopbuffer

\typebuffer

This time the digits get colored.

\getbuffer

In a similar way we can colorize letters. As with the previous example, we use
\CONTEXT\ commands at the \LUA\ end.

\startluacode
local patterns, P, V = lpeg.patterns, lpeg.P, lpeg.V

local function colorize_lowercase(s)
  context.color{"darkgreen"}
  visualizers.writeargument(s)
end
local function colorize_uppercase(s)
  context.color{"darkblue"}
  visualizers.writeargument(s)
end

local grammar = visualizers.newgrammar("default", {

  lowercase = patterns.lowercase^1 / colorize_lowercase,
  uppercase = patterns.uppercase^1 / colorize_uppercase,

  pattern =
      V("lowercase")
    + V("uppercase")
    + V("default:pattern"),

  visualizer = V("pattern")^1

} )

local parser = P(grammar)

visualizers.register("test-2", { parser = parser })
\stopluacode

\startbuffer
\starttyping[option=test-2,color=]
Test 123,
test 456 and
test 789!
\stoptyping
\stopbuffer

\typebuffer

Again we get some coloring.

\getbuffer

It will be clear that the number of rules and functions is larger when we use a
more complex parser. It is for this reason that we can group functions in
handlers. We can also make a pretty printer configurable by defining handlers at
the \TEX\ end.

\startbuffer
\definestartstop
  [MyDigit]
  [style=bold,color=darkred]

\definestartstop
  [MyLowercase]
  [style=bold,color=darkgreen]

\definestartstop
  [MyUppercase]
  [style=bold,color=darkblue]
\stopbuffer

\typebuffer \getbuffer

The \LUA\ code now looks different. Watch out: we need an indirect call to, for
instance, \type {MyDigit} because a second argument can be passed (the settings
for this environment) and you don't want that to get passed to \type {MyDigit}
and friends.
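
Why the wrapper functions matter can be shown in plain \LUA. In this sketch
\type {command} plays the role of a command like \type {verbatim.MyDigit} and
\type {apply} the role of the code that calls a handler function. Both names are
hypothetical and only serve to count the arguments that arrive.

\starttyping
local seen = { }

-- stand-in for a command: it records how many arguments it receives
local function command(...)
  seen[#seen+1] = select("#",...)
end

-- stand-in for the caller: it always passes the settings along
local function apply(action,s,settings)
  action(s,settings)
end

local settings = { color = "darkred" }

apply(command,"123",settings)                    -- direct: two arguments
apply(function(s) command(s) end,"123",settings) -- indirect: one argument

-- seen is now { 2, 1 }
\stoptyping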

\startluacode
local patterns, P, V = lpeg.patterns, lpeg.P, lpeg.V
local pattern = visualizers.pattern
local verbatim = context.verbatim

local MyDigit     = verbatim.MyDigit
local MyLowercase = verbatim.MyLowercase
local MyUppercase = verbatim.MyUppercase

-- local handler = newhandler("default", {
--   digit     = function(s) MyDigit    (s) end,
--   lowercase = function(s) MyLowercase(s) end,
--   uppercase = function(s) MyUppercase(s) end,
-- } )

local handler = {
  digit     = function(s) MyDigit    (s) end,
  lowercase = function(s) MyLowercase(s) end,
  uppercase = function(s) MyUppercase(s) end,
}

local grammar = visualizers.newgrammar("default", {

  digit     = pattern(handler,"digit",     patterns.digit    ^1),
  lowercase = pattern(handler,"lowercase", patterns.lowercase^1),
  uppercase = pattern(handler,"uppercase", patterns.uppercase^1),

  pattern =
      V("lowercase")
    + V("uppercase")
    + V("digit")
    + V("default:pattern"),

  visualizer = V("pattern")^1

} )

local parser = P(grammar)

visualizers.register("test-3", { parser = parser, handler = handler })
\stopluacode

\startbuffer
\starttyping[option=test-3,color=]
Test 123,
test 456 and
test 789!
\stoptyping
\stopbuffer

\typebuffer

We get digits, upper- and lowercase characters colored:

\getbuffer

You can also use parsers that don't use \LPEG:

\startbuffer
local function parser(s)
  visualizers.write("["..s.."]")
end

visualizers.register("test-4", { parser = parser })
\stopbuffer

\typebuffer \ctxluabuffer

\startbuffer
\starttyping[option=test-4,space=on,color=darkred]
Test 123,
test 456 and
test 789!
\stoptyping
\stopbuffer

\typebuffer

The function \type {visualizers.write} takes care of spaces and newlines.

\getbuffer

We have a few more helpers:

\starttabulate[|||]
\NC \type{visualizers.write}          \NC interprets the argument and applies methods \NC \NR
\NC \type{visualizers.writenewline}   \NC goes to the next line (similar to \type {\par}) \NC \NR
\NC \type{visualizers.writeemptyline} \NC inserts an empty line (similar to \type {\blank}) \NC \NR
\NC \type{visualizers.writespace}     \NC inserts a (visible) space \NC \NR
\NC \type{visualizers.writedefault}   \NC writes the argument verbatim without interpretation \NC \NR
\stoptabulate

These mechanisms have quite some overhead in terms of function calls. In the
worst case each token needs a (nested) call. However, doing all this at the \TEX\
end also comes at a price. So, in practice this approach is more flexible without
incurring too large a penalty.

In all these examples we typeset the text verbatim: what is keyed in normally
comes out (with or without colors), so spaces stay spaces and linebreaks are
kept. This is not a requirement, as the next parser shows.

\startbuffer
local function parser(s)
  local s = string.gsub(s,"show","demonstrate")
  local s = string.gsub(s,"'re"," are")
  context(s)
end

visualizers.register("test-5", { parser = parser })
\stopbuffer

\typebuffer \ctxluabuffer
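
The string manipulation in this parser can be checked in isolation. One detail
worth knowing is that \type {string.gsub} returns two values: the new string and
the number of replacements. Assigning the result to a single variable, as above,
or wrapping the call in parentheses keeps only the string. The function name
\type {rewrite} is made up for this sketch.

\starttyping
local function rewrite(s)
  -- the parentheses drop the second return value of gsub (the count)
  s = (string.gsub(s,"show","demonstrate"))
  s = (string.gsub(s,"'re"," are"))
  return s
end

local result = rewrite("we're here to show")

-- result is now "we are here to demonstrate"
\stoptyping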

\startbuffer
\starttyping[option=test-5,color=darkred,style=]
This is just some text to show what we can do with this mechanism. In
spite of what you might think we're not bound to verbose text.
\stoptyping
\stopbuffer

We can apply this visualizer as follows:

\typebuffer

This time the text gets properly aligned:

\getbuffer

It often makes sense to use a buffer:

\startbuffer
\startbuffer[demo]
This is just some text to show what we can do with this mechanism. In
spite of what you might think we're not bound to verbose text.
\stopbuffer
\stopbuffer

\typebuffer \getbuffer

Instead of processing the buffer in verbatim mode you can then
process it directly:

\startbuffer
\setuptyping[file][option=test-5,color=darkred,style=]
\ctxluabuffer[demo]
\stopbuffer

\typebuffer

Which gives:

\start \getbuffer \stop

In this case, the space is a normal space and not the fixed verbatim space, which
looks better.

\stopsection

\stopchapter

\stopcomponent