\environment still-environment

\startcomponent still-simple

\startchapter[title=Removing something (typeset)]

\startsection[title=Introduction]

The primitive \type {\unskip} often comes in handy when you want to remove a
space (or more precisely: a glue item) but sometimes you want to remove more.
Consider for instance the case where a sentence is built up stepwise from data.
At some point you need to insert some punctuation but as you cannot look ahead it
needs to be delayed. Keeping track of accumulated content is no fun, and a quick
and dirty solution is to just inject it and remove it when needed. One way to
achieve this is to wrap this optional content in a box with special dimensions.
Just before the next snippet is injected we can look back for that box (that can
then be recognized by those special dimensions) and either remove it or unbox it
back into the stream.
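
For the simple case of a single trailing space, \type {\unskip} alone already
suffices. The following is a minimal sketch, not taken from any macro package;
the name \type {\AppendComma} is made up for this illustration:

\starttyping
% hypothetical helper: drop the glue in front, then add a comma and a space
\def\AppendComma{\unskip,\space}

John Foo \AppendComma Mary Bar
\stoptyping

Here the space after \type {Foo} is removed before the comma is typeset, so the
result should read \quotation {John Foo, Mary Bar}. Anything other than glue at
the end of the list is left alone, which is why the box approach is needed when
punctuation itself has to go.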

To be honest, one seldom needs this feature. In fact I never needed it until
Alan Braslau and I were messing around with (indeed messy) bibliographic
rendering and we thought it would be handy to have a helper that could remove
punctuation. Think of situations like this:

\starttyping
John Foo, Mary Bar and others.
John Foo, Mary Bar, and others.
\stoptyping

One can imagine this list to be constructed programmatically, in which case the
comma before the \type {and} can be superfluous. So, the \type {and others} can
be done like this:

\startbuffer
\def\InjectOthers
  {\removeunwantedspaces
   \removepunctuation
   \space and others}

John Foo, Mary Bar, \InjectOthers.
\stopbuffer

\typebuffer

Notice that we first remove spaces. This will give:

\blank {\bf \getbuffer} \blank

where the commas after the names are coming from some not-too-clever automatism
or are the side effect of lazy programming. In the sections below I will describe
a somewhat more generic mechanism and also present a solution for non-\CONTEXT\
users.

\stopsection

\startsection[title=Marked content]

The example above can be rewritten in a more general way. We define a
couple of macros (using \CONTEXT\ functionality):

\startbuffer
\def\InjectComma
  {\markcontent
     [punctuation]
     {\removeunwantedspaces,\space}}

\def\InjectOthers
  {\removemarkedcontent[punctuation]
   \space and others}
\stopbuffer

\typebuffer \getbuffer

These can be used as:

\startbuffer
John Foo\InjectComma Mary Bar\InjectComma \InjectOthers.
\stopbuffer

\typebuffer

Which gives us:

\blank {\bf \getbuffer} \blank

Normally one doesn't need this kind of magic for lists because the length of the
list is known and injection can be done using the index in the list. Here is a more
practical example:

\startbuffer
\def\SomeTitle {Just a title}
\def\SomeAuthor{Just an author}
\def\SomeYear  {2015}
\stopbuffer

\typebuffer \getbuffer

We paste the three snippets together:

\startbuffer
\SomeTitle,\space \SomeAuthor\space (\SomeYear).
\stopbuffer

\typebuffer \blank {\bf \getbuffer} \blank

But to get even more abstract, we can do this:

\startbuffer
\def\PlaceTitle
  {\SomeTitle
   \markcontent[punctuation]{.}}

\def\PlaceAuthor
  {\removemarkedcontent[punctuation]
   \markcontent[punctuation]{,\space}
   \SomeAuthor
   \markcontent[punctuation]{,\space}}

\def\PlaceYear
  {\removemarkedcontent[punctuation]
   \space(\SomeYear)
   \markcontent[punctuation]{.}}
\stopbuffer

\typebuffer \getbuffer

When used as:

\startbuffer
\PlaceTitle\PlaceAuthor\PlaceYear
\stopbuffer

\typebuffer

we get the output:

\blank {\bf \getbuffer} \blank

But when we have no author:

\startbuffer
\def\SomeAuthor{}

\PlaceTitle\PlaceAuthor\PlaceYear
\stopbuffer

\typebuffer

we get:

\blank {\bf \getbuffer} \blank

Even more clever is this:

\def\PlaceYear
  {\removemarkedcontent[punctuation]
   \markcontent[punctuation]{\space(\SomeYear).}}

\startbuffer
\def\SomeAuthor{}
\def\SomeYear{}
\def\SomePeriod{\removemarkedcontent[punctuation].}

\PlaceTitle\PlaceAuthor\PlaceYear\SomePeriod
\stopbuffer

\typebuffer

The output is:

\blank {\bf \getbuffer} \blank

Of course we can just test for a variable like \type {\SomeAuthor} being empty
before we place punctuation, but there are cases where a period becomes a comma
or a comma becomes a semicolon. Especially with bibliographies your worst
typographical nightmares come true, so it is handy to have such a mechanism
available when it's needed.
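
The plain test mentioned above could look like the following sketch. It assumes
the \CONTEXT\ helper \type {\doifsomething}, which expands its first argument
and only processes the second one when the expansion is not empty; the macro
name \type {\PlaceAuthorTested} is made up for this illustration.

\starttyping
% hypothetical variant: only place the comma and author when there is one
\def\PlaceAuthorTested
  {\doifsomething{\SomeAuthor}
     {,\space\SomeAuthor}}
\stoptyping

This works as long as the punctuation depends only on the snippet at hand. Once
it also depends on what follows, every macro needs to know about its neighbours,
and that is where marking and removing content pays off.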

\stopsection

\startsection[title=A plain solution]

For users of \LUATEX\ who don't want to use \CONTEXT\ I will now present an
alternative implementation. Of course more clever variants are possible but the
principle remains. The trick is simple enough to show here as an example of \LUA\
coding as it doesn't need much help from the infrastructure that the macro
package provides. The only pitfall is the signal used (attribute number) but you
can set another one if needed. We use the \type {gadgets} namespace to isolate
the code.

\startbuffer
\directlua {
    gadgets = gadgets or { }
    local marking = { }
    gadgets.marking = marking

    local marksignal = 5001
    local lastmarked = 0
    local marked = { }
    local local_par = 6
    local whatsit_node = 8

    function marking.setsignal(n)
        marksignal = tonumber(n) or marksignal
    end

    function marking.mark(str)
        local currentmarked = marked[str]
        if not currentmarked then
            lastmarked = lastmarked + 1
            currentmarked = lastmarked
            marked[str] = currentmarked
        end
        tex.setattribute(marksignal,currentmarked)
    end

    function marking.remove(str)
        local attr = marked[str]
        if not attr then
            return
        end
        local list = tex.nest[tex.nest.ptr]
        if list then
            local head = list.head
            local tail = list.tail
            local last = tail
            if last[marksignal] == attr then
                local first = last
                while true do
                    local prev = first.prev
                    if not prev or prev[marksignal] ~= attr or
                        (prev.id == whatsit_node and
                            prev.subtype == local_par) then
                        break
                    else
                        first = prev
                    end
                end
                if first == head then
                    list.head = nil
                    list.tail = nil
                else
                    local prev = first.prev
                    list.tail = prev
                    prev.next = nil
                end
                node.flush_list(first)
            end
        end
    end
}
\stopbuffer

\typebuffer \getbuffer

These functions are called from macros. We use symbolic names for the marked
snippets. We could have used numbers but meaningful tags can be supported with
negligible overhead. The remover starts at the end of the current list and
goes backwards till no matching attribute value is seen. When a valid range is
found it gets removed.

\startbuffer
\def\setmarksignal#1%
  {\directlua{gadgets.marking.setsignal(\number#1)}}

\def\marksomething#1#2%
  {{\directlua{gadgets.marking.mark("#1")}{#2}}}

\def\unsomething#1%
  {\directlua{gadgets.marking.remove("#1")}}
\stopbuffer

\typebuffer \getbuffer
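
When attribute \type {5001} is already taken by other code, the signal can be
changed before anything gets marked. A sketch, in which the number is an
arbitrary choice and only serves as illustration:

\starttyping
% 6001 is a hypothetical free attribute number
\setmarksignal{6001}
\stoptyping

Changing the signal after content has been marked makes earlier marks
invisible to the remover, so this is best done once, right after loading.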

The working of these macros can best be shown from a few examples:

\startbuffer
before\marksomething{gone}{\em HERE}\unsomething{gone}after
before\marksomething{kept}{\em HERE}\unsomething{gone}after
\marksomething{gone}{\em HERE}\unsomething{gone}last
\marksomething{kept}{\em HERE}\unsomething{gone}last
\stopbuffer

\typebuffer

This renders as: \blank \startlines\bf\getbuffer\stoplines

The remover needs to watch for the beginning of a paragraph, which is marked by
a local par whatsit. If we removed that, \LUATEX\ would crash because the list
head (currently) cannot be set to nil. This is no big deal because this macro is
not meant to clean up across paragraphs.

A close look at the definition of \type {\marksomething} will reveal
an extra grouping in the definition. This is needed to make content that uses
\type {\aftergroup} trickery work correctly. Here is another example:

\startbuffer
\def\SnippetOne  {first\marksomething{punctuation}{, }}
\def\SnippetTwo  {second\marksomething{punctuation}{, }}
\def\SnippetThree{\unsomething{punctuation} and third.}
\stopbuffer

\typebuffer \getbuffer

We can paste these snippets together and make the last one use \type {and}
instead of a comma.

\startbuffer
\SnippetOne \SnippetTwo \SnippetThree\par
\SnippetOne \SnippetThree\par
\stopbuffer

\typebuffer

We get: \blank {\bf \getbuffer} \blank

Of course in practice one probably knows how many snippets there are and using a
counter to keep track of the state is more efficient than first typesetting
something and removing it afterwards. But still it looks like a cool feature and
it can come in handy at some point, as with the title-author-year example given
before.
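
The counter approach mentioned above could be sketched as follows. This is plain
\TEX\ code where all names are made up for this illustration, and the total
number of snippets is assumed to be known beforehand:

\starttyping
\newcount\SnippetIndex % hypothetical: the snippets placed so far
\newcount\SnippetTotal % hypothetical: the number of snippets

\def\Snippet#1%
  {\advance\SnippetIndex by 1
   \ifnum\SnippetIndex>1
     \ifnum\SnippetIndex=\SnippetTotal
       \space and\space
     \else
       ,\space
     \fi
   \fi
   #1}

\SnippetIndex=0 \SnippetTotal=3
\Snippet{first}\Snippet{second}\Snippet{third}.
\stoptyping

Each snippet places its separator in front instead of behind, so nothing has
to be removed afterwards. With three snippets this should give the same
\quotation {first, second and third.} as the marked variant.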

The plain code shown here is in the distribution in the file \type
{luatex-gadgets} and gets preloaded in the \type {luatex-plain} format.

\stopsection

\stopchapter

\stopcomponent