% language=us
% still-simple.tex, size: 8857 b, last modification: 2023-12-21 09:43

\environment still-environment

\startcomponent still-simple

\startchapter[title=Removing something (typeset)]

\startsection[title=Introduction]

The primitive \type {\unskip} often comes in handy when you want to remove a
space (or more precisely: a glue item) but sometimes you want to remove more.
Consider for instance the case where a sentence is built up stepwise from data.
At some point you need to insert some punctuation but as you cannot look ahead it
needs to be delayed. Keeping track of accumulated content is no fun, and a quick
and dirty solution is to just inject it and remove it when needed. One way to
achieve this is to wrap this optional content in a box with special dimensions.
Just before the next snippet is injected we can look back for that box (that can
then be recognized by those special dimensions) and either remove it or unbox it
back into the stream.

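A minimal sketch of the box trick in plain \TEX\ could look as follows. The
names \type {\markedsnippet}, \type {\removemarkedsnippet} and \type
{\flushmarkedsnippet}, as well as the signal depth, are made up for this
illustration, and this is not necessarily how \CONTEXT\ implements its marking.
The sketch assumes that we are in horizontal mode and that the marked box is the
very last item in the list, so a trailing space has to be removed first. It only
shows the principle of recognizing a box by an unusual dimension.

\starttyping
\newbox\markedbox
\newdimen\markedsignal \markedsignal=-12345sp % hypothetical signal depth

% wrap optional content in a box that carries the signal depth
\def\markedsnippet#1%
  {\setbox\markedbox\hbox{#1}%
   \dp\markedbox\markedsignal
   \box\markedbox}

% drop the preceding box when it is a marked one
\def\removemarkedsnippet
  {\setbox\markedbox\lastbox
   \ifvoid\markedbox
     % there was no box to look at
   \else\ifdim\dp\markedbox=\markedsignal
     % ours: just forget about it
   \else
     \box\markedbox % not ours: put it back untouched
   \fi\fi}

% keep the preceding box: unboxing brings back the real dimensions
\def\flushmarkedsnippet
  {\setbox\markedbox\lastbox
   \ifvoid\markedbox
     % there was no box to look at
   \else\ifdim\dp\markedbox=\markedsignal
     \unhbox\markedbox
   \else
     \box\markedbox
   \fi\fi}

John Foo\markedsnippet{, }\removemarkedsnippet\space and others.
\stoptyping
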
To be honest, one seldom needs this feature. In fact I never needed it until
Alan Braslau and I were messing around with (indeed messy) bibliographic
rendering and we thought it would be handy to have a helper that could remove
punctuation. Think of situations like this:

\starttyping
John Foo, Mary Bar and others.
John Foo, Mary Bar, and others.
\stoptyping

One can imagine this list to be constructed programmatically, in which case the
comma before the \type {and} can be superfluous. So, the \type {and others} can
be done like this:

\startbuffer
\def\InjectOthers
  {\removeunwantedspaces
   \removepunctuation
   \space and others}

John Foo, Mary Bar, \InjectOthers.
\stopbuffer

\typebuffer

Notice that we first remove spaces. This will give:

\blank {\bf \getbuffer} \blank

where the commas after the names are coming from some not|-|too|-|clever automatism
or are the side effect of lazy programming. In the sections below I will describe
a somewhat more generic mechanism and also present a solution for non|-|\CONTEXT\ users.

\stopsection

\startsection[title=Marked content]

The example above can be rewritten in a more general way. We define a
couple of macros (using \CONTEXT\ functionality):

\startbuffer
\def\InjectComma
  {\markcontent
     [punctuation]
     {\removeunwantedspaces,\space}}

\def\InjectOthers
  {\removemarkedcontent[punctuation]%
   \space and others}
\stopbuffer

\typebuffer \getbuffer

These can be used as:

\startbuffer
John Foo\InjectComma Mary Bar\InjectComma \InjectOthers.
\stopbuffer

\typebuffer

Which gives us:

\blank {\bf \getbuffer} \blank

Normally one doesn't need this kind of magic for lists because the length of the
list is known and injection can be done using the index in the list. Here is a more
practical example:

\startbuffer
\def\SomeTitle {Just a title}
\def\SomeAuthor{Just an author}
\def\SomeYear  {2015}
\stopbuffer

\typebuffer \getbuffer

We paste the three snippets together:

\startbuffer
\SomeTitle,\space \SomeAuthor\space (\SomeYear).
\stopbuffer

\typebuffer \blank {\bf \getbuffer} \blank

But to get even more abstract, we can do this:

\startbuffer
\def\PlaceTitle
  {\SomeTitle
   \markcontent[punctuation]{.}}

\def\PlaceAuthor
  {\removemarkedcontent[punctuation]%
   \markcontent[punctuation]{,\space}%
   \SomeAuthor
   \markcontent[punctuation]{,\space}}

\def\PlaceYear
  {\removemarkedcontent[punctuation]%
   \space(\SomeYear)%
   \markcontent[punctuation]{.}}
\stopbuffer

\typebuffer \getbuffer

When used as:

\startbuffer
\PlaceTitle\PlaceAuthor\PlaceYear
\stopbuffer

\typebuffer

we get the output:

\blank {\bf \getbuffer} \blank

However, when there is no author:

\startbuffer
\def\SomeAuthor{}

\PlaceTitle\PlaceAuthor\PlaceYear
\stopbuffer

\typebuffer

we get:

\blank {\bf \getbuffer} \blank

Even more clever is this:

\def\PlaceYear
  {\removemarkedcontent[punctuation]%
   \markcontent[punctuation]{\space(\SomeYear).}}

\startbuffer
\def\SomeAuthor{}
\def\SomeYear{}
\def\SomePeriod{\removemarkedcontent[punctuation].}

\PlaceTitle\PlaceAuthor\PlaceYear\SomePeriod
\stopbuffer

\typebuffer

The output is:

\blank {\bf \getbuffer} \blank

Of course we can just test for a variable like \type {\SomeAuthor} being empty
before we place punctuation, but there are cases where a period becomes a comma
or a comma becomes a semicolon. Especially with bibliographies your worst
typographical nightmares come true, so it is handy to have such a mechanism
available when it's needed.

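For comparison, the explicit test mentioned in the previous paragraph could be
sketched like this. The names \type {\PlaceAuthorTested} and \type
{\PlaceYearTested} are made up for the illustration, and \type {\empty} is
assumed to be the usual empty macro that plain \TEX\ and \CONTEXT\ provide. Each
macro decides about its own leading punctuation, which works as long as the
punctuation does not depend on what comes next.

\starttyping
\def\PlaceAuthorTested
  {\ifx\SomeAuthor\empty \else
     ,\space\SomeAuthor
   \fi}

\def\PlaceYearTested
  {\ifx\SomeYear\empty \else
     \space(\SomeYear)%
   \fi}

\SomeTitle\PlaceAuthorTested\PlaceYearTested.
\stoptyping
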
\stopsection

\startsection[title=A plain solution]

For users of \LUATEX\ who don't want to use \CONTEXT\ I will now present an
alternative implementation. Of course more clever variants are possible but the
principle remains the same. The trick is simple enough to show here as an example
of \LUA\ coding as it doesn't need much help from the infrastructure that the
macro package provides. The only pitfall is the signal that is used (an attribute
number) but you can set another one if needed. We use the \type {gadgets}
namespace to isolate the code.

\startbuffer
\directlua {
  gadgets         = gadgets or { }
  local marking   = { }
  gadgets.marking = marking

  local marksignal   = 5001
  local lastmarked   = 0
  local marked       = { }
  local local_par    = 6
  local whatsit_node = 8

  function marking.setsignal(n)
    marksignal = tonumber(n) or marksignal
  end

  function marking.mark(str)
    local currentmarked = marked[str]
    if not currentmarked then
      lastmarked    = lastmarked + 1
      currentmarked = lastmarked
      marked[str]   = currentmarked
    end
    tex.setattribute(marksignal,currentmarked)
  end

  function marking.remove(str)
    local attr = marked[str]
    if not attr then
      return
    end
    local list = tex.nest[tex.nest.ptr]
    if list then
      local head = list.head
      local tail = list.tail
      local last = tail
      if last[marksignal] == attr then
        local first = last
        while true do
          local prev = first.prev
          if not prev or prev[marksignal] ~= attr or
               (prev.id == whatsit_node and
                  prev.subtype == local_par) then
            break
          else
            first = prev
          end
        end
        if first == head then
          list.head = nil
          list.tail = nil
        else
          local prev = first.prev
          list.tail  = prev
          prev.next  = nil
        end
        node.flush_list(first)
      end
    end
  end
}
\stopbuffer

\typebuffer \getbuffer

These functions are called from macros. We use symbolic names for the marked
snippets. We could have used numbers but meaningful tags can be supported with
negligible overhead. The remover starts at the end of the current list and
goes backwards till no matching attribute value is seen. When a valid range is
found it gets removed.

\startbuffer
\def\setmarksignal#1%
  {\directlua{gadgets.marking.setsignal(\number#1)}}

\def\marksomething#1#2%
  {{\directlua{gadgets.marking.mark("#1")}{#2}}}

\def\unsomething#1%
  {\directlua{gadgets.marking.remove("#1")}}
\stopbuffer

\typebuffer \getbuffer

How these macros work is best shown with a few examples:

\startbuffer
before\marksomething{gone}{\em HERE}\unsomething{gone}after
before\marksomething{kept}{\em HERE}\unsomething{gone}after
\marksomething{gone}{\em HERE}\unsomething{gone}last
\marksomething{kept}{\em HERE}\unsomething{gone}last
\stopbuffer

\typebuffer

This renders as: \blank \startlines\bf\getbuffer\stoplines

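When attribute 5001 is already taken by other code, the signal can be changed
with \type {\setmarksignal}. The number 4711 below is an arbitrary choice for the
sake of the example. Because the remover only compares against the current
signal, content that was marked under the old number is no longer recognized, so
such a switch is best made once, before anything gets marked.

\starttyping
\setmarksignal{4711}

before\marksomething{gone}{\em HERE}\unsomething{gone}after
\stoptyping
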
The remover has to watch out for the beginning of a paragraph, which is marked
by a local par whatsit. If we removed that node, \LUATEX\ would crash because the
list head (currently) cannot be set to nil. This is no big deal because this
macro is not meant to clean up across paragraphs.

A close look at \type {\marksomething} reveals an extra level of grouping in its
definition. This is needed to make content that uses \type {\aftergroup}
trickery work correctly. Here is another example:

\startbuffer
\def\SnippetOne  {first\marksomething{punctuation}{, }}
\def\SnippetTwo  {second\marksomething{punctuation}{, }}
\def\SnippetThree{\unsomething{punctuation} and third.}
\stopbuffer

\typebuffer \getbuffer

We can paste these snippets together and make the last one use \type {and}
instead of a comma.

\startbuffer
\SnippetOne \SnippetTwo  \SnippetThree\par
\SnippetOne \SnippetThree\par
\stopbuffer

\typebuffer

We get: \blank {\bf \getbuffer} \blank

Of course in practice one probably knows how many snippets there are and using a
counter to keep track of the state is more efficient than first typesetting
something and removing it afterwards. But still it looks like a cool feature and
it can come in handy at some point, as with the title|-|author|-|year example given
before.

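The counter approach mentioned in the previous paragraph could look like this in
plain \TEX. The names \type {\StartSnippets} and \type {\Snippet} and both
counters are made up for the example, and the sketch assumes that the number of
snippets is known before the first one is placed. Each snippet puts its own
separator in front, chosen from its position in the list, so nothing has to be
removed afterwards.

\starttyping
\newcount\SnippetTotal
\newcount\SnippetIndex

\def\StartSnippets#1%
  {\SnippetTotal=#1\relax
   \SnippetIndex=0 }

\def\Snippet#1%
  {\advance\SnippetIndex by 1
   \ifnum\SnippetIndex=1
     % the first snippet needs no separator
   \else\ifnum\SnippetIndex=\SnippetTotal
     \space and\space
   \else
     ,\space
   \fi\fi
   #1}

\StartSnippets{3}\Snippet{first}\Snippet{second}\Snippet{third}.\par
\StartSnippets{2}\Snippet{first}\Snippet{third}.\par
\stoptyping
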
The plain code shown here is in the distribution in the file \type
{luatex-gadgets} and gets preloaded in the \type {luatex-plain} format.

\stopsection

\stopchapter

\stopcomponent