% language=us runpath=texruns:manuals/xml

\environment xml-mkiv-style

\startcomponent xml-mkiv-lookups

\startchapter[title={Lookups using lpaths}]

\startsection[title={introduction}]

The following examples are not very systematic. They resulted from tests
with different documents, and the current implementation evolved out of
experimental code. For instance, I decided to add support for multiple
expressions in a row after a few email exchanges with Jean|-|Michel Huffen.

One of the main differences between the way \XSLT\ resolves a path and our way is
the anchor. Take:

\starttyping
/something
something
\stoptyping

The first one anchors in the current (!) element so it will only consider direct
children. The second one does a deep lookup and looks at the descendants as well.
Furthermore we have a few extra shortcuts like \type {**} in \type {a/**/b} which
represents all descendants.

The expressions (between square brackets) have to be valid \LUA\ and some
preprocessing is done to resolve the built-in functions. So, you might use code
like:

\starttyping
my_lpeg_expression:match(text()) == "whatever"
\stoptyping

given that \type {my_lpeg_expression} is known. In the examples below we use the
visualizer to show the steps. Some are shown more than once as part of a set.

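As a rough sketch of what \quote {known} could mean here: the pattern can be
defined as a global at the \LUA\ end before the lookup takes place. The pattern
below and the element name \type {whocares} are hypothetical, and the sketch
assumes that the text of the element starts with the string that is matched
(leading spaces would make the match fail). That the method call passes the
preprocessing untouched is taken from the description above and has not been
checked separately.

\starttyping
\startluacode
-- hypothetical pattern: matches text that starts with "whatever" and
-- returns that string as a capture
my_lpeg_expression = lpeg.C(lpeg.P("whatever"))
\stopluacode

\xmlfirst{#1}{whocares[my_lpeg_expression:match(text()) == "whatever"]}
\stoptyping
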
\stopsection

\startsection[title={special cases}]

\xmllshow{}
\xmllshow{*}
\xmllshow{.}
\xmllshow{/}

\stopsection

\startsection[title={wildcards}]

\xmllshow{*}
\xmllshow{*:*}
\xmllshow{/*}
\xmllshow{/*:*}
\xmllshow{*/*}
\xmllshow{*:*/*:*}

\xmllshow{a/*}
\xmllshow{a/*:*}
\xmllshow{/a/*}
\xmllshow{/a/*:*}

\xmllshow{/*}
\xmllshow{/**}
\xmllshow{/***}

\stopsection

\startsection[title={multiple steps}]

\xmllshow{answer}
\xmllshow{answer/test/*}
\xmllshow{answer/test/child::}
\xmllshow{answer/*}
\xmllshow{answer/*[tag()='p' and position()=1 and text()!='']}

\stopsection

\startsection[title={pitfalls}]

\xmllshow{[oneof(lower(@encoding),'tex','context','ctx')]}
\xmllshow{.[oneof(lower(@encoding),'tex','context','ctx')]}

\stopsection

\startsection[title={more special cases}]

\xmllshow{**}
\xmllshow{*}
\xmllshow{..}
\xmllshow{.}
\xmllshow{//}
\xmllshow{/}

\xmllshow{**/}
\xmllshow{**/*}
\xmllshow{**/.}
\xmllshow{**//}

\xmllshow{*/}
\xmllshow{*/*}
\xmllshow{*/.}
\xmllshow{*//}

\xmllshow{/**/}
\xmllshow{/**/*}
\xmllshow{/**/.}
\xmllshow{/**//}

\xmllshow{/*/}
\xmllshow{/*/*}
\xmllshow{/*/.}
\xmllshow{/*//}

\xmllshow{./}
\xmllshow{./*}
\xmllshow{./.}
\xmllshow{.//}

\xmllshow{../}
\xmllshow{../*}
\xmllshow{../.}
\xmllshow{..//}

\stopsection

\startsection[title={more wildcards}]

\xmllshow{one//two}
\xmllshow{one/*/two}
\xmllshow{one/**/two}
\xmllshow{one/***/two}
\xmllshow{one/x//two}
\xmllshow{one//x/two}
\xmllshow{//x/two}

\stopsection

\startsection[title={special axes}]

\xmllshow{descendant::whocares/ancestor::whoknows}
\xmllshow{descendant::whocares/ancestor::whoknows/parent::}
\xmllshow{descendant::whocares/ancestor::}
\xmllshow{child::something/child::whatever/child::whocares}
\xmllshow{child::something/child::whatever/child::whocares|whoknows}
\xmllshow{child::something/child::whatever/child::(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::!(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::(whocares)}
\xmllshow{child::something/child::whatever/child::(whocares)[position()>2]}
\xmllshow{child::something/child::whatever[position()>2][position()=1]}
\xmllshow{child::something/child::whatever[whocares][whocaresnot]}
\xmllshow{child::something/child::whatever[whocares][not(whocaresnot)]}
\xmllshow{child::something/child::whatever/self::whatever}

There is also \type {last-match::} that starts with the last found set of nodes.
This can save some run time when you do lots of tests combined with the same check
afterwards. There is however one pitfall: you never know what happens to that
last match in setups that get called in a nested fashion. Take the following
example:

\starttyping
\startbuffer[test]
<something>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 1
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 2
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 3
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
</something>
\stopbuffer
\stoptyping

One way to filter the content is this:

\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
    some action
}
\stoptyping

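The \type {#1} in these snippets is the current node as seen inside a setup. A
minimal sketch of how such a filter could be hooked up to the buffer above
follows. The setup names \type {xml:test:base} and \type {xml:test:something}
and the document name \type {main} are assumptions made for this example only.

\starttyping
\startxmlsetups xml:test:base
    \xmlsetsetup{#1}{something}{xml:test:*}
\stopxmlsetups

\xmlregistersetup{xml:test:base}

\startxmlsetups xml:test:something
    % here #1 refers to the <something> element
    \xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
        some action
    }
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{test}{}
\stoptext
\stoptyping
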
It is not unlikely that you will do something like this:

\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
    \xmlfirst{#1}{/crap/crapa/crapb/crapc/crapd/crape}
}
\stoptyping

This means that the path is resolved twice but that can be avoided as
follows:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlfirst{#1}{last-match::}
}
\stoptyping

But the next is not guaranteed to work:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlfirst{#1}{last-match::}
    \xmllast{#1}{last-match::}
}
\stoptyping

Because the first call may itself involve some lookup, the last match can be
replaced and the second call will then give unexpected results. You can overcome
this with:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlpushmatch
    \xmlfirst{#1}{last-match::}
    \xmlpopmatch
}
\stoptyping

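Applied to the case with two calls, this could look as follows. It is a sketch
that assumes that \type {\xmlpopmatch} restores the match that was current when
\type {\xmlpushmatch} was called, so that the second call starts from the
original set of nodes again.

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    % save the match found by \xmldoif
    \xmlpushmatch
    \xmlfirst{#1}{last-match::}
    % restore it, whatever the previous call did to it
    \xmlpopmatch
    \xmllast{#1}{last-match::}
}
\stoptyping
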
Does it pay off? Here are some timings of running a test and lookup like the
previous one 10.000 times (on a decent January 2016 laptop):

\starttabulate[|r|l|]
\NC 0.239 \NC \type {\xmldoif {...} {...}}                                     \NC \NR
\NC 0.292 \NC \type {\xmlfirst {...} {...}}                                    \NC \NR
\NC 0.538 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {...}}             \NC \NR
\NC 0.338 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {last-match::}}    \NC \NR
\NC 0.349 \NC \type {+ \xmldoif {...} {...} + \xmlfirst {...} {last-match::}-} \NC \NR
\stoptabulate

So, pushing and popping (the last row) is a bit slower than not doing that but it
is still much faster than not using \type {last-match::} at all. As a shortcut
you can use \type {=}, as in:

\starttyping
\xmlfirst{#1}{=}
\stoptyping

You can even do this:

\starttyping
\xmlall{#1}{last-match::/text()}
\stoptyping

or

\starttyping
\xmlall{#1}{=/text()}
\stoptyping

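Timings like the ones in the table above can be collected from within a setup.
The following is only a sketch: it assumes the helpers \type {\testfeatureonce}
and \type {\elapsedtime} from the \CONTEXT\ core and reuses the path from the
earlier examples. The setup name is made up and the numbers will differ per
machine.

\starttyping
\startxmlsetups xml:test:timing
    % run the test and the lookup 10000 times; the result goes into a
    % box so that nothing ends up on the page
    \testfeatureonce {10000} {
        \xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
            \setbox\scratchbox\hbox{\xmlfirst {#1} {last-match::}}
        }
    }
    % report the time used by the loop
    \elapsedtime
\stopxmlsetups
\stoptyping
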
\stopsection

\startsection[title={some more examples}]

\xmllshow{/something/whatever}
\xmllshow{something/whatever}
\xmllshow{/**/whocares}
\xmllshow{whoknows/whocares}
\xmllshow{whoknows}
\xmllshow{whocares[contains(text(),'f') or contains(text(),'g')]}
\xmllshow{whocares/first()}
\xmllshow{whocares/last()}
\xmllshow{whatever/all()}
\xmllshow{whocares/position(2)}
\xmllshow{whocares/position(-2)}
\xmllshow{whocares[1]}
\xmllshow{whocares[-1]}
\xmllshow{whocares[2]}
\xmllshow{whocares[-2]}
\xmllshow{whatever[3]/attribute(id)}
\xmllshow{whatever[2]/attribute('id')}
\xmllshow{whatever[3]/text()}
\xmllshow{/whocares/first()}
\xmllshow{/whocares/last()}

\xmllshow{xml://whatever/all()}
\xmllshow{whatever/all()}
\xmllshow{//whocares}
\xmllshow{..[2]}
\xmllshow{../*[2]}

\xmllshow{/(whocares|whocaresnot)}
\xmllshow{/!(whocares|whocaresnot)}
\xmllshow{/!whocares}

\xmllshow{/interface/command/command(xml:setups:register)}
\xmllshow{/interface/command[@name='xxx']/command(xml:setups:typeset)}
\xmllshow{/arguments/*}
\xmllshow{/sequence/first()}
\xmllshow{/arguments/text()}
\xmllshow{/sequence/variable/first()}
\xmllshow{/interface/define[@name='xxx']/first()}
\xmllshow{/parameter/command(xml:setups:parameter:measure)}

\xmllshow{/(*:library|figurelibrary)/*:figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/label}
\xmllshow{/(*:library|figurelibrary)/figure:*/label}

\xmllshow{whatever//br[tag(1)='br']}

\stopsection

\stopchapter

\stopcomponent