% language=us runpath=texruns:manuals/xml
% xml-mkiv-converter.tex, last modification: 2021-10-28 13:50

\environment xml-mkiv-style

\startcomponent xml-mkiv-converter

\startchapter[title={Setting up a converter}]

\startsection[title={from structure to setup}]

We use a very simple document structure for demonstrating how a converter is
defined. In practice a mapping will be more complex, especially when we have a
style with complex chapter openings using data coming from all kinds of places,
different styling of sections with the same name, selectively (out of order)
flushed content, special formatting, etc.

\typefile{xml-mkiv-03.xml}

Say that this document is stored in the file \type {demo.xml}, then the following
code can be used as a starting point:

\starttyping
\startxmlsetups xml:demo:base
  \xmlsetsetup{#1}{document|section|p}{xml:demo:*}
\stopxmlsetups

\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:document
  \starttitle[title={Contents}]
    \placelist[chapter]
  \stoptitle
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:section
  \startchapter[title=\xmlfirst{#1}{/title}]
    \xmlfirst{#1}{/content}
  \stopchapter
\stopxmlsetups

\startxmlsetups xml:demo:p
  \xmlflush{#1}\endgraf
\stopxmlsetups

\xmlprocessfile{demo}{demo.xml}{}
\stoptyping

Watch out! These are not just setups, but specific \XML\ setups which get an
argument passed (the \type {#1}). If for some reason your \XML\ processing fails,
it might be that you have mistakenly used a normal setup definition. The argument
\type {#1} represents the current node (element) and is a unique identifier. For
instance a \type {<p>..</p>} can have an identifier \type {demo::5}. So, we can
get something like:

\starttyping
\xmlflush{demo::5}\endgraf
\stoptyping

but as well:

\starttyping
\xmlflush{demo::6}\endgraf
\stoptyping

Keep in mind that the references to the actual nodes (elements) are
abstractions: you never see those \type {<id>::<number>}'s, because we will use
either the abstract \type {#1} (any node) or an explicit reference like \type
{demo}. When issued, the previous setup will look like:

\starttyping
\startchapter[title=\xmlfirst{demo::3}{/title}]
  \xmlfirst{demo::3}{/content}
\stopchapter
\stoptyping

Here the \type {title} is used to typeset the chapter title but also for an entry
in the table of contents. At the moment the title is typeset the \XML\ node gets
looked up and expanded into real text. However, for the list it gets stored for
later use. One can argue that this is not needed for \XML, because one can just
filter all the titles and use page references, but then one also loses the
control one normally has over such titles. For instance it can be that some
titles are rendered differently and for that we need to keep track of usage.
Doing that with transformations or filtering is often more complex than leaving
that to \TEX. As soon as the list gets typeset, the reference (\type {demo::3})
is used for the lookup. This is because by default the title is stored as given.
So, as long as we make sure the \XML\ source is loaded before the table of
contents is typeset we're ok. Later we will look into this in more detail; for
now it's enough to know that in most cases the abstract \type {#1} reference will
work out ok.

Contrary to the style definitions, this interface looks rather low level (with no
optional arguments) and the main reason for this is that we want processing to be
fast. So, the basic framework is:

\starttyping
\startxmlsetups xml:demo:base
  % associate setups with elements
\stopxmlsetups

\xmlregisterdocumentsetup{demo}{xml:demo:base}

% define setups for matches

\xmlprocessfile{demo}{demo.xml}{}
\stoptyping

In this example we mostly just flush the content of an element and in the case of
a section we flush explicit child elements. The \type {#1} in the example code
represents the current element. The line:

\starttyping
\xmlsetsetup{demo}{*}{-}
\stoptyping

sets the default for each element to \quote {just ignore it}. A \type {+} would
instead make flushing the content the default. This means that at this point we
only handle:

\starttyping
<section>
  <title>Some title</title>
  <content>
    <p>a paragraph of text</p>
  </content>
</section>
\stoptyping
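
As a sketch of how such a default could be combined with the explicit
assignments shown earlier, one can put both in the base setup. We assume here
that a later, more specific assignment overrides the default for the elements it
names, so the order of the two lines matters:

\starttyping
\startxmlsetups xml:demo:base
  % assumption: first ignore everything ...
  \xmlsetsetup{#1}{*}{-}
  % ... then enable only the elements we have setups for
  \xmlsetsetup{#1}{document|section|p}{xml:demo:*}
\stopxmlsetups

\xmlregisterdocumentsetup{demo}{xml:demo:base}
\stoptyping

With this in place, elements without a setup of their own are not typeset unless
they are flushed explicitly, as with \type {\xmlfirst{#1}{/title}}.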

In the next section we will deal with the slightly more complex itemize and
figure placement. At first sight all these setups may look like overkill but keep
in mind that normally the number of elements is rather limited. The complexity is
often in the style and having access to each snippet of content is actually
quite handy for that.

\stopsection

\startsection[title={alternative solutions}]

Dealing with an itemize is rather simple (as long as we forget about
attributes that control the behaviour):

\starttyping
<itemize>
  <item>first</item>
  <item>second</item>
</itemize>
\stoptyping

First we need to add \type {itemize} to the setup assignment (unless we've used
the wildcard \type {*}):

\starttyping
\xmlsetsetup{demo}{document|section|p|itemize}{xml:demo:*}
\stoptyping

The setup can look like:

\starttyping
\startxmlsetups xml:demo:itemize
  \startitemize
    \xmlfilter{#1}{/item/command(xml:demo:itemize:item)}
  \stopitemize
\stopxmlsetups

\startxmlsetups xml:demo:itemize:item
  \startitem
    \xmlflush{#1}
  \stopitem
\stopxmlsetups
\stoptyping

An alternative is to map item directly:

\starttyping
\xmlsetsetup{demo}{document|section|p|itemize|item}{xml:demo:*}
\stoptyping

and use:

\starttyping
\startxmlsetups xml:demo:itemize
  \startitemize
    \xmlflush{#1}
  \stopitemize
\stopxmlsetups

\startxmlsetups xml:demo:item
  \startitem
    \xmlflush{#1}
  \stopitem
\stopxmlsetups
\stoptyping

Sometimes, a more local solution using filters and \type {/command(...)} makes more
sense, especially when the \type {item} tag is used for other purposes as well.
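
As an illustration of that last case, suppose the source also has a (purely
hypothetical) \type {choices} element that contains \type {item} children but
should come out as a numbered list. A sketch with local setups, where the
element name and the setup names are our own assumptions:

\starttyping
\xmlsetsetup{demo}{itemize|choices}{xml:demo:*}

% hypothetical element: <choices><item>..</item></choices>
\startxmlsetups xml:demo:choices
  \startitemize[n]
    \xmlfilter{#1}{/item/command(xml:demo:choices:item)}
  \stopitemize
\stopxmlsetups

\startxmlsetups xml:demo:choices:item
  \startitem
    \xmlflush{#1}
  \stopitem
\stopxmlsetups
\stoptyping

Because \type {item} itself is not mapped globally, each parent decides how its
own items are rendered.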

Explicit flushing with \type {command} is definitely the way to go when you have
complex products. In one of our projects we compose math school books from many
thousands of small \XML\ files, and from one source set several products are
typeset. Within a book sections get done differently, content gets used, ignored
or interpreted differently depending on the kind of content, so there is a
constant checking of attributes that drive the rendering. In that case a generic
setup for a title element makes less sense than explicit ones for each case.
(We're talking of huge numbers of files here, including multiple images on each
rendered page.)

When using \type {command} you can pass two arguments: the first is the setup for
the match, the second one for the miss, as in:

\starttyping
\xmlfilter{#1}{/element/command(xml:true,xml:false)}
\stoptyping
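
A minimal sketch of how the two setups might be defined follows. The setup names
are hypothetical, and since it is not stated here which argument the setup for
the miss receives, that setup deliberately does not use \type {#1}:

\starttyping
\startxmlsetups xml:demo:section:title
  \xmlfilter{#1}{/title/command(xml:demo:title:found,xml:demo:title:missing)}
\stopxmlsetups

% called for each matching <title>, #1 is that title
\startxmlsetups xml:demo:title:found
  \subject{\xmlflush{#1}}
\stopxmlsetups

% called when there is no <title> (assumption: #1 is not relied upon)
\startxmlsetups xml:demo:title:missing
  \subject{Untitled}
\stopxmlsetups
\stoptyping

This keeps the fallback next to the filter that needs it, instead of testing for
the presence of the element separately.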

Back to the example, this leaves us with the resources, like figures:

\starttyping
<resource type='figure'>
  <caption>A picture of a cow.</caption>
  <content><external file="cow.pdf"/></content>
</resource>
\stoptyping

Here we can use a more restricted match:

\starttyping
\xmlsetsetup{demo}{resource[@type='figure']}{xml:demo:figure}
\xmlsetsetup{demo}{external}{xml:demo:*}
\stoptyping

and the definitions:

\starttyping
\startxmlsetups xml:demo:figure
  \placefigure
    {\xmlfirst{#1}{/caption}}
    {\xmlfirst{#1}{/content}}
\stopxmlsetups

\startxmlsetups xml:demo:external
  \externalfigure[\xmlatt{#1}{file}]
\stopxmlsetups
\stoptyping

At this point it is good to note that \type {\xmlatt{#1}{file}} is passed as it
is: a macro call. This means that when a macro like \type {\externalfigure} uses
the first argument frequently without first storing its value, the lookup is done
several times. A solution for this is:

\starttyping
\startxmlsetups xml:demo:external
  \expanded{\externalfigure[\xmlatt{#1}{file}]}
\stopxmlsetups
\stoptyping

The lookup is rather fast and internally \CONTEXT\ already makes sure such
expansion happens only once, so normally there is no need to bother about this
too much.
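
For macros of your own that use an attribute value several times, storing the
value first is another option. The following is only a sketch: the macro name
\type {\DemoFile} is hypothetical, and we assume that \type {\xmlatt} expands
fully inside an \type {\edef}, which the \type {\expanded} variant already relies
on:

\starttyping
\startxmlsetups xml:demo:external
  % hypothetical helper macro: one lookup, value stored as text
  \edef\DemoFile{\xmlatt{#1}{file}}
  \externalfigure[\DemoFile]
\stopxmlsetups
\stoptyping

After the \type {\edef}, every use of \type {\DemoFile} is a plain string, no
matter how often the called macro picks it up.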

An alternative definition for placement is the following:

\starttyping
\xmlsetsetup{demo}{resource}{xml:demo:resource}
\stoptyping

with:

\starttyping
\startxmlsetups xml:demo:resource
  \placefloat
    [\xmlatt{#1}{type}]
    {\xmlfirst{#1}{/caption}}
    {\xmlfirst{#1}{/content}}
\stopxmlsetups
\stoptyping

This way you can specify \type {table} as type too. Because you can define your
own float types, more complex variants are also possible. In that case it makes
sense to provide some default behaviour too:

\starttyping
\definefloat[figure-here][figure][default=here]
\definefloat[figure-left][figure][default=left]
\definefloat[table-here] [table] [default=here]
\definefloat[table-left] [table] [default=left]

\startxmlsetups xml:demo:resource
  \placefloat
    [\xmlattdef{#1}{type}{figure}-\xmlattdef{#1}{location}{here}]
    {\xmlfirst{#1}{/caption}}
    {\xmlfirst{#1}{/content}}
\stopxmlsetups
\stoptyping

In this example we support two types and two locations. We default to a figure
placed (when possible) at the current location.
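
To make the defaults concrete, this is how we expect the call to come out for
two inputs. It is a sketch that assumes \type {\xmlattdef} yields the attribute
value when present and the given default otherwise; the identifiers \type
{demo::7} and \type {demo::8} are made up for the illustration:

\starttyping
% <resource type='table' location='left'> ... </resource>
\placefloat
  [table-left]
  {\xmlfirst{demo::7}{/caption}}
  {\xmlfirst{demo::7}{/content}}

% <resource> ... </resource> (no attributes given)
\placefloat
  [figure-here]
  {\xmlfirst{demo::8}{/caption}}
  {\xmlfirst{demo::8}{/content}}
\stoptyping

Any other combination of values needs a matching \type {\definefloat}, so it
makes sense to keep the set of allowed values small.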

\stopsection

\stopchapter

\stopcomponent
