% language=us

\startcomponent onandon-editing

\environment onandon-environment

\startchapter[title=Editing]

\startsection[title=Introduction]

% This introduction is similar to the workflows chapter.

Some users like the synctex feature that is built into the \TEX\ engines.
Personally I never use it because it doesn't work well with the kind of documents
I maintain. If you have one document source, and don't shuffle around (reuse)
text too much, it probably works out okay, but that is not our practice. Here I
will describe how you can enable a more \CONTEXT\ specific synctex support so
that aware \PDF\ viewers can bring you back to the source.

\stopsection

\startsection[title={The premise}]

Most of the time we provide our customers with an authoring workflow consisting
of:

\startitemize[packed]
    \startitem the typesetting engine \CONTEXT \stopitem
    \startitem the styles to generate the desired \PDF\ files \stopitem
    \startitem the text editor \SCITE \stopitem
    \startitem the \SUMATRAPDF\ viewer \stopitem
\stopitemize

For the \MATHML\ we advise the \MATHTYPE\ editor and we provide them with a
customized \MATHML\ translator for the copy & paste actions. When \ASCIIMATH\ is
used to code math no special tools are needed.

Who operates this workflow? Sometimes it's an author, but most of the time
they are editors with a background in copy|-|editing. We call them \XML\ editors,
because they maintain the large (sets of) \XML\ documents and edit
directly in the \XML\ sources.

Maybe you'll ask yourself \quotation {Can they do that? Can they edit directly in
the \XML\ resource?} The answer is yes, because after they have hit the
processing key they are rewarded with a publishable \PDF\ document in a demanding
layout.

The \XML\ sources have a dual purpose. They form the basis for:

\startitemize[packed]
    \startitem
        all folio products that are generated in the \XML\ to \PDF\ workflow(s)
    \stopitem
    \startitem
        the digital web product(s)
    \stopitem
\stopitemize

The \XML\ editors do their proofing chapter|-|wise. Sometimes a chapter is one
big \XML\ file (10,000 lines is no exception when the chapter contains hundreds
of bloated \MATHML\ snippets). In other projects they have to deal with chapters
that are made up of hundreds (100 up to 500) of smaller \XML\ files.

\stopsection

\startsection[title={The problem}]

Let's keep it simple: there's a typo. Here's what an \XML\ editor will do:

\startitemize[packed]
    \startitem
        start \SCITE
    \stopitem
    \startitem
        open a file
    \stopitem
    \startitem
        correct the typo
    \stopitem
    \startitem
        generate the \PDF
    \stopitem
    \startitem
        proof the \PDF\ and see if the alteration has undesired side
        effects like text reflow or image floating
    \stopitem
\stopitemize

So far so good. When the editor is dealing with one big \XML\ file there's no
problem. Hopefully the filename will indicate the specific chapter. He or she
opens the file, searches for the typo and corrects it. But what
if there are hundreds of small \XML\ files? How does the editor know in which
file the typo can be found?

First, let's give a few statistics based on two projects that are in a revision
stage.

\starttabulate[|c|c|c|c|]
\HL
\NC
    project \NC
    chapters \NC
    \# of files \NC
    average \# of lines per file \NC \NR
\HL
\NC
    A \NC
    16 \NC
    16 \NC
    11000 \NC \NR
\NC
    B \NC
    132 \NC
    16000\footnote{132 chapters, each consisting of $\pm 120$ files.} \NC
    100 \NC \NR
\HL
\stoptabulate

The \XML\ resource passes through three stages: a raw, a semifinal and a final
version. The raw \XML\ version originates from a web authoring tool that is used
by the author. Then the \PDF\ is proofread and the \XML\ editor goes to work.

\starttabulate[|l|c|c|]
\HL
\NC
    workflow \NC
    \# edit locations and adaptations \NC
    \# runs\footnote {Maybe you can now see why we put quite some effort into
       keeping \CONTEXT\ working at a comfortable speed.} \NC \NR
\HL
\NC
    raw to semifinal \NC
    75 \NC
    105 \NC \NR
\NC
    semifinal to final \NC
    35 \NC
    55 \NC \NR
\HL
\stoptabulate
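
From the numbers in this table the average number of runs per edit location
follows directly. This is only a rough indication, as the spread per correction
is not recorded here:

\startformula
\frac{105}{75} = 1.4 \quad \hbox{(raw to semifinal)} \qquad
\frac{55}{35} \approx 1.6 \quad \hbox{(semifinal to final)}
\stopformula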

Keep in mind that altering text may cause text to flow and images to float in a
way that the \XML\ editor has to fine|-|tune, which can take multiple runs for
one correction.

Just to give an idea of the work involved: a typical semifinal needs some 50
runs, where each run takes 20 seconds (assuming 3 runs to get all cross
referencing right). The number of explicit page breaks is about 5, and (related
to formulas) explicit line breaks around 8. It takes some 2 hours to get
everything right, which includes checking in detail, fixing some things and, if
needed, moving content around a bit.
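
Taking these numbers at face value, a back|-|of|-|the|-|envelope estimate of the
time spent on processing alone is:

\startformula
50 \times 20\,\mathrm{s} = 1000\,\mathrm{s} \approx 17\,\mathrm{min}
\stopformula

So, under these assumptions, most of the roughly 2 hours goes into checking and
deciding what to change, not into waiting for a run to finish.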

Now we broaden the earlier question into: how can we make the work of an \XML\
editor as easy and efficient as possible?

\stopsection

\startsection[title={Enhancing efficiency}]

Since it is easier to proof content for folio and web via \PDF\ documents we
generate proof \PDF\ files in which the complete content is shown. The proof can
be a massive document. A normal 40 page chapter can explode to 140 pages
visualizing all the content that is coded in the \XML\ file(s).

The content in the proof is shown in an effective way and a functional order.
Let's give a few examples of how we enhance the \XML\ editor's effectiveness:

\startitemize[packed]

\startitem
    By default the proof \PDF\ file is interactive, which helps in testing the
    tables of contents and the register.
\stopitem
\startitem
    The web hyperlinks are active so their destination can be tested.
\stopitem
\startitem
    The questions and their answers are displayed in each other's proximity. This
    sounds logical but in folio they are two separate products (theory and
    answer books).
\stopitem
\startitem
    Medium specific content (web or folio) is typographically highlighted, for
    example by colored backgrounds.
\stopitem
\startitem
    When spelling mode is on the \XML\ editor can easily pick out the colored
    misspelled words.
\stopitem
\startitem
    Images can be active areas although this is of no interest to \XML\ editors.
    Clicking the image results in opening the image file in its corresponding
    application for maintenance.
\stopitem
\startitem
    For practical reasons the filenames and paths of the \XML\ files are
    displayed. The filenames are active links and clicking them results in
    opening the destination \XML\ file in \SCITE.
\stopitem
\stopitemize

Okay. The last option is a nice feature. However, the destination file is opened
at the top of the file and you still have to find the typo or whatever
incorrect issue you are looking for.

So a further enhancement in efficiency would be to jump to the typo's
corresponding line in the \XML\ source. This is where \SYNCTEX\ comes into view.
This feature, present in the \TEX\ engines, provides a way to go from \PDF\ to
source by using a secondary file with positions. Unfortunately that mechanism is
hardly usable for \CONTEXT\ because it assumes a page and file handling model
different from what we use. However, as \CONTEXT\ uses \LUATEX, it can also
provide its own alternative.

\stopsection

% The rest is similar to the workflows chapter.

\startsection[title=What we want]

The \SYNCTEX\ method roughly works as follows. Internally \TEX\ constructs linked
lists of glyphs, kerns, glue, boxes, rules etc. These elements are called nodes.
Some nodes carry information about the file and line where they were created. In
the backend this information gets somehow translated into a (sort of) verbose tree
that describes the makeup in terms of boxes, glue and kerns. From that
information the \SYNCTEX\ parser library, hooked into a \PDF\ viewer, can go back
from a position on the screen to a line in a file. One would expect this to be a
relatively simple rectangle based model, but as far as I can see it's way more
complex than that. There are some comments that \CONTEXT\ is not supported well
because it has a layered page model, which indicates that there are some
assumptions about how macro packages are supposed to work. Also, the heuristics
used involve not only some specific spot (location) but also the
corners and edges. It is therefore not so much a (simple) generic system but a
mechanism geared for a macro package like \LATEX.

Because we have a couple of users who need to edit complex sets of documents,
coded in \TEX\ or \XML, I decided to come up with a variant that doesn't use the
\SYNCTEX\ machinery but manipulates the few \SYNCTEX\ fields directly\footnote {This
is something that in my opinion should have been possible right from the start
but it's too late now to change the system and it would not be used beyond
\CONTEXT\ anyway.} and eventually outputs a straightforward file for the editor.
Of course we need to follow some rules so that the editor can deal with it. It
took a bit of trial and error to get the right information in the support file
needed by the viewer but we got there.

The prerequisites of a decent \CONTEXT\ \quotation {click on preview and go to
editor} are the following:

\startitemize

\startitem
    It only makes sense to click on text in the text flow. Headers and footers
    are often generated from structure, and special typographic elements can
    originate in macros hooked into commands instead of in the source.
\stopitem

\startitem
    Users should not be able to reach environments (styles) and other files
    loaded from the (normally read|-|only) \TEX\ tree, like modules. We don't
    want accidental changes in such files.
\stopitem

\startitem
    We not only have \TEX\ files but also \XML\ files and these can normally
    flush in rather arbitrary ways. Although the concept of lines is sort of
    lost in such a file, there is still a relation between lines and the snippets
    that make up the content of an \XML\ node.
\stopitem

\startitem
    In the case of \XML\ files the overhead related to preserving line
    numbers should be minimal and have no impact on loading and memory when
    these features are not used.
\stopitem

\startitem
    The overhead in terms of auxiliary file size and complexity, as well
    as in producing that file, should be minimal. It should be easy to turn
    these features on and off. (I'd never turn them on by default.)
\stopitem

\stopitemize

It is unavoidable that we get more run time but I assume that for the average user
that is no big deal. It pays off when you have a workflow where a book (or even a
chapter in a book) is generated from hundreds of small \XML\ files. There is no
overhead when \SYNCTEX\ is not used.

In \CONTEXT\ we don't use the built|-|in \SYNCTEX\ features, that is: we let
filename and line numbers be set but often these are overloaded explicitly. The
output file is not compressed and is constructed by \CONTEXT\ itself. There is no
benefit in compression and the files are probably smaller than default \SYNCTEX\
anyway.

\stopsection

\startsection[title=Commands]

Although you can enable this mechanism with directives it makes sense to do it
using the following command.

\starttyping
\setupsynctex[state=start]
\stoptyping

The advantage of using an explicit command instead of some command line option is
that in an editor it's easier to disable this trickery. Commenting out that line
will speed up processing when needed. This command can also be given in an
environment (style). On the command line you can say

\starttyping
context --synctex somefile.tex
\stoptyping

A third method is to put this at the top of your file:

\starttyping
% synctex=yes
\stoptyping
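
A minimal sketch of the environment variant, assuming a hypothetical style file
\type {proof-style.tex} and a hypothetical mode named \type {proof}, so that the
feature stays off unless you ask for it:

\starttyping
% proof-style.tex (hypothetical file name)
\startenvironment proof-style

% only active when the (hypothetical) mode "proof" is enabled
\startmode[proof]
    \setupsynctex[state=start]
\stopmode

\stopenvironment
\stoptyping

The mode can then be enabled for a single run with:

\starttyping
context --mode=proof somefile.tex
\stoptyping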

Often an \XML\ file is highly structured and although the main body of
text is probably flushed as a stream, specific elements can be flushed out of
order. In educational documents, for instance, answers to exercises can be
flushed out of order. In that case we still need to make sure that we go to the
right spot in the file. It will never be 100\% perfect but it's better than
nothing. The above command will also enable \XML\ support.
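
To illustrate what out of order flushing looks like, here is a sketch that
assumes a hypothetical file \type {demo.xml} with a \type {chapter} root that
contains \type {exercise} elements, each with a \type {question} and an \type
{answer}. The setup names are made up for this example; it is not taken from one
of the actual projects.

\starttyping
\setupsynctex[state=start]

\startxmlsetups xml:demo:base
    \xmlsetsetup{#1}{chapter|question|answer}{xml:demo:*}
\stopxmlsetups

\xmlregistersetup{xml:demo:base}

% the main stream: everything in source order, answers are skipped
\startxmlsetups xml:demo:chapter
    \xmlflush{#1}
    \page
    % the answers are flushed afterwards, so out of source order
    \xmlfilter{#1}{/exercise/answer/command(xml:demo:answer:later)}
\stopxmlsetups

\startxmlsetups xml:demo:question
    \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:demo:answer
    % nothing here: skipped in the main stream
\stopxmlsetups

\startxmlsetups xml:demo:answer:later
    \xmlflush{#1}\par
\stopxmlsetups

\starttext
    \xmlprocessfile{demo}{demo.xml}{}
\stoptext
\stoptyping

In such a setup the answers end up on a later page than the questions they
belong to, while a click on an answer should still lead to its own line in the
source.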

If you don't want a file to be accessed, you can block it:

\starttyping
\blocksynctexfile[foo.tex]
\stoptyping
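
For instance, to keep a shared style and a shared macro file in a project
directory out of reach (both file names are hypothetical):

\starttyping
% hypothetical file names
\blocksynctexfile[project-style.tex]
\blocksynctexfile[project-macros.tex]
\stoptyping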

Of course you need to configure the viewer to respond to the request for
editing. In \SUMATRAPDF\ combined with \SCITE\ the magic command is:

\starttyping
c:\data\system\scite\wscite\scite.exe "%f" "-goto:%l"
\stoptyping
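
As a sketch, and assuming a version of \SUMATRAPDF\ that has these keys in its
advanced settings file, the same command could be stored as follows. The path is
the one used above; check the documentation of your viewer version for the exact
syntax.

\starttyping
EnableTeXEnhancements = true
InverseSearchCmdLine = c:\data\system\scite\wscite\scite.exe "%f" "-goto:%l"
\stoptyping

The two placeholders stand for the file name and the line number that the viewer
passes to the editor.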

Such a command is independent of the macro package so you can just consult the
manual or help info that comes with a viewer, given that it supports this linking
back to the source at all.

If you enable tracing (see next section) you can see what has become clickable.
Instead of words you can also work with ranges, which not only gives less runtime
but also much smaller \type {.synctex} files. Use

\starttyping
\setupsynctex[state=start,method=min]
\stoptyping

to get words clickable and

\starttyping
\setupsynctex[state=start,method=max]
\stoptyping

if you want somewhat more efficient ranges. The overhead for \type {min} is about
10 percent while \type {max} slows down around 5 percent.

\stopsection

\startsection[title=Tracing]

In case you want to see what gets synced you can enable a tracker:

\starttyping
\enabletrackers[system.synctex.visualize]
\enabletrackers[system.synctex.visualize=real]
\stoptyping

The following tracker outputs some status information about \XML\ flushing. Such
trackers only make sense for developers.

\starttyping
\enabletrackers[system.synctex.xml]
\stoptyping
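
A minimal test file for trying out the visualizing tracker could look as follows.
The text is typed in the file itself because files from the \TEX\ tree are not
meant to be reachable:

\starttyping
\setupsynctex[state=start,method=max]

\enabletrackers[system.synctex.visualize]

\starttext
    Some text in the main file that should become clickable.
\stoptext
\stoptyping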

\stopsection

\startsection[title=Warning]

Don't turn on this feature when you don't need it. This is one of those mechanisms
that hits performance badly.

Depending on needs the functionality can be improved and|/|or extended. Of course
you can always use the traditional \SYNCTEX\ method but don't expect it to behave
as described here.

\stopsection

\stopchapter

\stopcomponent