% language=us runpath=texruns:manuals/cld

\startcomponent cld-backendcode

\environment cld-environment

% derived from hybrid

\startchapter[title={Backend code}]

\startsection [title={Introduction}]

In \CONTEXT\ we've always separated the backend code in so-called driver files.
This means that in the code related to typesetting only calls to the \API\ take
place, and no backend specific code is to be used. Currently a \PDF\ backend is
supported as well as an \XML\ export. \footnote {This chapter is derived from an
article on these matters. You can find more information in \type {hybrid.pdf}.}

Some \CONTEXT\ users like to add their own \PDF\ specific code to their styles or
modules. However, such extensions can interfere with existing code, especially
when resources are involved. Therefore the construction of \PDF\ data structures
and resources is rather controlled and has to be done via the official helper
macros.

\stopsection

\startsection [title={Structure}]

A \PDF\ file is a tree of indirect objects. Each object has a number and the file
contains a table (or multiple tables) that relates these numbers to positions in
the file (or to positions in a compressed object stream). That way a file can be
viewed without reading all data: a viewer only loads what is needed.

\starttyping
1 0 obj <<
    /Name (test) /Address 2 0 R
>>
2 0 obj [
    (Main Street) (24) (postal code) (MyPlace)
]
\stoptyping

For the sake of the discussion we consider strings like \type {(test)} also to be
objects. In the next table we list what we can encounter in a \PDF\ file. Objects
can be indirect, in which case a reference is used (\type{2 0 R}), or direct.

It all starts in the document's root object. From there we access the page tree
and resources. Each page carries its own resource information which makes random
access easier. A page has a page stream and there we find the content to be
rendered as a mixture of (\UNICODE) strings and special drawing and rendering
operators.
Here we will not discuss them as they are mostly generated by the engine
itself or dedicated subsystems like the \METAPOST\ converter. There we use
literal or \type {\latelua} whatsits to inject code into the current stream.

\stopsection

\startsection [title={Data types}]

There are several datatypes in \PDF\ and we support all of them one way or the
other.

\starttabulate[|l|l|p|]
\FL
\NC \bf type   \NC \bf form          \NC \bf meaning \NC \NR
\TL
\NC constant   \NC \type{/...}       \NC A symbol (prescribed string). \NC \NR
\NC string     \NC \type{(...)}      \NC A sequence of characters in pdfdoc encoding. \NC \NR
\NC unicode    \NC \type{<...>}      \NC A sequence of characters in utf16 encoding. \NC \NR
\NC number     \NC \type{3.1415}     \NC A number constant. \NC \NR
\NC boolean    \NC \type{true/false} \NC A boolean constant. \NC \NR
\NC reference  \NC \type{N 0 R}      \NC A reference to an object. \NC \NR
\NC dictionary \NC \type{<< ... >>}  \NC A collection of key value pairs where the
                                         value itself is an (indirect) object. \NC \NR
\NC array      \NC \type{[ ... ]}    \NC A list of objects or references to objects. \NC \NR
\NC stream     \NC                   \NC A sequence of bytes, optionally packaged with a
                                         dictionary that contains descriptive data. \NC \NR
\NC xform      \NC                   \NC A special kind of object containing a reusable
                                         blob of data, for example an image. \NC \NR
\LL
\stoptabulate

While writing additional backend code, we mostly create dictionaries.

\starttyping
<< /Name (test) /Address 2 0 R >>
\stoptyping

In this case the indirect object can look like:

\starttyping
[ (Main Street) (24) (postal code) (MyPlace) ]
\stoptyping

The \LUATEX\ manual mentions primitives like \type {\pdfobj}, \type {\pdfannot},
\type {\pdfcatalog}, etc. However, in \MKIV\ no such primitives are used. You can
still use many of them but those that push data into document or page related
resources are overloaded to do nothing at all.

In the \LUA\ backend code you will find function calls like:

\starttyping
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.array {
        "Main Street", "24", "postal code", "MyPlace",
    }
}
\stoptyping

Equally valid is:

\starttyping
local d = lpdf.dictionary()
d.Name = "test"
\stoptyping

Eventually the object will end up in the file using calls like:

\starttyping
local r = lpdf.immediateobject(tostring(d))
\stoptyping

or using the wrapper (which permits tracing):

\starttyping
local r = lpdf.flushobject(d)
\stoptyping

The object content will be serialized according to the formal specification so
the proper \type {<< >>} etc.\ are added. If you want the content instead you can
use a function call:

\starttyping
local dict = d()
\stoptyping

An example of using references is:

\starttyping
local a = lpdf.array {
    "Main Street", "24", "postal code", "MyPlace",
}
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.reference(a),
}
local r = lpdf.flushobject(d)
\stoptyping

\stopsection

We have the following creators. Their arguments are optional.

\starttabulate[|l|p|]
\FL
\NC \bf function           \NC \bf optional parameter \NC \NR
\TL
\NC \type{lpdf.null}       \NC                            \NC \NR
\NC \type{lpdf.number}     \NC number                     \NC \NR
\NC \type{lpdf.constant}   \NC string                     \NC \NR
\NC \type{lpdf.string}     \NC string                     \NC \NR
\NC \type{lpdf.unicode}    \NC string                     \NC \NR
\NC \type{lpdf.boolean}    \NC boolean                    \NC \NR
\NC \type{lpdf.array}      \NC indexed table of objects   \NC \NR
\NC \type{lpdf.dictionary} \NC hash with key/values       \NC \NR
%NC \type{lpdf.stream}     \NC indexed table of operators \NC \NR
\NC \type{lpdf.reference}  \NC string                     \NC \NR
\NC \type{lpdf.verbose}    \NC indexed table of strings   \NC \NR
\LL
\stoptabulate

\ShowLuaExampleString{tostring(lpdf.null())}
\ShowLuaExampleString{tostring(lpdf.number(123))}
\ShowLuaExampleString{tostring(lpdf.constant("whatever"))}
\ShowLuaExampleString{tostring(lpdf.string("just a string"))}
\ShowLuaExampleString{tostring(lpdf.unicode("just a string"))}
\ShowLuaExampleString{tostring(lpdf.boolean(true))}
\ShowLuaExampleString{tostring(lpdf.array { 1, lpdf.constant("c"), true, "str" })}
\ShowLuaExampleString{tostring(lpdf.dictionary { a=1, b=lpdf.constant("c"), d=true, e="str" })}
%ShowLuaExampleString{tostring(lpdf.stream("whatever"))}
\ShowLuaExampleString{tostring(lpdf.reference(123))}
\ShowLuaExampleString{tostring(lpdf.verbose("whatever"))}

\stopsection

\startsection[title={Managing objects}]

Flushing objects is done with:

\starttyping
lpdf.flushobject(obj)
\stoptyping

Reserving an object is of course possible and is done with:

\starttyping
local r = lpdf.reserveobject()
\stoptyping

Such an object is flushed with:

\starttyping
lpdf.flushobject(r,obj)
\stoptyping

We also support named objects:

\starttyping
lpdf.reserveobject("myobject")
lpdf.flushobject("myobject",obj)
\stoptyping

A delayed object is created with:

\starttyping
local ref = pdf.delayedobject(data)
\stoptyping

The data will be flushed later using the object number that is returned
(\type {ref}).
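
The calls above can be combined when an object has to be referred to before its
content is known. The following is only a sketch: it uses the functions already
shown in this chapter and assumes that the number returned by \type
{lpdf.reserveobject} can be passed to \type {lpdf.reference}, in the same way
as the literal number in the \type {lpdf.reference(123)} example.

\starttyping
-- reserve an object number first (no content yet)
local r = lpdf.reserveobject()

-- refer to the reserved object from a dictionary and flush that one
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.reference(r), -- assumption: r is accepted here
}

lpdf.flushobject(d)

-- later, once the data is available, flush it under the reserved number
local a = lpdf.array {
    "Main Street", "24", "postal code", "MyPlace",
}

lpdf.flushobject(r,a)
\stoptyping

This way the order in which objects are constructed in \LUA\ need not match the
order in which they refer to each other in the file.
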

When you expect that many objects with the same content will be used, you can use:

\starttyping
local obj = lpdf.shareobject(data)
local ref = lpdf.shareobjectreference(data)
\stoptyping

This one flushes the object and returns the object number. Already defined
objects are reused. In addition to this code driven optimization, some other
optimization and reuse takes place but all that happens without user
intervention. Only use this when it's really needed as it might consume more
memory and needs more processing time.

\startsection [title={Resources}]

While \LUATEX\ itself will embed all resources related to regular typesetting,
\MKIV\ has to take care of embedding those related to special tricks, like
annotations, spot colors, layers, shades, transparencies, metadata, etc. Because
third party modules (like tikz) can also add resources, we provide some macros
that make sure that no interference takes place:

\starttyping
\pdfbackendsetcatalog       {key}{string}
\pdfbackendsetinfo          {key}{string}
\pdfbackendsetname          {key}{string}

\pdfbackendsetpageattribute {key}{string}
\pdfbackendsetpagesattribute{key}{string}
\pdfbackendsetpageresource  {key}{string}

\pdfbackendsetextgstate     {key}{pdfdata}
\pdfbackendsetcolorspace    {key}{pdfdata}
\pdfbackendsetpattern       {key}{pdfdata}
\pdfbackendsetshade         {key}{pdfdata}
\stoptyping

One is free to use the \LUA\ interface instead, as it offers more possibilities,
but when code is shared with other macro packages the macro interface makes more
sense. The names of the \LUA\ functions are similar, like:

\starttyping
lpdf.addtoinfo(key,anything_valid_pdf)
\stoptyping

Currently we expose a bit more of the backend code than we like and future
versions will have more restricted access.
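
As a sketch of the \LUA\ route mentioned above, the next call adds a custom
entry to the document information dictionary. The key \type {MyStatus} and its
value are made up for this illustration, and we assume that an object created
with \type {lpdf.string} qualifies as valid \PDF\ data for this purpose.

\starttyping
-- hypothetical key and value, wrapped so that it is serialized as a string
lpdf.addtoinfo("MyStatus", lpdf.string("draft"))
\stoptyping
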

The following functions will stay public:

\starttyping
lpdf.addtopageresources  (key,value)
lpdf.addtopageattributes (key,value)
lpdf.addtopagesattributes(key,value)

lpdf.adddocumentextgstate (key,value)
lpdf.adddocumentcolorspace(key,value)
lpdf.adddocumentpattern   (key,value)
lpdf.adddocumentshade     (key,value)

lpdf.addtocatalog(key,value)
lpdf.addtoinfo   (key,value)
lpdf.addtonames  (key,value)
\stoptyping

\stopsection

\startsection [title={Annotations}]

You can use the \LUA\ functions that relate to annotations etc.\ but normally you
will use the regular \CONTEXT\ user interface. You can look into some of the
\type {lpdf-*} modules to see how special annotations can be dealt with.

\stopsection

\startsection [title={Tracing}]

There are several tracing options built in and some more will be added in due
time:

\starttyping
\enabletrackers
  [backend.finalizers,
   backend.resources,
   backend.objects,
   backend.detail]
\stoptyping

As with all trackers you can also pass them on the command line, for example:

\starttyping
context --trackers=backend.* yourfile
\stoptyping

The reference related backend mechanisms have their own trackers. When you write
code that generates \PDF, it also helps to look in the \PDF\ file to see if
things are done right. In that case you need to disable compression:

\starttyping
\nopdfcompression
\stoptyping

\stopsection

\startsection[title={Analyzing}]

The \type {epdf} library that comes with \LUATEX\ offers a userdata interface to
\PDF\ files. On top of that \CONTEXT\ provides a more \LUA-ish access, using
tables.

You can open a \PDF\ file with:

\starttyping
local mypdf = lpdf.epdf.load(filename)
\stoptyping

When opening is successful, you have access to a couple of tables:

\starttabulate[|l|l|]
\NC \type{pages}         \NC indexed \NC \NR
\NC \type{destinations}  \NC hashed  \NC \NR
\NC \type{javascripts}   \NC hashed  \NC \NR
\NC \type{widgets}       \NC hashed  \NC \NR
\NC \type{embeddedfiles} \NC hashed  \NC \NR
\NC \type{layers}        \NC indexed \NC \NR
\stoptabulate

These provide efficient access to some data that otherwise would take a bit of
code to deal with. Another top level table is \type {Catalog}, which is
characteristic of \PDF. Watch the capitalization: as with other native \PDF\ data
structures, keys are case sensitive and match the standard.

Here is an example of usage:

\starttyping
local MyDocument = lpdf.epdf.load("somefile.pdf")

context.starttext()

    local pages    = MyDocument.pages
    local nofpages = pages.n

    context.starttabulate { "|c|c|c|" }

    context.NC() context("page")
    context.NC() context("width")
    context.NC() context("height")
    context.NR()

    for i=1, nofpages do
        local page = pages[i]
        -- a box is { llx, lly, urx, ury }
        local bbox = page.CropBox or page.MediaBox
        context.NC() context(i)
        context.NC() context(bbox[3]-bbox[1]) -- width
        context.NC() context(bbox[4]-bbox[2]) -- height
        context.NR()
    end

    context.stoptabulate()

context.stoptext()
\stoptyping

\stopsection

\stopchapter

\stopcomponent