% language=us runpath=texruns:manuals/cld

\startcomponent cld-backendcode

\environment cld-environment

% derived from hybrid

\startchapter[title={Backend code}]

\startsection [title={Introduction}]

In \CONTEXT\ we've always separated the backend code into so-called driver files.
This means that in the code related to typesetting only calls to the \API\ take
place, and no backend specific code is to be used. Currently a \PDF\ backend is
supported as well as an \XML\ export. \footnote {This chapter is derived from an
article on these matters. You can find more information in \type {hybrid.pdf}.}

Some \CONTEXT\ users like to add their own \PDF\ specific code to their styles or
modules. However, such extensions can interfere with existing code, especially
when resources are involved. Therefore the construction of \PDF\ data structures
and resources is rather controlled and has to be done via the official helper
macros.

\stopsection

\startsection [title={Structure}]

A \PDF\ file is a tree of indirect objects. Each object has a number and the file
contains a table (or multiple tables) that relates these numbers to positions in
the file (or positions in a compressed object stream). That way a file can be viewed
without reading all data: a viewer only loads what is needed.

\starttyping
1 0 obj <<
    /Name (test) /Address 2 0 R
>>
2 0 obj [
   (Main Street) (24) (postal code) (MyPlace)
]
\stoptyping

For the sake of the discussion we consider strings like \type {(test)} also to be
objects. In the table in the next section we list what we can encounter in a \PDF\
file. Objects can be direct, or indirect, in which case a reference like \type
{2 0 R} is used.

It all starts in the document's root object. From there we access the page tree
and resources. Each page carries its own resource information which makes random
access easier. A page has a page stream and there we find the content to be
rendered as a mixture of (\UNICODE) strings and special drawing and rendering
operators. Here we will not discuss them as they are mostly generated by the
engine itself or dedicated subsystems like the \METAPOST\ converter. There we use
literal or \type {\latelua} whatsits to inject code into the current stream.

\stopsection

\startsection [title={Data types}]

There are several data types in \PDF\ and we support all of them one way or the
other.

\starttabulate[|l|l|p|]
\FL
\NC \bf type \NC \bf form \NC \bf meaning \NC \NR
\TL
\NC constant   \NC \type{/...} \NC A symbol (prescribed string). \NC \NR
\NC string     \NC \type{(...)} \NC A sequence of characters in pdfdoc
                   encoding. \NC \NR
\NC unicode    \NC \type{<...>} \NC A sequence of characters in utf16
                   encoding. \NC \NR
\NC number     \NC \type{3.1415} \NC A number constant. \NC \NR
\NC boolean    \NC \type{true/false} \NC A boolean constant. \NC \NR
\NC reference  \NC \type{N 0 R} \NC A reference to an object. \NC \NR
\NC dictionary \NC \type{<< ... >>} \NC A collection of key value pairs
                   where the value itself is an (indirect) object.
                   \NC \NR
\NC array      \NC \type{[ ... ]} \NC A list of objects or references to
                   objects. \NC \NR
\NC stream     \NC \NC A sequence of bytes, optionally packaged with
                   a dictionary that contains descriptive data. \NC \NR
\NC xform      \NC \NC A special kind of object containing a reusable
                   blob of data, for example an image. \NC \NR
\LL
\stoptabulate

While writing additional backend code, we mostly create dictionaries.

\starttyping
<< /Name (test) /Address 2 0 R >>
\stoptyping

In this case the indirect object can look like:

\starttyping
[ (Main Street) (24) (postal code) (MyPlace) ]
\stoptyping

The \LUATEX\ manual mentions primitives like \type {\pdfobj}, \type {\pdfannot},
\type {\pdfcatalog}, etc. However, in \MKIV\ no such primitives are used. You can
still use many of them but those that push data into document or page related
resources are overloaded to do nothing at all.

In the \LUA\ backend code you will find function calls like:

\starttyping
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.array {
        "Main Street", "24", "postal code", "MyPlace",
    }
}
\stoptyping

Equally valid is:

\starttyping
local d = lpdf.dictionary()
d.Name = "test"
\stoptyping

Eventually the object will end up in the file using calls like:

\starttyping
local r = lpdf.immediateobject(tostring(d))
\stoptyping

or using the wrapper (which permits tracing):

\starttyping
local r = lpdf.flushobject(d)
\stoptyping

The object content will be serialized according to the formal specification so
the proper \type {<< >>} etc.\ are added. If you want the content instead you can
use a function call:

\starttyping
local dict = d()
\stoptyping

An example of using references is:

\starttyping
local a = lpdf.array {
    "Main Street", "24", "postal code", "MyPlace",
}
-- flush the array first: a reference needs an object number
local n = lpdf.flushobject(a)
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.reference(n),
}
local r = lpdf.flushobject(d)
\stoptyping

\stopsection

\startsection [title={Creators}]

We have the following creators. Their arguments are optional.

\starttabulate[|l|p|]
\FL
\NC \bf function \NC \bf optional parameter \NC \NR
\TL
\NC \type{lpdf.null}        \NC \NC \NR
\NC \type{lpdf.number}      \NC number \NC \NR
\NC \type{lpdf.constant}    \NC string \NC \NR
\NC \type{lpdf.string}      \NC string \NC \NR
\NC \type{lpdf.unicode}     \NC string \NC \NR
\NC \type{lpdf.boolean}     \NC boolean \NC \NR
\NC \type{lpdf.array}       \NC indexed table of objects \NC \NR
\NC \type{lpdf.dictionary}  \NC hash with key/values \NC \NR
%NC \type{lpdf.stream}      \NC indexed table of operators \NC \NR
\NC \type{lpdf.reference}   \NC string \NC \NR
\NC \type{lpdf.verbose}     \NC indexed table of strings \NC \NR
\LL
\stoptabulate

\ShowLuaExampleString{tostring(lpdf.null())}
\ShowLuaExampleString{tostring(lpdf.number(123))}
\ShowLuaExampleString{tostring(lpdf.constant("whatever"))}
\ShowLuaExampleString{tostring(lpdf.string("just a string"))}
\ShowLuaExampleString{tostring(lpdf.unicode("just a string"))}
\ShowLuaExampleString{tostring(lpdf.boolean(true))}
\ShowLuaExampleString{tostring(lpdf.array { 1, lpdf.constant("c"), true, "str" })}
\ShowLuaExampleString{tostring(lpdf.dictionary { a=1, b=lpdf.constant("c"), d=true, e="str" })}
%ShowLuaExampleString{tostring(lpdf.stream("whatever"))}
\ShowLuaExampleString{tostring(lpdf.reference(123))}
\ShowLuaExampleString{tostring(lpdf.verbose("whatever"))}
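
The creators can be nested freely. The following is a minimal sketch that only
uses the functions listed above. The key names \type {MyKind}, \type {MyTitle}
and \type {MyFlags} are made up for this illustration and have no meaning in
\PDF:

\starttyping
local d = lpdf.dictionary {
    MyKind  = lpdf.constant("Example"),       -- hypothetical key and value
    MyTitle = lpdf.unicode("just a string"),  -- hypothetical key
    MyFlags = lpdf.array {                    -- hypothetical key
        lpdf.number(1),
        lpdf.boolean(true),
        lpdf.null(),
    },
}

-- serialize the nested structure, for instance to the log
print(tostring(d))
\stoptyping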

\stopsection

\startsection[title={Managing objects}]

Flushing objects is done with:

\starttyping
lpdf.flushobject(obj)
\stoptyping

Reserving an object is of course possible and done with:

\starttyping
local r = lpdf.reserveobject()
\stoptyping

Such an object is flushed with:

\starttyping
lpdf.flushobject(r,obj)
\stoptyping

We also support named objects:

\starttyping
lpdf.reserveobject("myobject")

lpdf.flushobject("myobject",obj)
\stoptyping
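
Reserving comes in handy when an object has to be referenced before its content
is known. The following is a minimal sketch, assuming that \type
{lpdf.reserveobject} returns the object number. The keys \type {MyName} and \type
{MyAddress} are made up for this illustration:

\starttyping
-- reserve a number for content that is not yet available
local r = lpdf.reserveobject()

-- refer to the reserved object right away
local d = lpdf.dictionary {
    MyName    = lpdf.string("test"),  -- hypothetical key
    MyAddress = lpdf.reference(r),    -- hypothetical key
}
lpdf.flushobject(d)

-- later on, provide the content of the reserved object
local a = lpdf.array {
    "Main Street", "24", "postal code", "MyPlace",
}
lpdf.flushobject(r,a)
\stoptyping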

A delayed object is created with:

\starttyping
local ref = lpdf.delayedobject(data)
\stoptyping

The data will be flushed later using the object number that is returned (\type
{ref}). When you expect that many objects with the same content are used, you can
use:

\starttyping
local obj = lpdf.shareobject(data)
local ref = lpdf.shareobjectreference(data)
\stoptyping

This one flushes the object and returns the object number. Already defined
objects are reused. In addition to this code driven optimization, some other
optimization and reuse takes place but all that happens without user
intervention. Only use this when it's really needed as it might consume more
memory and needs more processing time.

\stopsection

\startsection [title={Resources}]

While \LUATEX\ itself will embed all resources related to regular typesetting,
\MKIV\ has to take care of embedding those related to special tricks, like
annotations, spot colors, layers, shades, transparencies, metadata, etc. Because
third party modules (like tikz) can also add resources we provide some macros
that make sure that no interference takes place:

\starttyping
\pdfbackendsetcatalog       {key}{string}
\pdfbackendsetinfo          {key}{string}
\pdfbackendsetname          {key}{string}

\pdfbackendsetpageattribute {key}{string}
\pdfbackendsetpagesattribute{key}{string}
\pdfbackendsetpageresource  {key}{string}

\pdfbackendsetextgstate     {key}{pdfdata}
\pdfbackendsetcolorspace    {key}{pdfdata}
\pdfbackendsetpattern       {key}{pdfdata}
\pdfbackendsetshade         {key}{pdfdata}
\stoptyping

One is free to use the \LUA\ interface instead, as there one has more
possibilities but when code is shared with other macro packages the macro
interface makes more sense. The names of the \LUA\ functions are similar, like:

\starttyping
lpdf.addtoinfo(key,anything_valid_pdf)
\stoptyping

Currently we expose a bit more of the backend code than we like and
future versions will have more restricted access. The following
functions will stay public:

\starttyping
lpdf.addtopageresources  (key,value)
lpdf.addtopageattributes (key,value)
lpdf.addtopagesattributes(key,value)

lpdf.adddocumentextgstate(key,value)
lpdf.adddocumentcolorspace(key,value)
lpdf.adddocumentpattern  (key,value)
lpdf.adddocumentshade    (key,value)

lpdf.addtocatalog        (key,value)
lpdf.addtoinfo           (key,value)
lpdf.addtonames          (key,value)
\stoptyping
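
As a minimal sketch, a custom entry could be added to the info dictionary in
either of the following ways. The key \type {MyProject} and its value are
invented for this example, and wrapping the value in \type {lpdf.unicode} is an
assumption about what makes a sensible value here:

\starttyping
% macro interface, hypothetical key and value
\pdfbackendsetinfo{MyProject}{backend demo}

% Lua interface, same hypothetical key and value
\startluacode
    lpdf.addtoinfo("MyProject",lpdf.unicode("backend demo"))
\stopluacode
\stoptyping

Use only one of the two in a real document, as both set the same key.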

\stopsection

\startsection [title={Annotations}]

You can use the \LUA\ functions that relate to annotations etc.\ but normally you
will use the regular \CONTEXT\ user interface. You can look into some of the
\type {lpdf-*} modules to see how special annotations can be dealt with.

\stopsection

\startsection [title={Tracing}]

There are several tracing options built in and some more will be added in due
time:

\starttyping
\enabletrackers
  [backend.finalizers,
   backend.resources,
   backend.objects,
   backend.detail]
\stoptyping

As with all trackers you can also pass them on the command line, for example:

\starttyping
context --trackers=backend.* yourfile
\stoptyping

The reference related backend mechanisms have their own trackers. When you write
code that generates \PDF, it also helps to look in the \PDF\ file to see if
things are done right. In that case you need to disable compression:

\starttyping
\nopdfcompression
\stoptyping
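
Put together, a small test file for inspecting the result could look as follows.
This is only a sketch: the info key is made up, and which trackers are useful
depends on what you are debugging.

\starttyping
\nopdfcompression

\enabletrackers[backend.objects,backend.resources]

\starttext
    \pdfbackendsetinfo{MyProject}{backend demo} % hypothetical key and value
    test
\stoptext
\stoptyping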

\stopsection

\startsection[title={Analyzing}]

The \type {epdf} library that comes with \LUATEX\ offers a userdata interface to
\PDF\ files. On top of that \CONTEXT\ provides a more \LUA-ish access, using
tables. You can open a \PDF\ file with:

\starttyping
local mypdf = lpdf.epdf.load(filename)
\stoptyping

When opening is successful, you have access to a couple of tables:

\starttabulate[|l|l|]
\FL
\NC \bf table            \NC \bf organization \NC \NR
\TL
\NC \type{pages}         \NC indexed \NC \NR
\NC \type{destinations}  \NC hashed  \NC \NR
\NC \type{javascripts}   \NC hashed  \NC \NR
\NC \type{widgets}       \NC hashed  \NC \NR
\NC \type{embeddedfiles} \NC hashed  \NC \NR
\NC \type{layers}        \NC indexed \NC \NR
\LL
\stoptabulate

These provide efficient access to some data that otherwise would take a bit of
code to deal with. Another top-level table is the \type {Catalog}, which is
characteristic of \PDF. Watch the capitalization: as with other native \PDF\ data
structures, keys are case sensitive and match the standard.

Here is an example of usage:

\starttyping
local MyDocument = lpdf.epdf.load("somefile.pdf")

context.starttext()

  local pages    = MyDocument.pages
  local nofpages = pages.n

  context.starttabulate { "|c|c|c|" }

    context.NC() context("page")
    context.NC() context("width")
    context.NC() context("height") context.NR()

    for i=1, nofpages do
      local page = pages[i]
      local bbox = page.CropBox or page.MediaBox
      context.NC() context(i)
      context.NC() context(bbox[3]-bbox[1])
      context.NC() context(bbox[4]-bbox[2]) context.NR()
    end

  context.stoptabulate()

context.stoptext()
\stoptyping
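
The box arithmetic in this example can be isolated in a small helper. The
following sketch assumes that a box is an indexed table holding the lower left
and upper right corner, in that order. The function name \type {dimensions} is
made up for this illustration:

\starttyping
-- hypothetical helper: returns the width and height of a page
local function dimensions(page)
    local bbox   = page.CropBox or page.MediaBox
    local width  = bbox[3] - bbox[1] -- urx - llx
    local height = bbox[4] - bbox[2] -- ury - lly
    return width, height
end

local MyDocument = lpdf.epdf.load("somefile.pdf")
local firstpage  = MyDocument.pages[1]

print(dimensions(firstpage))
\stoptyping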

\stopsection

\stopchapter

\stopcomponent