%D \module %D [ file=lang-mis, %D version=1997.03.20, % used to be supp-lan.tex %D title=\CONTEXT\ Language Macros, %D subtitle=Compounds, %D author=Hans Hagen, %D date=\currentdate, %D copyright={PRAGMA ADE \& \CONTEXT\ Development Team}] %C %C This module is part of the \CONTEXT\ macro||package and is %C therefore copyrighted by \PRAGMA. See mreadme.pdf for %C details. %D This one will be updated stepwise to \LMTX. See lang-mis.mkiv for previous %D implementations and removed code. \writestatus{loading}{ConTeXt Language Macros / Compounds} %D More or less replaced. %D \gdef\starttest#1\stoptest{\starttabulate[|l|l|p|]#1\stoptabulate} %D \gdef\test #1{\NC\detokenize{#1}\NC\hyphenatedword{#1}\NC#1\NC\NR} \unprotect %D One of \TEX's strong points in building paragraphs is the way hyphenations are %D handled. Although for real good hyphenation of non||english languages some %D extensions to the program are needed, fairly good results can be reached with the %D standard mechanisms and an additional macro, at least in Dutch. %D %D \CONTEXT\ originates in the wish to typeset educational materials, especially in %D a technical environment. In production oriented environments, a lot of compound %D words are used. Because the Dutch language poses no limits on combining words, we %D often favor putting dashes between those words, because it facilitates reading, %D at least for those who are not that accustomed to it. %D %D In \TEX\ compound words, separated by a hyphen, are not hyphenated at all. In %D spite of the multiple pass paragraph typesetting this can lead to parts of words %D sticking into the margin. The solution lays in saying \type %D {spoelwater||terugwinunit} instead of \type {spoelwater-terugwinunit}. By using a %D one character command like \type {|}, delimited by the same character \type {|}, %D we get ourselves both a decent visualization (in \TEXEDIT\ and colored verbatim %D we color these commands yellow) and an efficient way of combining words. %D %D The sequence \type{||} simply leads to two words connected by a hyphen. Because %D we want to distinguish such a hyphen from the one inserted when \TEX\ hyphenates %D a word, we use a bit longer one. %D %D \hyphenation {spoel-wa-ter te-rug-win-unit} %D %D \starttest %D \test {spoelwater||terugwinunit} %D \stoptest %D %D As we already said, the \type{|} is a command. This commands accepts an optional %D argument before it's delimiter, which is also a \type{|}. %D %D \hyphenation {po-ly-meer che-mie} %D %D \starttest %D \test {polymeer|*|chemie} %D \stoptest %D %D Arguments like \type{*} are not interpreted and inserted directly, in contrary to %D arguments like: %D %D \starttest %D \test {polymeer|~|chemie} %D \test {|(|polymeer|)|chemie} %D \test {polymeer|(|chemie|)| } %D \stoptest %D %D Although such situations seldom occur |<|we typeset thousands of pages before we %D encountered one that forced us to enhance this mechanism|>| we also have to take %D care of comma's. %D %D \hyphenation {uit-stel-len} %D %D \starttest %D \test {op||, in|| en uitstellen} %D \stoptest %D %D The next special case (concerning quotes) was brought to my attention by Piet %D Tutelaers, one of the driving forces behind rebuilding hyphenation patterns for %D the dutch language.\footnote{In 1996 the spelling of the dutch language has been %D slightly reformed which made this topic actual again.} We'll also take care of %D this case. %D %D \starttest %D \test {AOW|'|er} %D \test {cd|'|tje} %D \test {ex|-|PTT|'|er} %D \test {rock|-|'n|-|roller} %D \stoptest %D %D Tobias Burnus pointed out that I should also support something like %D %D \starttest %D \test {well|_|known} %D \stoptest %D %D to stress the compoundness of hyphenated words. %D %D Of course we also have to take care of the special case: %D %D \starttest %D \test {text||color and ||font} %D \stoptest %D \macros %D {installdiscretionaries} %D %D The mechanism described here is one of the older inner parts of \CONTEXT. The %D most recent extensions concerns some special cases as well as the possibility to %D install other characters as delimiters. The prefered way of specifying compound %D words is using \type{||}, which is installed by: %D %D \starttyping %D \installdiscretionary | - %D \stoptyping %D %D We used to have an installable mechanism but in the perspective of \MKIV\ and %D especialy \LMTX\ it no longer makes sense to complicate the code, so from now on %D we only deal with the active bar. Older code can be seen in the archives. It also %D means that we now just hardcode the bar. We also deal with math differently. %D \macros %D {compoundhyphen} %D %D Now let's go to the macros. First we define some variables. In the main \CONTEXT\ %D modules these can be tuned by a setup command. Watch the (maybe) better looking %D compound hyphen. % hm why ex \ifdefined\compoundhyphen \else \permanent\protected\def\compoundhyphen{\hbox{-\kern-.10775\emwidth-}} % .25\exheight \fi %D The last two variables are needed for subsentences |<|like this one|>| which we %D did not yet mention. We want to enable breaking but at the same time don't want %D compound characters like |-| or || to be separated from the words. \TEX\ hackers %D will recognise the next two macro's: \ifdefined\prewordbreak \else \permanent\protected\def\prewordbreak {\penalty\plustenthousand\hskip\zeroskip\relax} \fi \ifdefined\postwordbreak\else \permanent\protected\def\postwordbreak {\penalty\zerocount \hskip\zeroskip\relax} \fi \ifdefined\hspaceamount \else \def\hspaceamount#1#2{.16667\emwidth} \fi % will be overloaded \permanent\protected\def\permithyphenation{\ifhmode\wordboundary\fi} % doesn't remove spaces %D \macros %D {beginofsubsentence,endofsubsentence, %D beginofsubsentencespacing,endofsubsentencespacing} %D %D In the previous macros we provided two hooks which can be used to support nested %D sub||sentences. In \CONTEXT\ these hooks are used to insert a small space when %D needed. %D %D The following piece of code is a torture test compound handling. The \type %D {\relax} before the \type {\ifmmode} is needed because of the alignment scanner %D (in \ETEX\ this problem is not present because there a protected macro is not %D expanded. Thanks to Tobias Burnus for providing this example. %D %D \startformula %D \left|f(x_n)-{1\over2}\right| = %D {\cases{|{1\over2}-x_n| &for $0\le x_n < {1\over2}$\cr %D |x_n-{1\over2}| &for ${1\over2}2xxx \def\lang_discretionaries_hyphen_like#1#2% {\ifconditional\spaceafterdiscretionary \wordboundary\hbox{#1}\relax \orelse\ifconditional\punctafterdiscretionary \wordboundary\hbox{#1}\relax \else \wordboundary#2\wordboundary \fi} \definetextmodediscretionary {} {\lang_discretionaries_hyphen_like\textmodehyphen\textmodehyphendiscretionary} \definetextmodediscretionary - {\lang_discretionaries_hyphen_like\normalhyphen\normalhyphendiscretionary} \definetextmodediscretionary _ {\lang_discretionaries_hyphen_like\composedhyphen\composedhyphendiscretionary} \definetextmodediscretionary ) {\lang_discretionaries_hyphen_like{)}{\discretionary{-)}{}{)}}} \definetextmodediscretionary ( {\ifdim\lastskip>\zeropoint (\wordboundary \else \wordboundary\discretionary{}{(-}{(}\wordboundary %\discretionary options \plusthree{}{(-}{(}% \fi} \definetextmodediscretionary ~ {\wordboundary\discretionary{-}{}{\thinspace}\wordboundary} %{\discretionary options \plusthree{-}{}{\thinspace}} \definetextmodediscretionary ' {\wordboundary\discretionary{-}{}{'}\wordboundary} %{\discretionary options \plusthree{-}{}{'}} \definetextmodediscretionary ^ {\wordboundary \discretionary{\hbox{\normalstartimath|\normalstopimath}}{}{\hbox{\normalstartimath|\normalstopimath}}% \wordboundary} % bugged %{\discretionary options \plusthree{\hbox{\normalstartimath|\normalstopimath}}{}{\hbox{\normalstartimath|\normalstopimath}}} \definetextmodediscretionary < {\beginofsubsentence\wordboundary\beginofsubsentencespacing \aftergroup\ignorespaces} % tricky, we need to go over the \nextnextnext \definetextmodediscretionary > {\removeunwantedspaces \endofsubsentencespacing\wordboundary\endofsubsentence} \definetextmodediscretionary = {\removeunwantedspaces \wordboundary\midsentence\wordboundary \aftergroup\ignorespaces} % french \definetextmodediscretionary : {\removeunwantedspaces\wordboundary\kern\hspaceamount\empty{:}:} \definetextmodediscretionary ; {\removeunwantedspaces\wordboundary\kern\hspaceamount\empty{;};} \definetextmodediscretionary ? {\removeunwantedspaces\wordboundary\kern\hspaceamount\empty{?}?} \definetextmodediscretionary ! {\removeunwantedspaces\wordboundary\kern\hspaceamount\empty{!}!} \definetextmodediscretionary * {\wordboundary\discretionary{-}{}{\kern.05\emwidth}\wordboundary} % spanish \definetextmodediscretionary ?? {\wordboundary\questiondown} \definetextmodediscretionary !! {\wordboundary\exclamdown} \permanent\protected\def\defaultdiscretionaryhyphen{\compoundhyphen} %D \macros %D {fakecompoundhyphen} %D %D In headers and footers as well as in active pieces of text we need a dirty hack. %D Try to imagine what is needed to savely break the next text across a line and at %D the same time make the words interactive. %D %D \starttyping %D \goto{Some||Long||Word} %D \stoptyping \permanent\protected\def\fakecompoundhyphen {\enforced\permanent\protected\def\|{\mathortext\vert\lang_compounds_fake_hyphen}} \def\lang_compounds_fake_hyphen {\enforced\permanent\protected\def##1|% {\ifempty{##1}\compoundhyphen\else##1\fi \wordboundary % was a signal \allowbreak}} %D \macros %D {midworddiscretionary} %D %D If needed, one can add a discretionary hyphen using \type %D {\midworddiscretionary}. This macro does the same as \PLAIN\ \TEX's \type {\-}, %D but, like the ones implemented earlier, this one also looks ahead for spaces and %D grouping tokens. \permanent\protected\def\midworddiscretionary {\futurelet\nexttoken\lang_discretionaries_mid_word} \def\lang_discretionaries_mid_word {\ifx\nexttoken\blankspace\orelse \ifx\nexttoken\bgroup \orelse \ifx\nexttoken\egroup \orelse \discretionary{-}{}{}% \fi} % \aliased\let\ignorecompoundcharacter\relax %D \macros %D {disablediscretionaries,disablecompoundcharacter} %D %D Occasionally we need to disable this mechanism. For the moment we assume that %D \type {|} is used. \aliased\let\disablediscretionaries \ignorediscretionaries %aliased\let\disablecompoundcharacters\ignorecompoundcharacter %D \macros %D {normalcompound} %D %D Handy in for instance XML. (Kind of obsolete) \ifdefined\normalcompound \else \aliased\let\normalcompound=| \fi %D \macros %D {compound} %D %D We will overload the already active \type {|} so we have to save its meaning in %D order to be able to use this handy macro. %D %D \starttyping %D so test\compound{}test can be used instead of test||test %D \stoptyping \permanent\protected\gdef\compound#1{|#1|} \appendtoks \enforced\permanent\protected\def|#1|{\ifx#1\empty\empty-\else#1\fi}% \to \everysimplifycommands %D Here we hook some code into the clean up mechanism needed for verbatim data. \appendtoks %disablecompoundcharacters \disablediscretionaries \to \everycleanupfeatures %D Here: %D %D \startbuffer %D {\red somelongword}{\blue \compounddiscretionary}{\green somelongword} %D \stopbuffer %D %D \typebuffer \blank {\hsize3mm\getbuffer\par} \blank \permanent\protected\def\compounddiscretionary {\discretionary options \plusthree {\ifnum\prehyphenchar >\zerocount\char\prehyphenchar \fi}% {\ifnum\posthyphenchar>\zerocount\char\posthyphenchar\fi}% {\ifnum\posthyphenchar>\zerocount\char\posthyphenchar\fi}} % \setcatcodetable\prtcatcodes % because we activated the bar \protect \endinput