Part of the Occam utility.
Laurent Siebenmann
Master posting 1994, ftp ftp.math.u-psud.fr
Alpha version 6-94 subject to change.

====== Occam Syntax and Specifications ======

Occam is a system for extracting from a large macro file exactly
those macros required by a given typescript. Its active parts are
auditor.tex, which determines which macros are necessary, and
DefStrip, a utility that deletes unnecessary macros.

The first DefStrip utility to be created is a QUEDM script; QUEDM is
an editor with macro capabilities that is available on Macintosh
computers. (Hopefully a version of this utility which is a ".tex"
program like auditor.tex will follow in due time; it would be
closely analogous to the LaTeX "docstrip.cmd" utility.)


(I) About the QUEDM Version of Summer 1994 (preliminary)

The Occam syntax is for TeX macro files. Its purpose is to let the
"DefStrip" utility delete selected lines of the file with the help
of a list audit.lst of "unused" control sequences (mostly macros).
These lines come in blocks of roughly two sorts:

(a) Material that is to be unconditionally deleted.

(b) Blocks surrounding lines beginning (modulo spaces) with one of:

        \Def          (variant of \def)
        \gDef         (variant of \gdef or \global\def)
        \Let          (variant of \let)
        \gLet         (variant of \global\let)
        \Font         (variant of \font)
        \Mathchardef  (variant of \mathchardef)
        \Newsymbol    (variant of \newsymbol)

    This list may be extended. A particular such block is to be
    deleted precisely if the macro name is designated in an external
    list called "audit.lst", output by the TeX utility "auditor.tex".


MAIN SPECIFICATIONS of the Occam syntax.

ASCII (7-bit) text files only. No tab characters please.

The names of macro files conforming to this syntax should involve
the suffix "aud" in some form if at all possible. For example,
"x.sty" might become "x-aud.sty" or "x.occ"; say "x-aud.sty" for
future reference.
See internal documentation of "audit.tex" to generate a list of
macros in "x-aud.sty" that are unnecessary in a given typesetting
job "x.tex".

In x-aud.sty, the lines

    %^ This file is formatted by , ,
    % for use of the Occam utility posted on the CTAN archives
    % (master posting 1994 on ftp ftp.math.u-psud.fr)
    %% DO NOT ALTER "OCCAM" SIGNS ^ or _ , ^_
    %% UNLESS YOU UNDERSTAND THEM!
    \let\Def\def \let\gDef\gdef
    \let\Let\let \def\gLet{\global\let}
    \let\Font\font
    \let\Mathchardef\mathchardef\let\Newsymbol\newsymbol
    \let\MATHchardef\mathchardef\let\NEWsymbol\newsymbol
    %
    \input auditor.tex %% keep auditor.tex available
    %% comment out above line to suppress audit function.
    %_

should appear in the header.

Two composite symbols %^ and %_ are employed to designate possible
deletions. On its line, %^ is always preceded by spaces only (zero
or more); similarly, %_ is always followed by spaces only.

(A) Unconditionally deleted material:

        %%^_

    Everything from %%^_ to the end of file is then deleted.
    To delete just a segment, use

        %^
        ...
        %_

    The deleted material can span many lines, but must include no
    blank line. We have just seen a block of such material above!
    Note that it may well contain \Def etc., but not %^ or %_.

    The unconditional deletions will occur in the order described,
    and before conditional deletions are considered.

(B) Conditionally deleted material:

        \Def \somemacro ...%_

    may cause deletion of the block of lines beginning with \Def
    etc. and ending with %_. This material is really deleted
    precisely if the macro \somemacro is marked for deletion in the
    file "audit.lst". The material must contain no blank line, nor
    %^, %_, \Def etc.; but it is otherwise arbitrary; in particular,
    macro arguments, comments, and auxiliary definitions are OK.

    Along with this material some additional preceding material is
    deleted, namely contiguous preceding lines (if any) that
    (a) are nonempty and (b) contain no %_ (but \Def etc. are
    allowed).
    Typically, such preceding material might be comments or commands
    "owned" by the macro being deleted. For example, the whole block

        %_
        \ifx\undefined\eightpoint
          \Def\eightpoint{}
        \fi
        %_

    will be deleted precisely in case \eightpoint is marked as
    unused in audit.lst. (The first %_ could be replaced by a blank
    line.) Note that %_ is not really a closing delimiter, since it
    can exist in arbitrary numbers without belonging to a matching
    pair. For another example, consider:

        \Def\amacro ...%_
        \newtoks\btoks %_
        \Def\cmacro ...%_

    Here, the first two %_ prevent \newtoks\btoks being deleted ---
    in all circumstances.

    The example

        \Def\amacro ...
        \Def\bmacro ...%_

    is incorrect because the block beginning with \Def\amacro ...
    contains \Def\bmacro.

    There is a second type of conditional deletion. Suppose \amacro
    is not used and is so designated in audit.lst. It often occurs
    that several *disjoint* blocks of lines should be deleted along
    with \amacro. These blocks should be designated as follows:

        %/^\amacro
        ...
        %/_

    \amacro is called the sentinel (watchman). The sentinel's line
    %/^... must contain nothing more than %/^\amacro and blank
    space. The initial and terminal lines will vanish along with the
    material between them.

IN SUMMARY: the blocks %^...%_ are unconditionally deleted, while a
block signalled by \Def, \gDef, etc. with the help of %_ and/or
blank lines is deleted or not according as the macro following \Def
etc. is marked for deletion in "audit.lst". Similarly for blocks
with a sentinel macro. None of these blocks for conditional or
unconditional deletion is allowed to contain an empty line or any
extraneous %^, %_, %/^, %/_, %%^_, \Def, \gDef, etc. The blocks
introduced by \Def, \gDef, etc. include material extending backward
as far as (but not including) a preceding line that is blank or
terminated by one of %_, %/_. No such extension for blocks
introduced by %^, %/^ is allowed --- nor would it be helpful.
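The two primary deletion rules summarized above can be modelled in a
few lines of Python. This is only an illustrative sketch, not the
QUEDM script: the function names are hypothetical, sentinel blocks
(%/^ ... %/_) are not handled, macro names are assumed to consist of
letters (or @), and a \Def block that meets a blank line before any
%_ is assumed malformed and is left untouched.

```python
import re

# Keywords listed in (I)(b); the specification says the list may be extended.
KEYWORDS = ("Def", "gDef", "Let", "gLet", "Font", "Mathchardef", "Newsymbol")
# A line beginning (modulo spaces) with \Def etc., followed by a macro name.
HEAD = re.compile(r"^\s*\\(?:" + "|".join(KEYWORDS) + r")\s*(\\[A-Za-z@]+)")


def closes(line, marks=("%_",)):
    """True if the line ends with one of the marks, followed by spaces only."""
    return line.rstrip().endswith(marks)


def strip_unconditional(lines):
    """Rule (A): drop %^ ... %_ segments and everything from %%^_ onward."""
    out, skipping = [], False
    for line in lines:
        if line.lstrip().startswith("%%^_"):
            break                      # delete to end of file
        if not skipping and line.lstrip().startswith("%^"):
            skipping = True            # %^ is preceded by spaces only
        if skipping:
            if closes(line):           # %_ followed by spaces only
                skipping = False
            continue
        out.append(line)
    return out


def strip_conditional(lines, unused):
    """Rule (B): drop a \\Def-type block if its macro is in `unused`."""
    out, i = [], 0
    while i < len(lines):
        m = HEAD.match(lines[i])
        if m:
            # Scan forward to the line that ends the block with %_.
            j = i
            while j < len(lines) and lines[j].strip() and not closes(lines[j]):
                j += 1
            well_formed = j < len(lines) and closes(lines[j])
            if well_formed and m.group(1) in unused:
                # Also retract "owned" preceding lines: nonempty lines
                # not terminated by %_ or %/_.
                while out and out[-1].strip() and not closes(out[-1], ("%_", "%/_")):
                    out.pop()
                i = j + 1
                continue
        out.append(lines[i])
        i += 1
    return out


if __name__ == "__main__":
    sample = [
        r"%_",
        r"\ifx\undefined\eightpoint",
        r"  \Def\eightpoint{}",
        r"\fi",
        r"%_",
        r"\Def\amacro{A}%_",
    ]
    for kept in strip_conditional(strip_unconditional(sample), {r"\eightpoint"}):
        print(kept)
```

Unconditional deletion is applied first, as the specification
requires; the set passed as `unused` stands in for the macro names
marked in audit.lst.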

Beyond these primary deletions, the utility DefStrip performs a few
auxiliary tasks:

--- All remaining \Def, \gDef, etc. are converted to \def,
    \global\def, etc. Also, if a remaining %_ is alone on its line
    (spaces ignored), the whole line disappears. And each remaining
    %_ *not* alone on its line becomes % (this is the only deletion
    that can affect a line that survives).

--- Any empty line sequence (usually created by the deletion of
    blocks of lines) is reduced to a single empty line.

--- Residual appearances in x-aud.sty of macros marked for deletion
    in audit.lst will be marked by %%[VESTIGE] (on a new following
    line). They should be considered a failure of the current Occam
    format.

Users may find the vestiges mentioned above hard to deal with. (Can
they simply be deleted?) Thus programmers should attempt to set up
"Occam" formatting so as to ensure that vestiges never occur; for
their part, users should report vestiges to the programmers along
with the relevant audit.lst file from auditor.tex.

It is the programmer's or the user's responsibility to ensure that
the deletions made by the DefStrip utility result in a useful TeX
macro file. The DefStrip utility is of little help here since it
does not understand the macros.

Thus it is expected that programmers take on the task of preparing
macro files in Occam format. In most cases, anyone who programs TeX
macros at an intermediate level will find it an easy task to put a
macro file in Occam format. Beware that a good deal of testing and
a bit of cleverness are usually necessary to ensure that the Occam
formatting does the job desired and in the most efficient way.

---------------------------------

The following is documentation for additions made in 1995.
Example 2 (with harvmac.occ) illustrates these features.

EXPLANATIONS AND EXTRAPOLATIONS.

(a) Unnecessary macros nested within macros can also be eliminated.
    (Currently this is achieved quite trivially by making several
    passes through the "defstrip" utility, and the example below is
    conveniently explained in terms of several passes. However, the
    TeX version of "defstrip" will almost certainly reduce this to a
    single pass; the audit.tex utility already acts in a single
    pass.)

    Here is a generic "example". The original macro file contains:

        \def\MACRO{%
          \def\macro{}%
          }

    An Occam formatted version is:

        \Def\MACRO{%#_
          \DDef\macro{}%
          %#_
          }%_

    In relation to audit.tex, the macro \DDef behaves much like \Def
    except that the associated tag distinguishing unused macros in
    audit.lst is ** in place of *.

    Note that if \MACRO is unused then the whole block vanishes.
    Suppose not. Then, on the first pass of the macro file through
    "defstrip", \DDef is converted to \Def.

    There is an accounting procedure set up in audit.lst in terms of
    * and #. First of all, ** become *# (and *** would become *## if
    there were any, etc.). At the close of the first pass the marks
    %#_ in WW.sty are converted to %_; at the same time the file
    audit.lst undergoes changes

        ** ==> *# ==> #*     and     * ==> #\

    i.e. asterisks move right or die on backslash.

    On the second pass through "defstrip", one is treating:

        \def\MACRO{%_
          \Def\macro{}%
          %_
          }%

    and in response to an entry #*\macro in audit.lst "defstrip"
    will conditionally delete the block

        \Def\macro{}%
        %_

    i.e. this block is deleted precisely if \macro is not "used".
    (Only a programmer can guess whether this elimination is safe!)

(b) Often one wants to delete other material along with the block
    surrounding \macro; for that, the "sentinel" approach mentioned
    elsewhere is useful. The provisional syntax for a block to be
    eliminated with \MACRO is

        %/^\MACRO
        ...
        %/_

    and for \macro it would be

        %#/^\macro
        ...
        %#/_

    It is OK to use _ in place of /_ in the above syntax, but not ^
    in place of /^, since that would give an unconditional deletion.
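The tag accounting described in (a) can be illustrated with a short
Python sketch. It rests on assumptions that go beyond the text: each
audit.lst entry is taken to be a run of * and # immediately followed
by the control sequence (as in the entry #*\macro quoted above), an
entry is taken to be "live" for the current pass when an asterisk
sits next to the backslash, and the behaviour after the second pass
is an extrapolation of "asterisks move right or die on backslash".
All function names are hypothetical.

```python
import re

# Assumed entry shape: tag of * and # characters, then the control sequence.
ENTRY = re.compile(r"^([*#]+)(\\.+)$")


def normalize(tag):
    """Start of pass: ** -> *#, *** -> *## (only the first asterisk is kept)."""
    if tag.startswith("*"):
        return "*" + "#" * (len(tag) - 1)
    return tag


def shift(tag):
    """Close of pass: every asterisk moves one place to the right;
    an asterisk already next to the backslash dies (becomes #)."""
    return "#" + tag[:-1]


def is_live(tag):
    """Assumption: a macro is treated on this pass when * touches the backslash."""
    return tag.endswith("*")


def end_of_pass(entry):
    """Apply the end-of-pass rewrite to one audit.lst entry."""
    m = ENTRY.match(entry)
    if m is None:
        return entry                   # leave unrecognized lines alone
    tag, name = m.groups()
    return shift(normalize(tag)) + name


if __name__ == "__main__":
    for entry in (r"*\MACRO", r"**\macro"):
        print(entry, "==>", end_of_pass(entry))
```

Under these assumptions an entry tagged * is live on the first pass
only, and an entry tagged ** becomes live on the second pass, which
matches the two-pass example above.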

(c) There is also a notion of nested *un*conditional deletion,
    useful for deleting nested diagnostic macros as in (d) below.
    The syntax is:

        %#^
        ...
        %#_

(d) For text fonts, \Font works reasonably well. But it fails badly
    for math font systems. The latter are particularly difficult to
    minimize because TeX seems not to readily indicate which fonts
    it is using for math. At a given pointsize the following
    \everymath device is used in harvmac.occ; it manages to tell
    whether math mode has been called.

        %%% Title fonts
        %#/^\TitlepointMathTest
        \font\titlerms=cmr7 \tfontsize
        \font\titlermss=cmr5 \tfontsize
        \font\titlei=cmmi10 \tfontsize\relax \skewchar\titlei='177
        \font\titleis=cmmi7 \tfontsize\relax \skewchar\titleis='177
        \font\titleiss=cmmi5 \tfontsize\relax \skewchar\titleiss='177
        \font\titlesy=cmsy10 \tfontsize\relax \skewchar\titlesy='60
        \font\titlesys=cmsy7 \tfontsize\relax \skewchar\titlesys='60
        \font\titlesyss=cmsy5 \tfontsize\relax \skewchar\titlesyss='60
        %#/_
        %^ \DDef \TitlepointMathTest{\relax} %% a diagnostic that never survives%_
        %% note the peculiar nesting: %#^ would be illegal.
        %% The nesting below is the other way about.
        %_
        \font\titlerm=cmr10 \tfontsize
        \Def\titlefont{\textfont0=\titlerm \def\rm{\fam0\titlerm}%
         \rm
         %#^\TitlepointMathTest
         \textfont0=\titlerm \scriptfont0=\titlerms \scriptscriptfont0=\titlermss
         \textfont1=\titlei \scriptfont1=\titleis \scriptscriptfont1=\titleiss
         \textfont2=\titlesy \scriptfont2=\titlesys \scriptscriptfont2=\titlesyss
         %#_
         %#^
         \everymath{\TitlepointMathTest}%
         %#_
         }%_

    For witten94.tex this device allowed suppression of the math
    system at title pointsize, an appreciable saving.

    \TitlepointMathTest is a diagnostic macro that in no event will
    survive. Nevertheless it controls the inclusion or omission of
    two separated code segments. Essentially all the features of the
    Occam syntax mentioned so far are put to work in these few
    lines, so it is worth pausing to think about it all.

(e) This test piece has caused second thoughts about the policy of
    systematically notifying the user of "vestigial" macros
    remaining in the pruned macro file, i.e. unused macros that
    remain in W.tex but whose definitions have been deleted by
    "defstrip"; it can be helpful to stumble on an undefined macro
    with a suggestive name. The notification will probably be in
    some sense optional.

(f) It is perhaps unwise to wring the last unused font out of an
    article's font system --- as that may interfere with revision.
    However, ridiculously wasteful font systems abound, and the
    problem of analysing math font use is in principle challenging.
    Below I venture to point out two approaches that were *not*
    employed in harvmac.occ. Are there others worth trying?

    (i) To test whether a given math font is really used, replace it
    by a math font with no characters, \nullmath, and look for
    missing character complaints by TeX. This approach is powerful
    but clumsy and slow because a huge .log file must be examined.

    Here is a construction via initex:

        \font\nullmath=\nullfont \fontdimen23\nullmath=1pt
        \skewchar\nullmath='60
        \input plain\dump

    Second construction: *create* font "empty" beforehand as a .dvi
    file.

        \font\nullmath=empty at 1 pt \fontdimen23\nullmath=1pt
        \skewchar\nullmath='60

    Making do otherwise:

        \ifx\undefined\nullmath
          \font\nullmath=logo10 at 1 pt
        \fi
        \fontdimen23\nullmath=1pt \skewchar\nullmath='60

    This last way is easy but not perfect, as logo has a few
    characters. (Math fonts are formally required to have up to 23
    dimensions.)

    (ii) To discover whether any math font \myfont is needed,
    examine the ".dvi" file. A list of the used fonts from the .dvi
    file (.tfm names and sizes) can then be assimilated first off by
    audit.tex and used to decide whether \myfont should bear an
    asterisk in audit.lst. Then the power of \ffont to eliminate
    math (and other) fonts would be comparable to the power of \Def
    to eliminate unused macros.
    Perfection seems never attainable: in principle, one can use a
    font without it showing up in output to the .dvi!


NATURAL THINGS THAT ARE NOT (YET) IN PLACE

--- Nesting beyond depth two is not yet supported. The only problem
    is to program for all levels at once.

--- At depth two, \FFont and \DDef exist, but \LLet and others do
    not yet exist.