The main feature of @pack is its pretty-printing capabilities. Two different levels of pretty printing can be reached:
@pack is not a powerful syntactic pretty-printer: it just handles lexical structures, i.e., if in your favorite language
IF IF == THEN THEN THEN := ELSE ELSE ELSE := IF
is legal, then @pack is not the tool you need. Indeed, @pack just looks for some keywords, or some sequences, i.e., runs of characters which are opened by a given marker and closed by a given marker. For instance in C, comments are opened by `/*' and ended by `*/'.
It is for the same reason that you can't expect @pack to highlight the function definitions in C.
The heuristic @pack uses to find out in which language a file is written is fairly simple: it asks file(3) its opinion on the file. If the answer looks like a language, @pack trusts file, except for C, since file has the bad habit of considering that nearly everything is written in C.
Two things are to be retained from this:
If file is wrong on some files, @pack may use bad style sheets. In this case, do try option `--guess', compare it with the output of file, and if the culprit is file, go and complain to your system administrator :-), or fix it by defining your own filename pattern matching rules (see section Your guess rules).
@pack version 4.8.4 supports the following languages:
Ada
C
C++
caml (ml)
Claire (cl)
@url{http://www.ens.fr/~laburthe/claire.html}
coq-vernacular (coq)
Common-lisp (lsp)
Eiffel (e)
Fortran (f)
java
lace (ace)
mailfolder (mail)
Modula-3 (m3)
68000
o2c
Oberon
Objective C
Octave/MATLAB
Pascal (pas)
perl (pl)
PostScript (ps)
PreScript (pre)
prolog (pro)
promela (pml)
python (py)
Sather (sa)
Scheme (scm)
SDL-88 (sdl)
SDL may be the language for which `--strip-level=2' is the most useful: it cancels the graphical information left by graphic editors. Only the pure specification is then printed.
sh
SQL family: sql92, sql, pks, oracle, initora
tcl
tk
Unity
Verilog
VHDL
zsh
@pack pretty prints a source file thanks to style sheets, one per language. The following describes how the style sheets are defined. You may skip this section if you don't care how @pack does this, and if you don't expect to implement new styles.
Every style has a unique name (reported with `--list-features'). It also has a list of abbreviations (which should include the usual suffix for the language, such as `e' for the Eiffel language).
Either the name or any of the abbreviations can be used with option `-E'.
When automatic pretty-printing is enabled, @pack first calls file to see whether the language is recognizable. Otherwise, the suffix of the file is compared with every name and abbreviation. On failure, plain style is used.
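The selection order described above can be sketched as follows. This is only an illustration, not the actual implementation: the `STYLES' table, the function name `guess_style', and the way the answer of file is matched (word by word) are assumptions, and the answer of file is passed in as a string instead of being obtained by running the command.

```python
import os

# Hypothetical table: style name -> abbreviations (a small excerpt).
STYLES = {
    "eiffel": ["e"],
    "perl": ["pl"],
    "python": ["py"],
    "c": [],
}


def guess_style(filename, file_answer=""):
    """Pick a style: file's opinion first, then the suffix, then plain."""
    words = file_answer.lower().split()
    for name in STYLES:
        # file is trusted for every language but C (assumed word match).
        if name != "c" and name in words:
            return name
    # Otherwise compare the suffix with every name and abbreviation.
    suffix = os.path.splitext(filename)[1].lstrip(".").lower()
    for name, abbreviations in STYLES.items():
        if suffix == name or suffix in abbreviations:
            return name
    return "plain"
```

With this sketch, a file named `foo.py' that file reports as C program text still gets the python style, because the C answer is not trusted and the suffix decides.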
@pack needs to know the beginning and the end of a word, especially keywords. Hence it needs two alphabets: the first one specifying by which letters an identifier can begin, and the second one for the rest of the word.
A keyword, if recognized, is just written in a special font. A symbol, when recognized, is replaced by the corresponding character in the symbol font. To be recognized, both need to start with a character in the first alphabet and to be immediately followed by a character which does not belong to the second alphabet.
Special symbols need not be preceded and followed by characters outside the alphabets. For instance in `caml', `not' is a regular symbol: it represents boolean negation; but in neither `nota' nor `notnot' do the first three letters represent `not'. On the contrary `<>' always means `not equal', even in `a<>b'.
Both regular and special symbols are transformed only if option `--graphic-symbols' was given.
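The difference between words delimited by the two alphabets and special symbols can be illustrated by a small scanner. It is a sketch under assumptions: the alphabets, the keyword set, and the names `find_tokens', `FIRST', `REST', `KEYWORDS' and `SPECIAL' are made up for the example and are not taken from `styles.c.in'.

```python
import string

FIRST = set(string.ascii_letters + "_")   # letters a word may begin with
REST = FIRST | set(string.digits)         # letters for the rest of the word
KEYWORDS = {"not", "if", "then"}          # hypothetical keywords
SPECIAL = ["<>"]                          # hypothetical special symbols


def find_tokens(text):
    """Return (position, token, kind) for each recognized token."""
    found = []
    i = 0
    while i < len(text):
        special = next((s for s in SPECIAL if text.startswith(s, i)), None)
        if special:
            # Special symbols are recognized anywhere, even inside `a<>b'.
            found.append((i, special, "special"))
            i += len(special)
        elif text[i] in FIRST:
            # Consume the whole word; it ends at the first character
            # which does not belong to the second alphabet.
            j = i + 1
            while j < len(text) and text[j] in REST:
                j += 1
            if text[i:j] in KEYWORDS:
                found.append((i, text[i:j], "keyword"))
            i = j
        else:
            i += 1
    return found
```

Here `find_tokens("not nota notnot")' reports only the first `not', while `find_tokens("a<>b")' reports `<>' at position 1.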
A sequence is a string between two markers. A marker is a fixed string. Typical examples are strings (with usually `"' as opening and closing markers), comments etc. Three fonts are used: one for the initial marker, one for the core of the sequence, and a last one for the final marker.
Escapes are immediately copied when found in a sequence. Their main use is to prevent a sequence from being terminated too soon, e.g., the string `"\""' is legal in C, hence it is necessary to specify in @pack that `\\' and `\"' have to be written as such when in a sequence: they are escapes.
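The role of escapes can be seen in a minimal sequence scanner. This is a sketch, not the code of @pack: the function name `scan_sequence' and its arguments are assumptions made for the illustration.

```python
def scan_sequence(text, start, opener, closer, escapes):
    """Return the index just past the closing marker of the sequence
    opened at `start`, or len(text) if the sequence is never closed."""
    assert text.startswith(opener, start)
    i = start + len(opener)
    while i < len(text):
        escape = next((e for e in escapes if text.startswith(e, i)), None)
        if escape:
            # Escapes are copied as such: they cannot close the sequence.
            i += len(escape)
        elif text.startswith(closer, i):
            return i + len(closer)
        else:
            i += 1
    return len(text)


source = 'x = "\\""; y'          # the C text: x = "\""; y
end = scan_sequence(source, 4, '"', '"', ["\\\\", '\\"'])
string_literal = source[4:end]   # the whole literal "\""
```

Without the two escapes, the scanner would stop at the second `"' and cut the string one character too soon.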
Verbatims are immediately copied wherever they are met. They were implemented for ada, in which `"'' is the constant character `''. @pack must not understand it as the opening of a sequence (first quote), the closing of the sequence (second quote), and the opening of a sequence (third quote). Nor is it possible to specify that `'' is an escape...
If the style is case insensitive, the case does not matter for the keyword, symbol and sequence recognition. Other categories are not concerned by case sensitivity.
Note. A new encoding scheme is used in the following versions of @pack, which makes it much easier to build and/or fix a style sheet. If you do intend to work on this, the authors strongly recommend that you fetch version 4.9.2 or above.
To be able to define new style sheets one needs GNU m4. Also, never update `styles.c' by hand, but its precursor `styles.c.in'.
There are a couple of things to know about the process. But in any case, please use the already defined languages as examples, and see the prologue of `styles.c.in' where many m4 macros are defined to ease the implementation. Read also `styles.h' where information is given on the struct used for the style sheets.
The order in which you define the elements of a category (except the sequences) does not matter. But since @pack sorts them at run time, it may save time if the alphabetical C-order is more or less followed.
For the particular case of the special symbols, note that you may have to protect the symbols which contain them. For instance if in Pascal you define `=' to be a special symbol, you have to protect the `=' in `:='. Otherwise @pack will see a `:', written as such, but will transform the `=' into the symbol you specified. Hence declare `a2_not_symbol(:=)'.
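The effect of such a protection can be sketched as follows. The function `replace_symbols' and the replacement text `<EQ>' are hypothetical; they only mimic what a declaration like `a2_not_symbol(:=)' is meant to achieve, not how @pack implements it.

```python
def replace_symbols(text, symbols, protected):
    """Replace special symbols, except inside protected strings."""
    out = []
    i = 0
    while i < len(text):
        guard = next((p for p in protected if text.startswith(p, i)), None)
        symbol = next((s for s in symbols if text.startswith(s, i)), None)
        if guard:
            # A protected string is written as such, untouched.
            out.append(guard)
            i += len(guard)
        elif symbol:
            out.append(symbols[symbol])
            i += len(symbol)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


unprotected = replace_symbols("a := b = c", {"=": "<EQ>"}, [])
protected = replace_symbols("a := b = c", {"=": "<EQ>"}, [":="])
```

In the unprotected run the assignment is damaged (`a :<EQ> b <EQ> c'), whereas the protected run leaves `:=' alone and only transforms the lone `='.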
The opening and closing markers of a sequence may use `^' and `$' escapes: the former to specify beginning of line, and the latter for end of line. Both may lose their special meaning thanks to `\', which is always discarded when in first and/or last position.
E.g., the marker `^---cut-here---$' will only match lines equal to `---cut-here---'. On the other hand `\^money$\' matches `^money$' anywhere in the line, and `\\^money$\\' matches `\^money$\' anywhere in the line.
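One possible reading of these rules, written as a small helper, is given here. It is a sketch: `parse_marker' is a hypothetical name, and the decoding is inferred from the three examples above rather than copied from the sources of @pack.

```python
def parse_marker(marker):
    """Split a marker into (beginning_of_line, literal, end_of_line)."""
    bol = eol = False
    if marker.startswith("\\"):
        marker = marker[1:]          # leading `\' is discarded
    elif marker.startswith("^"):
        bol = True                   # anchored at beginning of line
        marker = marker[1:]
    if marker.endswith("\\"):
        marker = marker[:-1]         # trailing `\' is discarded
    elif marker.endswith("$"):
        eol = True                   # anchored at end of line
        marker = marker[:-1]
    return bol, marker, eol


cut = parse_marker("^---cut-here---$")
money = parse_marker("\\^money$\\")        # the marker \^money$\
backslashed = parse_marker("\\\\^money$\\\\")  # the marker \\^money$\\
```

The first marker is anchored on both sides, the second one yields the plain text `^money$' with no anchor, and the third one yields `\^money$\'.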
It is also possible to specify several closing markers, just by putting them one after the other, for instance (taken from the ada style sheet):
    a2_sequence(procedure , keyword_strong, label, ` is', keyword_strong),
    a2_sequence(procedure , keyword_strong, label, ` (', courier),
    a2_sequence(procedure , keyword_strong, label, `(', courier),
    a2_sequence(procedure , keyword_strong, label, `$', courier),
    a2_sequence(procedure , keyword_strong, label, `;', courier),
Please note that:
Please respect the alphabetical order in which the languages are introduced. Don't forget to update `STYLE' in `styles.h', and to add your language in the macro a2_styles at the bottom of `styles.c.in'.