
Pretty printing

The main feature of @pack is its pretty-printing capabilities. Two different levels of pretty printing can be reached:

Syntactic limits

@pack is not a powerful syntactic pretty-printer: it only handles lexical structures. If, in your favorite language,

IF IF == THEN THEN THEN := ELSE ELSE ELSE := IF

is legal, then @pack is not the tool you need. @pack only looks for keywords and for sequences, i.e., runs of characters opened by one given marker and closed by another. For instance, in C, comments are opened by `/*' and closed by `*/'.

For the same reason, you can't expect @pack to highlight function definitions in C.

Automatic style

The heuristic @pack uses to find out in which language a file is written is fairly simple:

  1. @pack tries the user defined filename matching rules (see section Your guess rules). Upon success, the corresponding language style is used.
  2. @pack tries to find out thanks to the file suffix and its own abbreviation databases (see section Name and abbreviations).
  3. @pack asks file(1) for its opinion on the file. If the answer looks like a language, @pack trusts file, except when the answer is C, since file has the bad habit of considering that nearly everything is written in C.
  4. Otherwise, the plain style is used.

Two things should be remembered from this:

  1. @pack won't guess anything for a file given through the standard input.
  2. If file is wrong about some files, @pack may use the wrong style sheets. In this case, try option `--guess' and compare its result with the output of file. If the culprit is file, go and complain to your system administrator :-), or fix it by defining your own filename pattern matching rules (see section Your guess rules).
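
The four steps of the heuristic can be sketched as a chain of fallbacks. This is only an illustration in Python, not @pack's actual code: the tables `USER_RULES', `ABBREVIATIONS' and `KNOWN_STYLES' are hypothetical, and the `file_verdict' argument stands in for whatever the file command would answer.

```python
import fnmatch
import os

# Hypothetical tables: the real rules and abbreviations come from the
# user's configuration and from the style sheets.
USER_RULES = [("*.cfg", "sh")]  # (filename pattern, style)
ABBREVIATIONS = {"e": "eiffel", "pas": "pascal", "py": "python"}
KNOWN_STYLES = {"c", "eiffel", "pascal", "perl", "python", "sh"}


def guess_style(filename, file_verdict=None):
    """Return a style name, trying each step of the heuristic in order.

    file_verdict stands in for the answer of the file command.
    """
    base = os.path.basename(filename)
    # 1. User-defined filename matching rules.
    for pattern, style in USER_RULES:
        if fnmatch.fnmatch(base, pattern):
            return style
    # 2. File suffix against the abbreviation database.
    suffix = os.path.splitext(base)[1].lstrip(".")
    if suffix in ABBREVIATIONS:
        return ABBREVIATIONS[suffix]
    # 3. The opinion of file, trusted unless it says C.
    if file_verdict:
        for word in file_verdict.lower().split():
            if word in KNOWN_STYLES and word != "c":
                return word
    # 4. Fall back to the plain style.
    return "plain"
```

With these assumed tables, a file named `prog' that file reports as a C program still ends up in plain style, which is the behavior described in step 3.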

Known languages

@pack version 4.8.4 supports the following languages:

Ada
C
C++
caml (ml)
Claire (cl)
  More information on this language can be found at @url{http://www.ens.fr/~laburthe/claire.html}.
coq-vernacular (coq)
Common-lisp (lsp)
Eiffel (e)
Fortran (f)
  This style was developed by Denis Girou @email{Denis.Girou@idris.fr}.
java
lace (ace)
mailfolder (mail)
  Support for mail folders is rather good (see section Interfacing with other programs). This style also suits news files.
Modula-3 (m3)
68000
o2c
Oberon
Objective C
  Written by Paul Shum @email{pshum@ali.bc.ca}.
Octave/MATLAB
  This was written by Craig P. Earls @email{cpearls@mit.edu}.
Pascal (pas)
perl (pl)
PostScript (ps)
PreScript (pre)
prolog (pro)
promela (pml)
  This style was written thanks to Jean-Philippe Cottin @email{cottin@inf.enst.fr}.
python (py)
Sather (sa)
Scheme (scm)
SDL-88 (sdl)
  This style was written thanks to Jean-Philippe Cottin @email{cottin@inf.enst.fr}. SDL may be the language for which `--strip-level=2' is the most useful: it cancels the graphical information left by graphic editors. Only the pure specification is then printed.
sh
SQL family
  These styles were written by Pierre Mareschal @email{pmaresch@be.oracle.com}.
  1. SQL92 (sql92)
  2. Oracle SQL (sql)
  3. Oracle PL/SQL (pks)
  4. Oracle SQL-PL/SQL-SQL*Plus (oracle)
  5. Oracle Init.ora parameter file (initora)
tcl
tk
  Since everything in these languages is a string, what @pack prints is not always what you would like (see section Syntactic limits).
Unity
  This style was written thanks to Jean-Philippe Cottin @email{cottin@inf.enst.fr}. Note that the graphic conversion of the symbols (`-g') is just perfect for this language.
Verilog
  This style was written by Edward Arthur @email{eda@ultranet.com}.
VHDL
  This style was written by Thomas Parmelan @email{Thomas.Parmelan@efrei.fr}.
zsh

Definition of the style sheets

@pack pretty-prints a source file thanks to style sheets, one per language. This section describes how the style sheets are defined. You may skip it if you don't care how @pack does this and don't expect to implement new styles.

Name and abbreviations

Every style has a unique name (reported with `--list-features'). It also has a list of abbreviations (which should include the usual suffix for the language, such as `e' for the Eiffel language).

The name or any of the abbreviations can be used with option `-E'. When automatic pretty-printing is enabled, @pack first calls file to see whether the language is recognizable. Otherwise, the suffix of the file is compared with every name and abbreviation. On failure, plain style is used.

Alphabets

@pack needs to know where a word, especially a keyword, begins and ends. Hence it needs two alphabets: the first one specifies the characters with which an identifier can begin, and the second one the characters allowed in the rest of the word.

Keywords and regular symbols

A keyword, when recognized, is just written in a special font. A symbol, when recognized, is replaced by the corresponding character in the symbol font. To be recognized, both need to start with a character in the first alphabet and to be immediately followed by a character which does not belong to the second alphabet.
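
The recognition rule can be sketched as follows. This is a minimal Python illustration under stated assumptions, not @pack's implementation: the alphabets `FIRST' and `REST' are assumed values for a C-like language, and `KEYWORDS' is a hypothetical keyword list.

```python
import string

# Assumed alphabets for a C-like language.
FIRST = set(string.ascii_letters + "_")  # may begin a word
REST = FIRST | set(string.digits)        # may continue a word
KEYWORDS = {"if", "else", "while"}       # hypothetical keyword list


def find_keywords(line):
    """Return (start, word) for every keyword recognized in line."""
    found = []
    i = 0
    while i < len(line):
        if line[i] in FIRST:
            # A word starts here: extend it with the second alphabet.
            j = i + 1
            while j < len(line) and line[j] in REST:
                j += 1
            # The word ends at the first character outside REST.
            if line[i:j] in KEYWORDS:
                found.append((i, line[i:j]))
            i = j
        else:
            i += 1
    return found
```

In `if (iffy) else1 else', only the first `if' and the last `else' are reported: `iffy' and `else1' continue with characters of the second alphabet, so they are ordinary words.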

Special symbols

Special symbols need not be preceded and followed by characters outside the alphabets. For instance in `caml', `not' is a regular symbol that represents boolean negation, but in neither `nota' nor `notnot' do the first three letters represent `not'. On the contrary, `<>' always means `not equal', even in `a<>b'.

Both regular and special symbols are transformed only if option `--graphic-symbols' was given.
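
The difference between the two kinds of symbols can be illustrated with a short Python sketch. The alphabets and the replacement glyphs in `REGULAR' and `SPECIAL' are assumptions made for the example; the real tables are those of the `caml' style sheet.

```python
import string

# Assumed alphabets and replacement glyphs, for illustration only.
FIRST = set(string.ascii_letters + "_")
REST = FIRST | set(string.digits)
REGULAR = {"not": "¬"}   # replaced only as a whole word
SPECIAL = {"<>": "≠"}    # replaced wherever it appears


def graphic_symbols(text):
    """Return text with regular and special symbols replaced."""
    out = []
    i = 0
    while i < len(text):
        # Special symbols match regardless of the surrounding characters.
        for sym, glyph in SPECIAL.items():
            if text.startswith(sym, i):
                out.append(glyph)
                i += len(sym)
                break
        else:
            if text[i] in FIRST:
                # Regular symbols must cover a complete word.
                j = i + 1
                while j < len(text) and text[j] in REST:
                    j += 1
                word = text[i:j]
                out.append(REGULAR.get(word, word))
                i = j
            else:
                out.append(text[i])
                i += 1
    return "".join(out)
```

Here `not a<>b' has both symbols replaced, while `nota' and `notnot' are left untouched.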

Sequences

A sequence is a string between two markers. A marker is a fixed string. Typical examples are strings (usually with `"' as opening and closing markers), comments, etc. Three fonts are used: one for the initial marker, one for the core of the sequence, and a last one for the final marker.

Escapes and verbatims

Escapes are immediately copied when found in a sequence. Their main use is to prevent a sequence from being terminated too soon, e.g., the string `"\""' is legal in C, hence it is necessary to specify in @pack that `\\' and `\"' have to be written as such when in a sequence: they are escapes.

Verbatims are immediately copied wherever they are met. They were implemented for Ada, in which `"'' is the constant character `''. @pack must not understand it as the opening of a sequence (first quote), the closing of the sequence (second quote), and the opening of a sequence (third quote). Nor is it possible to specify that `'' is an escape...
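
The role of escapes inside a sequence can be sketched as follows. This Python function is a hypothetical illustration, not the scanner used by @pack; it takes the markers and the escapes as plain strings.

```python
def scan_sequence(text, start, opener, closer, escapes):
    """Split the sequence opening at start into (open, core, close).

    Return None if text does not hold the opener at start, or if the
    sequence is never closed.
    """
    if not text.startswith(opener, start):
        return None
    i = start + len(opener)
    core = []
    while i < len(text):
        # Escapes are copied as such and cannot close the sequence.
        for esc in escapes:
            if text.startswith(esc, i):
                core.append(esc)
                i += len(esc)
                break
        else:
            if text.startswith(closer, i):
                return opener, "".join(core), closer
            core.append(text[i])
            i += 1
    return None
```

With the two C escapes declared, the core of `"\""' is `\"'; without them the sequence would be closed by the second `"', leaving a stray quote behind.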

Case sensitivity

If the style is case insensitive, case does not matter for the recognition of keywords, symbols and sequences. Other categories are not affected by case sensitivity.

Writing new style sheets

Note. A new encoding scheme is used in later versions of @pack, which makes it much easier to build and/or fix a style sheet. If you do intend to work on this, the authors strongly recommend that you fetch version 4.9.2 or above.

To be able to define new style sheets, one needs GNU m4. Also, never edit `styles.c' by hand: edit its precursor `styles.c.in' instead.

There are a couple of things to know about the process. In any case, please use the already defined languages as examples, and see the prologue of `styles.c.in', where many m4 macros are defined to ease the implementation. Also read `styles.h', where information is given on the struct used for the style sheets.

The order in which you define the elements of a category (except the sequences) does not matter. But since @pack sorts them at run time, it may save time if the alphabetical C-order is more or less followed.

For the particular case of the special symbols, note that you may have to protect longer symbols that contain them. For instance, if in Pascal you define `=' to be a special symbol, you have to protect the `=' in `:='. Otherwise @pack will see a `:', written as such, but will transform the `=' into the symbol you specified. Hence declare `a2_not_symbol(:=)'.
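
The effect of the protection can be seen in a small Python sketch. It is only an illustration of the idea: the table names and the `<eq>' placeholder glyph are hypothetical, and the real mechanism is the `a2_not_symbol' macro mentioned above.

```python
SPECIAL = {"=": "<eq>"}  # hypothetical glyph for the special symbol `='
NOT_SYMBOLS = [":="]     # plays the role of a2_not_symbol(:=)


def transform(text, protect=True):
    """Replace special symbols, copying protected strings as such."""
    out = []
    i = 0
    while i < len(text):
        guard = None
        if protect:
            # A protected string is copied whole, before any symbol check.
            guard = next((g for g in NOT_SYMBOLS if text.startswith(g, i)), None)
        if guard is not None:
            out.append(guard)
            i += len(guard)
            continue
        sym = next((s for s in SPECIAL if text.startswith(s, i)), None)
        if sym is not None:
            out.append(SPECIAL[sym])
            i += len(sym)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Without the protection, `a := b = c' comes out with the `=' of `:=' transformed as well, which is the problem described above.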

The opening and closing markers of a sequence may use the `^' and `$' escapes: the former to specify the beginning of a line, and the latter the end of a line. Both may lose their special meaning thanks to `\', which is always discarded when in first and/or last position.

E.g., the marker `^---cut-here---$' will only match lines equal to `---cut-here---'. On the other hand, `\^money$\' matches `^money$' anywhere in the line, and `\\^money$\\' matches `\^money$\' anywhere in the line.

It is also possible to specify several closing markers, just by putting them one after the other, for instance (taken from the ada style sheet):
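
One possible reading of these rules is sketched below in Python. The functions `parse_marker' and `marker_matches' are hypothetical helpers written for this illustration; they only model the treatment of `^', `$' and `\' in first and last position as described above.

```python
def parse_marker(marker):
    """Return (text, at_line_start, at_line_end) for a marker."""
    at_start = at_end = False
    if marker.startswith("\\"):
        # A leading backslash is discarded; what follows is literal.
        marker = marker[1:]
    elif marker.startswith("^"):
        marker, at_start = marker[1:], True
    if marker.endswith("\\"):
        # A trailing backslash is discarded; what precedes is literal.
        marker = marker[:-1]
    elif marker.endswith("$"):
        marker, at_end = marker[:-1], True
    return marker, at_start, at_end


def marker_matches(marker, line):
    """Tell whether line (without its newline) contains the marker."""
    text, at_start, at_end = parse_marker(marker)
    if at_start and at_end:
        return line == text
    if at_start:
        return line.startswith(text)
    if at_end:
        return line.endswith(text)
    return text in line
```

Under this reading, the three markers given above parse to `---cut-here---' anchored at both ends, and to the literal strings `^money$' and `\^money$\' with no anchoring.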

a2_sequence(procedure , keyword_strong, label, ` is', keyword_strong),
a2_sequence(procedure , keyword_strong, label, ` (', courier),
a2_sequence(procedure , keyword_strong, label, `(', courier),
a2_sequence(procedure , keyword_strong, label, `$', courier),
a2_sequence(procedure , keyword_strong, label, `;', courier),

Please note that:

  1. the order between sequences does matter. For instance in java, `/**' introduces strong comments, and `/*' comments. `/**' must be declared before `/*', or it will be hidden.
  2. closing alternatives must be grouped together (no sequence with a different opener must occur while defining alternative closers).
  3. closing markers may have different faces (like ` is' and `(' above).
  4. `$' is used instead of a hard coded `\n'. Some encodings do not use `\n' as end-of-line, hence a style using `\n' would not be portable.
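
The first remark, on the order of the sequences, can be illustrated with a short Python sketch. It assumes that the first declared opener that matches wins; the face names in `JAVA_SEQUENCES' are hypothetical.

```python
# (opener, face) pairs in declaration order; the face names are made up.
JAVA_SEQUENCES = [
    ("/**", "comment_strong"),
    ("/*", "comment"),
]


def classify(text, sequences):
    """Return the face of the first declared sequence that opens text."""
    for opener, face in sequences:
        if text.startswith(opener):
            return face
    return None
```

If the two declarations are swapped, `/*' matches first and a text starting with `/**' is never seen as a strong comment.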

Please respect the alphabetical order in which the languages are introduced. Don't forget to update `STYLE' in `styles.h', and to add your language to the macro a2_styles at the bottom of `styles.c.in'.

