Introduction
At the most elementary level,
a Z specification can be viewed as a sequence of characters.
Given the ubiquity of ASCII keyboards,
preparation of a Z specification may have to begin one step back
from its sequence of characters, using a mark-up language
such as LaTeX or troff.
An understanding of the characters to which mark-up is converted
is useful when preparing mark-up.
The characters comprising a Z specification are those of
ISO/IEC 10646 Universal Multiple-Octet Coded Character Set (UCS).
The code positions of characters in UCS
are the same as their code positions in Unicode.
Characters are classified into several categories,
such as LETTER, DIGIT, SYMBOL and SPECIAL,
according to their UCS general property.
This provides a basis for lexing a specification.
When characters are exchanged between tools,
their code positions are encoded according to
one of several alternative schemes.
A specific scheme can be chosen using one of the following
command-line options to the tools:
-UTF8, -UTF16BE and -UCS4.
The default scheme is UTF8.
No other schemes are yet implemented by CADiZ.
The well-known scheme UCS2 is not applicable to Z,
as it cannot encode the 𝔸 and 𝔽 characters.
More discussion of these issues may be found in [Toyn02].
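The effect of the choice of scheme can be illustrated with a small Python sketch. It is not part of CADiZ: the helper names `encodings_of` and `fits_in_ucs2` are hypothetical, the Python codec names are assumed to correspond to the -UTF8, -UTF16BE and -UCS4 options, and big-endian byte order for UCS4 is an assumption. U+1D538 serves only as an example of a code position above U+FFFF.

```python
# Sketch only: Python codec names stand in for the tool options.
# The byte order assumed for UCS4 is a guess, not taken from CADiZ.
SCHEMES = {
    "UTF8": "utf-8",
    "UTF16BE": "utf-16-be",
    "UCS4": "utf-32-be",
}


def encodings_of(ch: str) -> dict:
    """Return the bytes of one character under each scheme."""
    return {name: ch.encode(codec) for name, codec in SCHEMES.items()}


def fits_in_ucs2(ch: str) -> bool:
    """UCS2 uses a single 16-bit unit per character, so it stops at U+FFFF."""
    return ord(ch) <= 0xFFFF


if __name__ == "__main__":
    # 'x', U+2115 (double-struck N) and U+1D538 (double-struck A)
    for ch in ("x", "\u2115", "\U0001D538"):
        print(f"U+{ord(ch):04X}", fits_in_ucs2(ch), encodings_of(ch))
```

A code position above U+FFFF needs four octets in UTF8, a surrogate pair in UTF16BE, and a single four-octet unit in UCS4, which is why a scheme limited to one 16-bit unit per character cannot represent it.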
The characters are formalized below
using syntactic metalanguage.
ISO Standard characters
Formal definition of characters
This formal definition is public domain material,
and is reproduced here as it appears in ISO/IEC 13568:2002 (the Z standard).
ZCHAR       = DIGIT | LETTER | SPECIAL | SYMBOL ;
DIGIT       = DECIMAL
            | ?other UCS chars with Number property but Number, Decimal Digit (as supported)?
            ;
DECIMAL     = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
            | ?any other UCS chars with Number, Decimal Digit property (as supported)?
            ;
LETTER      = LATIN | GREEK | OTHERLETTER
            | ?any characters of the mathematical toolkit with letter property (as supported)?
            | ?any other UCS characters with letter property (as supported)?
            ;
LATIN       = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I'
            | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R'
            | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'
            | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i'
            | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r'
            | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
            ;
GREEK       = 'Δ' | 'Ξ' | 'θ' | 'λ' | 'μ' ;
OTHERLETTER = '𝔸' | 'ℕ' | 'ℙ' ;
SPECIAL     = STROKECHAR | WORDGLUE | BRACKET | BOXCHAR | NLCHAR | SPACE ;
STROKECHAR  = ''' | '!' | '?' ;
WORDGLUE    = '↗' | '↙' | '↘' | '↖' | '_' ;
BRACKET     = '(' | ')' | '[' | ']' | '{' | '}' | '⦉' | '⦊' | '《' | '》' ;
BOXCHAR     = ZEDCHAR | AXCHAR | SCHCHAR | GENCHAR | ENDCHAR ;
SYMBOL      = '&' | ' ' | ' ' | ' ' | ' ' | ' ' | ' ' | ' ' | ' ' | '/' | '=' | ' ' | ':' | ';' | ',' | '.' | ' ' | ' ' | '>>'
            | ?any characters of the mathematical toolkit with neither letter nor number property (as supported)?
            | ?any other UCS characters with neither letter nor number property and that are not in SPECIAL (as supported)?
            ;
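As a rough sketch of how the definition above can drive a lexer, the following Python function assigns one of the four ZCHAR classes to a character. It is an illustration under stated assumptions, not the CADiZ implementation: `zchar_class` is a hypothetical name, only the ASCII members of SPECIAL are listed, SPACE and NLCHAR are assumed to be the space and newline characters, BOXCHAR is omitted, and Python's `unicodedata.category` stands in for the UCS general property.

```python
import unicodedata

# Explicit members copied from the ASCII parts of the definition above.
# Non-ASCII members and BOXCHAR are left out of this sketch.
STROKECHAR = set("'!?")
WORDGLUE = set("_")
BRACKET = set("()[]{}")
# Assumption: SPACE is ' ' and NLCHAR is '\n'.
SPECIAL = STROKECHAR | WORDGLUE | BRACKET | {" ", "\n"}


def zchar_class(ch: str) -> str:
    """Classify one character as DIGIT, LETTER, SPECIAL or SYMBOL."""
    if ch in SPECIAL:
        return "SPECIAL"
    category = unicodedata.category(ch)  # two-letter code such as 'Lu' or 'Nd'
    if category.startswith("N"):
        return "DIGIT"
    if category.startswith("L"):
        return "LETTER"
    # Neither letter nor number property, and not in SPECIAL.
    return "SYMBOL"


if __name__ == "__main__":
    for ch in "x7?=_":
        print(repr(ch), zchar_class(ch))
```

SPECIAL is tested first because its members would otherwise be classified by their general property; '_' and '?', for example, have punctuation properties and would fall through to SYMBOL.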
CADiZ-specific characters
CADiZ "supports" use of all UCS characters,
each classified according to its general property.
However, CADiZ is able to display only some characters
in the most desirable form;
others are displayed as nameplates showing their code numbers.
The CADiZ core language uses not only the characters enumerated in
the above formal definition from ISO Standard Z
but also ' ', ' ', ' ' and '"'
as additional characters in the SYMBOL class.
The uses of these additional characters are documented
in extensions.
To check that a Z specification uses only ISO Standard
notations, invoke cadiz with the -ws option.
IT 28-Jan-2002