Syntax

A Quex input file is divided into several ‘top-level’ sections. Every top-level section is identified by its section name followed by content in curly braces, as shown in the scheme below.

section-name {
    ...
}
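As an illustration of this scheme, here is a minimal sketch of an input file with three top-level sections. The pattern name NUMBER, the token names, and the mode name ONE_AND_ONLY are hypothetical and chosen only for this example; the exact set of available sections and the action syntax may differ between Quex versions. Note that a mode section carries a mode name between the section name and the opening brace.

```
// Illustrative sketch only; all identifiers below are assumptions.
define {
    // named pattern, referenced later as {NUMBER}
    NUMBER  [0-9]+
}

token {
    // token identifiers used by the mode below
    NUMBER;
    IDENTIFIER;
}

mode ONE_AND_ONLY {
    <<EOF>>     => QUEX_TKN_TERMINATION;
    [ \t\r\n]+  { /* skip whitespace */ }
    {NUMBER}    => QUEX_TKN_NUMBER(Lexeme);
    [a-z]+      => QUEX_TKN_IDENTIFIER(Lexeme);
}
```

Here the mode section specifies behavior, while the token section belongs to the optional definitions of token identifiers.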

This chapter discusses the three aspects of lexical analyzer definition in terms of these top-level sections, namely:

Behavior: Mandatory specification of behavior.

The behavior of a lexical analyzer defines how events in the input stream produce tokens as output. It is the essential part of a lexical analyzer specification. The remaining two sections discuss optional adaptions.

The Lexical Analyzer Class and its Memento: Optional adaptions of the lexical analyzer class and its memento.

Token Identifiers and the Token Class: Optional definitions of token identifiers and adaptions of the token class.

As an introduction, section syntax-number-format describes the number format, which is used in many places. The three categories listed above cover the top-level sections in a Quex source file.

In the following, abbreviations are used for generated classes and types:

LEXATOM represents the type which carries a lexatom.
LEXER represents the generated lexical analyzer class.
MODE is the base class of all generated modes.
SIZE is the type used to represent sizes (e.g. size_t in C99).