Manual Token Class Definition
When the command line option --token-class-file is specified, followed by a header file name, no token class is generated. This section discusses the requirements that the token class must meet in order to interact properly with the lexer. At the end of this subsection, the command line arguments that communicate the token class configuration are listed. In the following, TOKEN represents the chosen name of the token class.
First of all, a token class/struct must provide a means to construct, destruct, and copy a token. For that the following three functions must be provided.
- void TOKEN_construct(TOKEN* me);
- void TOKEN_destruct(TOKEN* me);
- void TOKEN_copy(TOKEN* me, const TOKEN* other);
The me pointer takes the role of the this pointer in C++. In C++, construction and destruction are best implemented in constructors and destructors of the token class to ensure that all relevant constructors and destructors of members are called. Then, the TOKEN_construct() function should call the constructor via placement new. TOKEN_destruct() is best implemented as an explicit call to the destructor.
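The placement new and explicit destructor pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the class name Token (standing in for TOKEN), the member types, and the text payload are hypothetical and not prescribed by Quex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>      // placement new
#include <string>

// Hypothetical user-defined token class; 'Token' stands in for TOKEN.
struct Token {
    std::uint32_t id;        // token identifier (type is an assumption)
    std::size_t   line_n;    // line number (type is an assumption)
    std::size_t   column_n;  // column number (type is an assumption)
    std::string   text;      // hypothetical payload with its own constructor/destructor

    Token() : id(0), line_n(0), column_n(0) {}
};

// Construct a token in already allocated raw memory via placement new,
// so that the constructors of all members (here: 'text') are run.
void Token_construct(Token* me) { new (me) Token(); }

// Explicit destructor call: members are destroyed, the memory is not released.
void Token_destruct(Token* me) { me->~Token(); }

// Copy the content of 'other' into 'me' using the copy assignment operator.
void Token_copy(Token* me, const Token* other) { *me = *other; }
```

With this layout the lexer-facing functions stay thin wrappers, and all member handling remains in the class's own constructor, destructor, and assignment operator.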
For the implementation of brief token senders (section sec) the following function must be provided.
- bool TOKEN_take_text(TOKEN* __this, const LEXER_lexatom_t* Begin, const LEXER_lexatom_t* End);
This function tells the token to store information about the current lexeme. Begin points to the first lexatom of the text to be carried by the token. End points to the first lexatom after the last lexatom of concern.
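A minimal sketch of such a function is given below. It assumes a hypothetical class Token with a text member, and it assumes that the lexatom type is plain char; in a real setup the lexatom type is whatever the lexer is configured with. The first parameter is named me here, as in the functions above, whereas the prototype names it __this. The meaning of the return value is not stated in this section, so the value returned here is only a placeholder; the generated default token class is the place to check the actual convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: lexatoms are plain 8-bit characters.
typedef char LEXER_lexatom_t;

// Hypothetical user-defined token class; 'Token' stands in for TOKEN.
struct Token {
    std::uint32_t id;
    std::size_t   line_n;
    std::size_t   column_n;
    std::basic_string<LEXER_lexatom_t> text;  // hypothetical storage for the lexeme
};

// Store a private copy of the half-open range [Begin, End).
// 'End' points one past the last lexatom, so it is not copied itself.
bool Token_take_text(Token*                 me,
                     const LEXER_lexatom_t* Begin,
                     const LEXER_lexatom_t* End)
{
    me->text.assign(Begin, End);
    return false;  // placeholder: the return value's meaning is an assumption
}
```

Copying the range keeps the token valid after the lexer's buffer content changes, at the cost of one allocation per token that carries text.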
As a last requirement, the standard members id, line_n, and column_n must be provided under exactly these names. To generate proper interactions with the lexer, Quex requires knowledge about the token class’ configuration. This information is passed either on the command line or, as shown at the end of the previous section, in the token class’ header file, enclosed in <<<QUEX-OPTIONS>>> tags. The command line arguments relevant for an external token class definition are explained in the list below.
- --token-class-file name file-name
Disables the generation of the default token class and takes file-name as the name of the user-defined header file.
- --token-class name0::name1 ... ::class-name
Specifies the token class’ name and namespace. Nested namespaces are listed from left to right and separated by ::. The right-most name is the name of the token class itself. If the token class is located in the root namespace, only the class name is specified, without any ::.
The following three options define the type of the class members which carry the token identifier id, the line number line_n, and the column number column_n.
- --token-id-type type-name
- --token-line-n-type type-name
- --token-column-n-type type-name
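To illustrate how these options relate to the class, the sketch below assumes that the hypothetical values uint32_t, size_t, and size_t were given for --token-id-type, --token-line-n-type, and --token-column-n-type. The class name Token and the chosen types are assumptions for illustration only; the point is that the types passed to Quex must match the declared member types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical user-defined token class; 'Token' stands in for TOKEN.
struct Token {
    std::uint32_t id;        // would correspond to: --token-id-type uint32_t
    std::size_t   line_n;    // would correspond to: --token-line-n-type size_t
    std::size_t   column_n;  // would correspond to: --token-column-n-type size_t
};

// Compile-time checks that the declared member types are the intended ones.
static_assert(std::is_same<decltype(Token::id), std::uint32_t>::value,
              "id must have the type passed as token id type");
static_assert(std::is_same<decltype(Token::line_n), std::size_t>::value,
              "line_n must have the type passed as line number type");
static_assert(std::is_same<decltype(Token::column_n), std::size_t>::value,
              "column_n must have the type passed as column number type");
```

Such checks are optional, but they catch a mismatch between the header and the configured types at compile time.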
If the lexer is to support token repetition (token identifiers with the \repeatable tag), then one member of the token class must be designated to carry the repetition number. The name of this member is communicated via the following option.
- --token-repetition-n-member-name number
When brief token senders are to accept lexemes as input, the token must provide a TOKEN_take_text function. This is communicated with the following command line option.
- --token-class-support-take-text
In that case, the lexatom type must also be specified with the following option.
- --lexatom-type type-name
As a starting point, a generated token class might facilitate development. This also has the advantage that all command line arguments are pre-specified between the <<<QUEX-OPTIONS>>> tags, ready to be customized.