Mode Transitions
Mode changes correspond to changes in the way the input stream is interpreted.
Such language changes are sometimes indispensable; sometimes they are a
convenient means to handle situations that would otherwise be difficult to
handle. In C++, for example, a ``>>`` is a priori ambiguous. It may mean two
closing delimiters of nested template specifications, or a shift operator in a
statement [1].
Two dedicated modes for template specifications and statements present
a straightforward solution. The former only knows ``<`` and ``>``, and
the latter also knows ``<<`` and ``>>``.
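The two readings of ``>>`` can be seen side by side in a small C++ sketch. The names ``Matrix``, ``make_matrix`` and ``quarter`` are made up for illustration and are not part of Quex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Here ">>" closes two nested template argument lists (valid since C++11).
using Matrix = std::vector<std::vector<int>>;

// Builds a rows x cols matrix filled with zeros.
inline Matrix make_matrix(std::size_t rows, std::size_t cols) {
    return Matrix(rows, std::vector<int>(cols, 0));
}

// Here ">>" is the right-shift operator inside a statement.
inline int quarter(int value) {
    assert(value >= 0);  // keep the example to non-negative inputs
    return value >> 2;
}
```

A lexer that always treats ``>>`` as one token has to be corrected later for the first case, which is the kind of situation that separate modes avoid.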
In Quex syntax, mode transition commands follow the ``=>`` marker. Mode
transitions are triggered with the commands ``GOTO``, ``GOSUB``, and
``RETURN``. ``GOTO`` initiates a plain mode transition. ``GOSUB`` stores the
current mode on the mode stack before the transition happens. ``RETURN`` is
the counterpart to ``GOSUB``. It lets the lexer transit back to the mode from
which the last ``GOSUB`` happened, i.e. the last mode put on the mode stack.
This corresponds to an elementary implementation of a function call/return
mechanism [2].
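The stack discipline behind the three commands can be modelled in a few lines of C++. This is only an illustrative sketch: ``ModeMachine`` and its member names are assumptions and not the code that Quex generates.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of mode handling; not Quex's generated code.
struct ModeMachine {
    std::string current;             // name of the active mode
    std::vector<std::string> stack;  // modes saved by GOSUB

    // GOTO: plain transition, the mode stack is untouched.
    void go_to(const std::string& target) { current = target; }

    // GOSUB: remember the current mode, then enter the target.
    void go_sub(const std::string& target) {
        stack.push_back(current);
        current = target;
    }

    // RETURN: go back to the mode saved by the most recent GOSUB.
    void do_return() {
        assert(!stack.empty());  // RETURN without a matching GOSUB
        current = stack.back();
        stack.pop_back();
    }
};
```

In this model only ``go_sub`` grows the stack and only ``do_return`` shrinks it, which mirrors the call/return analogy drawn above.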
``GOTO`` and ``GOSUB`` take at least one argument, namely the target mode. The
following implements the transition to and from a ``STRING`` interpretation
mode via ``GOTO`` as a reaction to a double quote (``"``):

.. code-block:: cpp

    mode NORMAL { "\"" => GOTO(STRING); }
    mode STRING { "\"" => GOTO(NORMAL); }

With plain ``GOTO``, ``STRING`` needs to know the mode to which it has to
return once the string is terminated. With ``GOSUB`` and ``RETURN``, the same
behavior may be implemented as follows:

.. code-block:: cpp

    mode NORMAL { "\"" => GOSUB(STRING); }
    mode STRING { "\"" => RETURN(); }

The advantage of ``GOSUB``/``RETURN`` is that ``STRING`` may also be entered
from other modes and returns automatically to the calling mode without actually
knowing it. The ``GOSUB``/``RETURN`` approach supports the definition of
sub-languages that may be entered from an arbitrary caller mode, to which the
lexer returns once the sub-language interpretation has terminated. With this
approach, however, there is a risk of mode stack exhaustion, as in the
following example:

.. code-block:: cpp

    mode ONE   { ... "x" => GOSUB(TWO);  ... }
    mode TWO   { ... "y" => GOTO(THREE); ... }
    mode THREE { ... "z" => GOTO(ONE);   ... }

Above, starting from mode ``ONE``, the lexer pushes the current mode on the
stack as soon as the pattern ``"x"`` matches. It dives into ``TWO``, from where
it may transit via ``THREE`` to ``ONE`` again. If another ``x`` occurs, the
current mode is pushed on the stack again, and nothing ever pops it. As long as
the sequence ``xyz`` repeats, the mode stack level grows. Modes that are
entered via ``GOSUB`` should therefore avoid transiting to other modes via
``GOTO``. If necessary, the mode stack size can be adapted with the command
line option ``--mode-stack-size``.
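The growth of the mode stack can be reproduced with a small simulation. The function ``stack_depth_after`` is a hypothetical model of the three modes above, assuming that only the three listed transitions exist; it is not part of Quex.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replays an input made of 'x', 'y' and 'z' against the modes ONE, TWO and
// THREE and returns the mode stack depth after the last character.
inline std::size_t stack_depth_after(const std::string& input) {
    std::string mode = "ONE";
    std::vector<std::string> stack;
    for (char c : input) {
        if (mode == "ONE" && c == 'x') {           // "x" => GOSUB(TWO)
            stack.push_back(mode);
            mode = "TWO";
        } else if (mode == "TWO" && c == 'y') {    // "y" => GOTO(THREE)
            mode = "THREE";
        } else if (mode == "THREE" && c == 'z') {  // "z" => GOTO(ONE)
            mode = "ONE";
        }
        // Any other character leaves the mode unchanged in this model.
    }
    return stack.size();
}
```

In this model, feeding the sequence ``xyz`` three times leaves three entries on the stack, since no ``RETURN`` ever pops them.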
``GOTO``, ``GOSUB``, and ``RETURN`` may also send tokens alongside the
transition. The token to be sent is specified as an additional argument, in the
same manner as in brief token sending commands. This is demonstrated in the
examples below.

.. code-block:: cpp

    "\"" => GOTO(STRING, QUEX_TKN_KEY);
    "\"" => GOSUB(STRING, QUEX_TKN_KEY(Lexeme));
    "\"" => RETURN(QUEX_TKN_KEY(number=0));

The ``GOTO`` sends a token with the id ``QUEX_TKN_KEY`` but leaves all members
as they are. The ``GOSUB`` sends a token with the id ``QUEX_TKN_KEY`` and sets
its text to the matched lexeme, ``Lexeme``. The ``RETURN`` sends a token with
the id ``QUEX_TKN_KEY`` and sets the member ``number`` to zero.
Two mode tags streamline mode transitions, namely ``<entry:>`` and ``<exit:>``.
The mode names listed inside those tags are separated by whitespace. When the
``<entry:>`` tag is specified, only the modes mentioned in its list may enter
the mode. Correspondingly, the ``<exit:>`` tag restricts the modes towards
which the mode may exit. Entry and exit conditions make the set of possible
mode transitions explicit and robust, which comes in particularly handy when
modes become rather large constructs.

.. code-block:: cpp

    mode X : <entry: Y>   <exit: Y>   { ... }
    mode Y : <entry: X Z> <exit: X Z> { ... }
    mode Z : <entry: Y>   <exit: Y>   { ... }

The specification above restricts mode transitions so that ``X`` may only
transit to ``Y``, but not directly to ``Z``. Also, ``X`` may only be entered
from ``Y``. ``Y`` may be entered from ``X`` and exit towards ``X``. It may also
enter ``Z`` and permits the return from ``Z``. In other words, ``Z`` can only
be entered via ``Y``. No matter how many transitions are specified inside each
mode, the entry/exit guards make it possible to view mode transitions from a
bird's-eye perspective.
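How such guards constrain transitions can be sketched as a simple table lookup in C++. The types and functions below (``Guards``, ``guard_table``, ``transition_allowed``) are hypothetical and only model the X/Y/Z specification above; the sketch makes no claim about how or when Quex itself performs the check.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical representation of one mode's <entry:> and <exit:> lists.
struct Guards {
    std::set<std::string> entry;  // modes allowed to enter this mode
    std::set<std::string> exit;   // modes this mode may exit towards
};

// Guard table for the X/Y/Z specification above.
inline const std::map<std::string, Guards>& guard_table() {
    static const std::map<std::string, Guards> table = {
        {"X", Guards{{"Y"}, {"Y"}}},
        {"Y", Guards{{"X", "Z"}, {"X", "Z"}}},
        {"Z", Guards{{"Y"}, {"Y"}}},
    };
    return table;
}

// A transition is permitted only if the source lists the target in its exit
// list and the target lists the source in its entry list.
inline bool transition_allowed(const std::string& from, const std::string& to) {
    const auto& table = guard_table();
    assert(table.count(from) == 1 && table.count(to) == 1);  // modes must exist
    return table.at(from).exit.count(to) == 1 &&
           table.at(to).entry.count(from) == 1;
}
```

With this table, a transition from ``X`` to ``Y`` is accepted, while a direct transition from ``X`` to ``Z`` is rejected, matching the description above.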
Footnotes