Mode Transitions

Mode changes correspond to changes in the way the input stream is interpreted. Such language changes are sometimes indispensable; at other times they are a convenient means to handle situations that are otherwise difficult to handle. In C++, for example, a >> is a priori ambiguous. It may mean two closing delimiters of nested template specifications, or a shift operator in a statement [1].

Two dedicated modes for template specifications and statements present a straightforward solution. The former only knows < and > and the latter also knows << and >>.
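
The two readings of >> can be seen side by side in a small C++ fragment. This is only an illustrative sketch; the names Nested, make_nested and shift_right_by_two are made up for this example and do not come from Quex.

```cpp
#include <cassert>
#include <vector>

// Here '>>' closes two nested template argument lists (accepted since C++11).
using Nested = std::vector<std::vector<int>>;

// Here '>>' is the right-shift operator inside an expression.
inline int shift_right_by_two(int value) {
    assert(value >= 0);  // keep the sketch away from shifts of negative values
    return value >> 2;
}

// Builds a nested vector with 2 rows of 3 zeros each.
inline Nested make_nested() {
    return Nested(2, std::vector<int>(3, 0));
}
```

A lexer that only sees the characters cannot tell the two cases apart; the surrounding context (template specification or statement) decides, which is what the two modes encode.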

In Quex syntax, mode transition commands follow the => marker. Mode transitions are triggered with the commands GOTO, GOSUB, and RETURN. GOTO initiates a plain mode transition. GOSUB pushes the current mode onto the mode stack before the transition happens. RETURN is the counterpart to GOSUB. It lets the lexer return to the mode from which the last GOSUB happened, i.e. the last mode put on the mode stack. This corresponds to an elementary implementation of a function call/return mechanism [2].
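
The bookkeeping behind the three commands can be modelled in a few lines of C++. The sketch below is an assumption made for illustration: the class ModeTracker and its member functions are hypothetical and are not the code that Quex generates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the mode bookkeeping; not the code Quex generates.
class ModeTracker {
public:
    explicit ModeTracker(std::string start) : current_(std::move(start)) {}

    // GOTO: plain transition, the mode stack is left untouched.
    void go_to(const std::string& target) { current_ = target; }

    // GOSUB: remember the current mode on the stack, then transit.
    void go_sub(const std::string& target) {
        stack_.push_back(current_);
        current_ = target;
    }

    // RETURN: go back to the mode stored by the last GOSUB.
    void do_return() {
        assert(!stack_.empty() && "RETURN without a matching GOSUB");
        current_ = stack_.back();
        stack_.pop_back();
    }

    const std::string& current() const { return current_; }
    std::size_t depth() const { return stack_.size(); }

private:
    std::string current_;
    std::vector<std::string> stack_;
};
```

In this model, a go_sub("STRING") issued from any mode followed by do_return() ends in the calling mode again, while go_to never changes the stack depth.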

GOTO and GOSUB take at least one argument, namely the target mode. The following implements the transition to and from a STRING interpretation mode via GOTO in reaction to a double quote ("):

.. code-block:: cpp

   mode NORMAL {
       "\"" => GOTO(STRING);
   }
   mode STRING {
       "\"" => GOTO(NORMAL);
   }

With plain GOTO, STRING needs to know the mode to which it has to return once the string is terminated. With GOSUB and RETURN, the same behavior may be implemented as follows:

.. code-block:: cpp

   mode NORMAL {
       "\"" => GOSUB(STRING);
   }
   mode STRING {
       "\"" => RETURN();
   }

The advantage of GOSUB/RETURN is that STRING may also be entered from other modes and returns automatically to the calling mode without actually knowing it. The GOSUB/RETURN approach supports the definition of sub-languages that may be entered from an unspecified calling mode, to which the lexer returns once the sub-language interpretation has terminated. However, this approach carries a risk of mode-stack exhaustion, as the following modes show.

.. code-block:: cpp

   mode ONE   { ... "x" => GOSUB(TWO);  ... }
   mode TWO   { ... "y" => GOTO(THREE); ... }
   mode THREE { ... "z" => GOTO(ONE);   ... }

Above, starting from mode ONE, the lexer pushes the current mode onto the stack as soon as the pattern "x" matches. It dives into TWO, from where it may transit via THREE to ONE again, without ever executing a RETURN. If another x occurs, another mode is put on the stack. As long as the sequence xyz repeats, the mode stack keeps growing. Modes that are entered via GOSUB should therefore not transit to other modes via GOTO. If necessary, the mode stack size can be adapted with the command line option --mode-stack-size.
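
The growth of the mode stack can be reproduced with a small simulation of the ONE/TWO/THREE modes above. It is a sketch under stated assumptions: the function stack_depth_after is hypothetical, each mode reacts only to the single character shown, and no stack limit is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simulation of the ONE/TWO/THREE modes; not Quex code.
// Feeds the input characters and returns the mode-stack depth afterwards.
inline std::size_t stack_depth_after(const std::string& input) {
    std::string mode = "ONE";
    std::vector<std::string> stack;

    for (char c : input) {
        if (mode == "ONE" && c == 'x') {           // "x" => GOSUB(TWO)
            stack.push_back(mode);
            mode = "TWO";
        } else if (mode == "TWO" && c == 'y') {    // "y" => GOTO(THREE)
            mode = "THREE";
        } else if (mode == "THREE" && c == 'z') {  // "z" => GOTO(ONE)
            mode = "ONE";
        }
        // Any other character leaves mode and stack unchanged in this sketch.
    }

    // Sanity check: every push consumed one input character.
    assert(stack.size() <= input.size());
    return stack.size();
}
```

Each complete xyz cycle adds one entry that is never popped, so the depth equals the number of cycles seen so far.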

GOTO, GOSUB and RETURN may also send a token along with the transition. The token to be sent is specified as an additional argument, in the same manner as in brief token sending commands. This is demonstrated in the examples below.

.. code-block:: cpp

   "\"" => GOTO(STRING, QUEX_TKN_KEY);
   "\"" => GOSUB(STRING, QUEX_TKN_KEY(Lexeme));
   "\"" => RETURN(QUEX_TKN_KEY(number=0));

The GOTO sends a token with the id QUEX_TKN_KEY but leaves all members as they are. The GOSUB sends a token with the id QUEX_TKN_KEY and, since Lexeme is passed as argument, sets the token's text to the matched lexeme. The RETURN sends a token with the id QUEX_TKN_KEY and sets the member number to zero.

Two mode tags streamline mode transitions, namely <entry:> and <exit:>. The mode names listed inside those tags are separated by whitespace. When the <entry:> tag is specified, the mode may only be entered from the modes mentioned in its list. Likewise, the <exit:> tag restricts the modes towards which the mode may exit. Entry and exit conditions make the possible mode transitions explicit and robust, a feature that is particularly handy when modes become large constructs.

.. code-block:: cpp

   mode X : <entry: Y>   <exit: Y>   { ... }
   mode Y : <entry: X Z> <exit: X Z> { ... }
   mode Z : <entry: Y>   <exit: Y>   { ... }

The specification above restricts mode transitions so that X may only transit to Y, not directly to Z. Likewise, X may only be entered from Y. Y may be entered from X and exit towards X; it may also transit to Z and permits the return from Z. In other words, Z can only be entered via Y. No matter how many transitions are specified inside each mode, the entry/exit guards make it possible to view the mode transitions from a bird's-eye perspective.
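
The effect of the guards can be checked with a small C++ model of the X/Y/Z specification. The types and functions below (ModeGuards, GuardTable, example_guards, transition_allowed) are hypothetical names chosen for this sketch, and the rule that both the exit list of the source and the entry list of the target must agree is an assumption of the model, not a statement about how Quex implements the check.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of <entry:> / <exit:> lists; not Quex's implementation.
struct ModeGuards {
    std::set<std::string> entry;  // modes from which this mode may be entered
    std::set<std::string> exit;   // modes towards which this mode may exit
};

using GuardTable = std::map<std::string, ModeGuards>;

// Guards of the X/Y/Z example.
inline GuardTable example_guards() {
    GuardTable g;
    g["X"] = ModeGuards{{"Y"}, {"Y"}};
    g["Y"] = ModeGuards{{"X", "Z"}, {"X", "Z"}};
    g["Z"] = ModeGuards{{"Y"}, {"Y"}};
    return g;
}

// Assumed rule: a transition is admissible only if the source lists the
// target in its exit list and the target lists the source in its entry list.
inline bool transition_allowed(const GuardTable& guards,
                               const std::string& from,
                               const std::string& to) {
    auto src = guards.find(from);
    auto dst = guards.find(to);
    if (src == guards.end() || dst == guards.end()) return false;
    return src->second.exit.count(to) > 0 &&
           dst->second.entry.count(from) > 0;
}
```

With this table, a transition from X to Y or from Y to Z is accepted, whereas a direct transition from X to Z is rejected.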

Footnotes