Modes

A lexer always operates in a mode, and the mode determines its behavior. Behavior can be understood as a set of cause-and-effect relationships. In lexical analysis, a cause is either a pattern match or an event that occurs in the lexer itself. An effect is the reaction to a cause. Causes and effects are associated by means of pattern-action pairs and event handlers. A lexer can be in only one mode at any given point in time, so at any point in time the cause-and-effect relationship is deterministic.

Pattern-action pairs and event handlers are specified in a mode’s main section, which is enclosed in curly brackets. The following trivial example shows a mode that matches lowercase words of at most 64 letters, skips whitespace, and reacts to the end-of-stream event.

mode EXAMPLE : <skip: [ \t\n]>
{
   on_end_of_stream => QUEX_TKN_TERMINATION();
   [a-z]{,64}       => QUEX_TKN_IDENTIFIER(Lexeme);
}
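To make the cause-and-effect reading of this mode concrete, the following Python sketch models its behavior by hand. It is only an illustration, not the code Quex generates: the function name `run_example_mode` and the token names are hypothetical, and the sketch assumes a lower bound of one letter for identifiers so that every match consumes input.

```python
import re

# Hypothetical model of the EXAMPLE mode; not Quex's generated code.
SKIP = re.compile(r"[ \t\n]+")            # models <skip: [ \t\n]>
IDENTIFIER = re.compile(r"[a-z]{1,64}")   # assumption: at least one letter per match


def run_example_mode(text):
    """Return the list of (token_id, lexeme) pairs for `text`."""
    tokens = []
    pos = 0
    while True:
        # Skipper: discard whitespace without producing a token.
        m = SKIP.match(text, pos)
        if m:
            pos = m.end()
        # Event handler: corresponds to on_end_of_stream.
        if pos == len(text):
            tokens.append(("TERMINATION", None))
            return tokens
        # Pattern-action pair: corresponds to [a-z]{,64} => QUEX_TKN_IDENTIFIER(Lexeme).
        m = IDENTIFIER.match(text, pos)
        if m is None:
            # The mode defines no pattern for this input; the sketch simply fails.
            raise ValueError(f"no pattern matches at offset {pos}")
        tokens.append(("IDENTIFIER", m.group()))
        pos = m.end()
```

Under these assumptions, `run_example_mode("hello  world\n")` yields two identifier tokens followed by the termination token. Each step of the loop has exactly one applicable cause and one effect.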

The first two sections deal with pattern-action pairs and event handlers. The next section is about ‘pattern-action pairs’, which analyze relevant content. It is followed by the presentation of ‘skippers’, which are designed to ignore irrelevant content. An explanation of line and column number counting then lays the groundwork for the discussion of indentation-based parsing, i.e. the so-called off-side rule [].

The second major category of cause-and-effect relationships consists of events and their handlers. The corresponding section describes the events that may occur and how to customize their handling. After the description of all mode characteristics, a subsection describes how a mode may inherit characteristics from another mode. The explanation of the inheritance mechanisms is followed by a discussion of pattern re-prioritization. A final section points out some pitfalls and indicates methods to avoid them.