Precedence Rules

At a given position in the input stream, more than one pattern may match at the same time. For lexical analysis to be deterministic, however, only one pattern can win, i.e. only one pattern's associated action is executed. If more than one pattern matches at the current position, the ‘winning’ pattern is determined by the following rules.

  1. Longest Match: The pattern that matches the longest lexeme wins.

Example:

mode EXAMPLE {
    "for"     => QUEX_TKN_KEYWORD_FOR();
    "forest"  => QUEX_TKN_NOUN_FOREST();
}

Given the input stream forest, the pattern "forest" wins because it matches six letters, whereas "for" matches only the first three.

  2. Positional Precedence: Among patterns that match the same longest lexeme, the pattern that was specified first wins.

Example:

mode EXAMPLE {
    [a-z]+   => QUEX_TKN_IDENTIFIER(Lexeme);
    "print"  => QUEX_TKN_COMMAND_PRINT();
}

Given the input stream print, both patterns match the exact same lexeme. However, [a-z]+ wins because it is specified before "print".
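The two rules can be illustrated with a small simulation. This is only a sketch in Python, not the code that Quex generates: the function name winning_pattern and the rule lists are hypothetical, and Python's backtracking re module stands in for the generated state machine. For simple patterns like the ones above, the two behave the same.

```python
import re

def winning_pattern(rules, text, pos=0):
    """Return (token, lexeme) of the rule that wins at `pos`, or None.

    `rules` is a list of (regex, token) pairs in the order of specification.
    """
    best = None
    for regex, token in rules:
        m = re.compile(regex).match(text, pos)
        if m is None or m.end() == pos:
            continue  # no match, or an empty match
        # Rule 1: a strictly longer lexeme replaces the current winner.
        # Rule 2: on a tie, the earlier rule stays the winner.
        if best is None or len(m.group()) > len(best[1]):
            best = (token, m.group())
    return best

keyword_rules = [(r"for", "KEYWORD_FOR"), (r"forest", "NOUN_FOREST")]
print(winning_pattern(keyword_rules, "forest"))    # ('NOUN_FOREST', 'forest')

identifier_rules = [(r"[a-z]+", "IDENTIFIER"), (r"print", "COMMAND_PRINT")]
print(winning_pattern(identifier_rules, "print"))  # ('IDENTIFIER', 'print')
```

Swapping the two entries of identifier_rules makes COMMAND_PRINT win for the input print, which is exactly the lever that positional precedence provides.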

In the context of inheritance, the longest match rule implies that a pattern of a derived mode may outrun a pattern of its base mode. Thus, under longest match, a derived mode may behave differently from its base mode. The positional precedence rule, however, lets a base mode’s pattern win, since its patterns are considered to be specified earlier. The positional precedence rule therefore does not cause a derived mode to behave differently from its base mode.
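The effect of both rules under inheritance can be sketched as follows. This is a Python simulation under the assumption that a derived mode sees the base mode's pattern-action pairs ahead of its own. The helper match and the token names are made up for illustration and are not part of Quex.

```python
import re

def match(rules, text):
    """Longest match first; on a tie, the pair listed earlier wins."""
    best = None
    for pattern, token in rules:
        m = re.match(pattern, text)
        if m and m.end() > 0 and (best is None or m.end() > len(best[1])):
            best = (token, m.group())
    return best

base_rules = [(r"for", "BASE_KEYWORD_FOR")]
derived_rules = [(r"forest", "DERIVED_NOUN_FOREST"),
                 (r"for", "DERIVED_KEYWORD_FOR")]

# Assumption: the derived mode sees the base mode's pairs ahead of its own.
effective = base_rules + derived_rules

print(match(effective, "forest"))  # ('DERIVED_NOUN_FOREST', 'forest')
print(match(effective, "for"))     # ('BASE_KEYWORD_FOR', 'for')
```

On forest the derived mode's longer pattern outruns the base pattern, while on for the base mode's pair wins the tie against the derived mode's identical pattern.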

Positional Precedence Modifications

The outcome of the longest match rule is determined by the composition of the concurrent patterns. For a given set of patterns lurking on the input stream, it cannot be adapted. Positional precedence, however, makes it possible to prefer one pattern over another by specifying its pattern-action pair at an earlier position in a mode definition. With inheritance, the pattern-action pairs of a base mode are pasted in front of the derived mode’s pattern-action pairs. All of a base mode’s pattern-action pairs therefore have positional precedence over the derived mode’s patterns. To avoid undesirable effects for the derived mode, there are the commands DEMOTION and DELETION.

<pattern> DEMOTION;

Demotes any higher ranked pattern-action pair whose pattern is equivalent to <pattern> to the positional precedence of the position where the DEMOTION command occurs.

The following example illustrates its usage with an identifier-keyword problem. Mode BASE defines how to match an identifier without considering possible interferences with keyword definitions, such as the keyword print in DERIVED.

mode BASE           { [a-z]+  => QUEX_TKN_IDENTIFIER(Lexeme); }
mode DERIVED : BASE { "print" => QUEX_TKN_KW_PRINT(); }

Here, [a-z]+ matches any sequence of lowercase letters, even when print appears in the input stream. This is because [a-z]+ and "print" both match the same length (longest match is not decisive), but [a-z]+ has a higher positional precedence than "print".

Using DEMOTION, the positional precedence of the [a-z]+ pattern can be moved behind that of "print", so that "print" can match. The [a-z]+ pattern’s action is still executed for all remaining sequences of lowercase letters. The usage is shown below.

mode BASE {
  [a-z]+  => QUEX_TKN_IDENTIFIER(Lexeme);
}
mode DERIVED : BASE {
  "print" => QUEX_TKN_KW_PRINT();
  [a-z]+  DEMOTION;
}

<pattern> DELETION;

Deletes any higher ranked pattern-action pair whose pattern is equivalent to <pattern>.

In the following code fragment, DELETION is applied to solve the identifier-keyword problem.

mode BASE {
    [a-z]+  => TKN_IDENTIFIER(Lexeme);
}
mode DERIVED : BASE {
    [a-z]+  DELETION;
    print   => TKN_KW_PRINT();
}

DEMOTION and DELETION re-prioritize or delete only those patterns that have higher precedence and match exactly the same set of lexemes as the pattern for which they are specified. They do not affect patterns that match a subset or a superset.
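The bookkeeping behind the two commands can be sketched with a short Python simulation. It rests on several assumptions that are not part of Quex: the functions resolve and match are hypothetical, a mode is modeled as a plain list of (pattern, action) pairs, and "matches exactly the same set of lexemes" is approximated by comparing the pattern strings for equality.

```python
import re

def resolve(base, derived):
    """Build the effective rule list of a derived mode.

    Entries are (pattern, action); action is a token name,
    "DEMOTION", or "DELETION".
    """
    rules = list(base)  # base pairs are ranked ahead of derived pairs
    for pattern, action in derived:
        if action in ("DEMOTION", "DELETION"):
            # Assumption: equivalence is approximated by string equality.
            earlier = [r for r in rules if r[0] == pattern]
            rules = [r for r in rules if r[0] != pattern]
            if action == "DEMOTION":
                rules.extend(earlier)  # re-insert at the current position
        else:
            rules.append((pattern, action))
    return rules

def match(rules, text):
    """Longest match first; on a tie, the pair listed earlier wins."""
    best = None
    for pattern, token in rules:
        m = re.match(pattern, text)
        if m and m.end() > 0 and (best is None or m.end() > len(best[1])):
            best = (token, m.group())
    return best

BASE = [(r"[a-z]+", "IDENTIFIER")]
demoted = resolve(BASE, [(r"print", "KW_PRINT"), (r"[a-z]+", "DEMOTION")])
deleted = resolve(BASE, [(r"[a-z]+", "DELETION"), (r"print", "KW_PRINT")])

print(match(demoted, "print"))  # ('KW_PRINT', 'print')
print(match(demoted, "hello"))  # ('IDENTIFIER', 'hello')
print(match(deleted, "print"))  # ('KW_PRINT', 'print')
print(match(deleted, "hello"))  # None
```

With DEMOTION the identifier pattern still catches every other lowercase word, whereas with DELETION it is gone from the derived mode altogether, so hello no longer matches any pattern.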

Positional precedence modifications oppose the idea of inheritance as subtyping, i.e. with them inheritance is no longer an is-a relationship. When they are applied, a derived mode’s matching behavior differs from that of its base mode on the base mode’s very own patterns. The application of DEMOTION and DELETION may constitute a practical temporary solution. The use of DELETION, however, means that pattern-action pairs are inherited which are specified in a base mode but useless in a derived mode. Thus, the usage of DELETION indicates the need to redesign the mode inheritance hierarchy.