Stall Conditions

A lexer may possibly stall, if mode transitions happen without progress in the input stream in the given scenarious below.

  • undo() undoes the current input position progress, but no pattern and no event occurs that ensures progress.

  • Mode transitions in on_entry and on_exit handlers may causes infinite loop transitions between modes.

When undo(), or any seek backward, is called, it must be ensured that the lexer does not drop into the same code that called the undo() again and again. Due to its deterministic behavior, if the lexer remains in the same mode, it will enter the same code again, when it restarts after undo(), such as in the following example.

mode X { "go" { self.undo(); } }

One might either ‘flag’ the undo operation, such as in

mode X {
    ...
    "go" {
        if( false == self.undone ) { self.undo(); self.undone = true; }
        ...
    }
}

But, this behavior it is more appropriate to implement this behavior by regular expressions. A better way is to to transit to a different mode, i.e. a different language, to interpret what cannot be interpreted by the current mode. Then, however, it must be ensured, that the transition back does not happen on something that triggers the original transition.

mode X {
    ...
    "go" { self.undo(); self.enter_mode(Y); }
}
mode Y {
    ... /* nothing matching "go" */
    on_failure { self.undo(); self.enter_mode(X); }
}

In the code above, the lexer being in mode X undoes the match of go to be interpreted by Y. In mode Y there might be many patterns but none matching go, so that it is caught by on_failure. If on_failure undoes the matching of go and reenters X, the loop continues infinitely.

Another cause of infinite loops may occur, when mode transition handlers trigger mode transitions without being aware about the possible consequences, as shown below.

mode X {
   on_entry { self.enter_mode(Y); }
   ...
}
mode Y {
   on_entry { self.enter_mode(X); }
   ...
}

In the case above, X triggers a direct transition to Y as soon as it is entered. Y, then triggers a direct transition to X upon entry. The result is an infinite repetition of calls to the on_entry handlers. This case, of course, is oversimplified. However, if mode transitions are not avoided completely, such possible flows of control may slip into the lexical analyzer.

As a reminder, to prevent stalling a lexical analyzers, the following hints need to be considered.

  • If undo() or a seek() in backwards direction is applied in a handler, it must be made sure, that at all circumstances the lexer starts in a different state where the same undo() or seek() cannot appear.

  • Entry and exit handlers of modes shall not contain mode transitions.

Mode transitions in on_entry or on_exit, actually, render the mode transition commands obscure. If X triggers under some circumstance upon entry to another mode Y, then GOTO(X) means to enter X under some implicit conditions and Y under some implicit conditions. Implicit conditions are the mother of confusion, and thus best avoided.