Inheritance

Subtyping, or type polymorphism, in object-oriented programming is a mechanism to establish a relationship of substitutability. This, so called Liskov Substitution [#f1]_ Liskov1999behavioral, means that if type B is a subtype of A, then any instance of B may be used where an A is expected. Inheritance [#f2]_ is different from subtyping [] as it does not guarantee behavioral substitutability [#f3]_. It is a mechanism to build a new type on the basis of an existing type, i.e. re-using its implementation. Nevertheless, it is common practice to use inheritance to express subtype relationships and commonality Tempero2013programmers.

Modes can be related by inheritance, but not by subtyping. That is, if mode DERIVED is inherits mode BASE, then DERIVED matches all patterns matched by BASE plus some additional ones. It also receives all event handlers defined in BASE. However, DERIVED cannot be expected to play seamlessly the role of BASE because it may match other things. An example is shown below.

mode BASE           { "print" => QUEX_TKN_PRINT(); }
mode DERIVED : BASE { [a-z]+  => QUEX_TKN_IDENTIFIER(Lexeme); }

The mode BASE above is expected to send a PRINT token when the lexeme print occurs and fails on anything else. The derived mode DERIVED sends PRINT upon the occurrence of print, but it does not fail on anything else. Instead on any sequence of lowercase letters it sends the IDENTIFIER token.

Each mode defines an interpretation language. If one mode was to be substituted by another one seamlessly, on the same input sequence, both must be identical. Inheritance relationships are nonsensical in this case. Practically, mode inheritance means that a base mode implements things which are present in multiple derived modes to reduce code duplication and prevent divergence of common features. Features which are present in a base mode are an expression of commonality between all derived modes. Features in a derived mode express a derived mode’s speciality. The following subsections explore how inheritance affects the different mode elements.

Pattern-Action Pairs

For a derived mode, inheriting pattern-action pairs is equivalent to pasting a base mode’s pattern-action pairs in front of its own. Thus, base mode patterns have precedence over a mode’s own patterns [#f4]_.

mode BASE {
    keyword_list { for; while; if; else; }
}
mode DERIVED : BASE {
    [0-9]+ => QUEX_TKN_NUMBER(Lexeme);
}

In the example above, BASE matches some keywords. DERIVED matches the same keywords, but also integer numbers. This definition of DERIVED is equivalent to the one below where patterns are pasted manually.

mode DERIVED {
    keyword_list { for; while; if; else; }
    [0-9]+       => QUEX_TKN_NUMBER(Lexeme);
}

Event Handlers

An event handler in a derived mode cannot overwrite the event handler of a base mode. Also, event handler code is not aggregated. An event handler must only be defined once in the inheritance tree of a mode. Forbidding overwriting and code aggregation follows the spirit of subtyping–assuming that it might be very counter intuitive, if a derived mode and its base mode behave differently on the same event. Code aggregation of event handlers, shattered in the inheritance tree, would cause problems of traceability. The rational conclusion is to impose uniqueness of an event handler in a mode’s inheritance tree.

An emerging need to implement an event handler in a derived mode which has already been defined in a base mode, actually, indicates a need for redesign. In the example below, all modes A, B, and C are derived from BASE. However, C requires to modify on_failure, which it cannot.

mode BASE     { on_failure { /* do something */ } ... }
mode A : BASE { /* inherit BASE's on_failure */ } ... }
mode B : BASE { /* inherit BASE's on_failure */ } ... }
mode C : BASE { on_failure { /* do something else */ } ... } // ERROR

To solve this situation, the base modes may be split in a way so that C shares all commonalities, except for the on_failure handler. An auxiliary mode, BASE_ON_FAILURE behaves like BASE but has the on_failure handler which is required by A and B.

mode BASE              { ... }
mode ON_FAILURE : BASE { on_failure { /* do something */ } ... }
mode A : ON_FAILURE    { /* inherit BASE_2's on_failure */ } ... }
mode B : ON_FAILURE    { /* inherit BASE_2's on_failure */ } ... }
mode C : BASE          { on_failure { /* do something else */ } ... } // OK

Counters

Counters must be defined uniquely in an inheritance tree. Again, it would be repugnant to if a derived mode’s counting behavior was different from its base mode.

Indentation Handlers

Indentation handlers must be uniquely defined in an inheritance tree.

Skippers

Skippers are by their nature aggregative. In a single mode any number of skipper definitions may occur. Adding skippers in a derived mode only adds behavior to what is expected from the base mode.

Entry- and Exit-Permissions

Entry- and exit-permissions are inherited differently depending on the inheritance happening at the beginning or the destination of a mode transition option. Let X be a some mode unrelated to BASE and DERIVED.

../figures/mode-transition-to-x.*

Fig. 27 Transition to a mode X from a base mode and its derived mode.

In figure Transition to a mode X from a base mode and its derived mode. a mode BASE contains a transition to an unrelated mode X. A transition from DERIVED towards X exists, because it is inherited from BASE. In the spirit of data abstraction, the derived mode not have to bother about becoming implicitly the start mode of a transition, though. Consequently, the following is necessary for the inherited transition to work.

  1. BASE : <exit: X> implies that DERIVED is permitted to exit towards X.

  2. X : <entry: BASE> implies that X permits entry from DERIVED.

../figures/mode-transition-from-x.*

Fig. 28 Transition from a mode X to base mode its derived mode.

Figure Transition to a mode X from a base mode and its derived mode. shows transitions from an unrelated mode X to BASE and DERIVED. The terminal mode of a transition is always explicit (e.g. GOTO(DERIVED)). Thus, there is no need to inherit exit and entry restrictions, so it holds that

  1. X : <exit: BASE> does not imply, that X may exit towards DERIVED.

  2. BASE : <entry: X> does not imply, that DERIVED may be entered from X.

Multiple Inheritance

The value of multiple inheritance, i.e. the derivation from more than one base at the same time, is disputed in the computer science community [], []. Quex permits the derivation from more than one base mode. With single inheritance, the sequence of inheritance is always linear. Multiple inheritance, however, implies the possibility of inheritance trees with forks in it. The sequence of collecting mode characteristics in these trees is subject to the linearization rules [#f5]_ below:

  1. A base mode has precedence over its derived mode.

  2. The precedence of multiple base modes follows the sequence of specification.

  3. No mode is treated twice.

The following example explores rules 1 and 2.

mode A : B, C { ... }
mode B : D    { ... }

Since A is derived from B and C, the latter two have precedence over A. According to rule 2 B has precedence over C, because it is mentioned before in the list. According to rule 1, D’s content, however, has precedence over its derived mode B’s content. Consequently, the precedence chain is D, B, C, and A. Multiple inheritance implies the potential of the diamond problem [], such as in the example below.

. . code-block:: cpp

mode A : B, C { … } mode B : D { … } mode C : D { … }

Here, mode A inherits from D via B and via C. If there was no rule 3, then according to rule 1 D has a higher precedence than B, because B is derived from D. According to rule 2 C and its base modes have lesser precedence than B. Therefore, D must have a lower precedence than B. The precedence relation between D and B contradictory. With rule 3, however, D is reached first by diving along mode B. Once its content is aggregated it is no longer considered when reached by the path of C.

The precedence chain for latter example is D, B, C, and A. It is the same as in the previous example. In other words, if a mode A is derived from a base mode which is-a mode Z, then the fact that another base mode is-a Z does not change anything. In the upshot, A is-a Z.

When working with multiple inheritance, it is advisable to have a look at the documentation string which is generated alongside the lexer code. Assume the modes from the diamond problem matching only four patterns, namely alpha, beta, gamma, and delta, then the generated code might contain a table in a comment similar to the following.

/* MODE A:
 ...
 *     PATTERN-ACTION PAIRS:
 *       ( 75) D: "delta"
 *       ( 76) B: "beta"
 *       ( 77) C: "gamma"
 *       ( 78) A: "alpha"
 ...
 */

The first column gives the precedence index that was created for the pattern. The second column tells from what mode the pattern was inherited and the third column displays the pattern itself.

Controlling Inheritance

There is a mode tag to control inheritance: <inheritable: arg>. It accepts three possible settings for arg, as they are namely.

yes

The mode can be used as base mode and it is also implemented. This is the default setting.

only

The mode can only serve as a base mode. It is not implemented.

no

The mode cannot serve as a base mode, but it is implemented.