Binary Graph Operations¶

Set Algebraic Operations¶

The following commands are related to the set of lexemes matched by DFAs. That is, the resulting DFA matches a set of lexemes which results from the operation applied on the set of lexemes matched by its operands. While the operations for union and intersection have been mentioned before, they are listed here again for the sake of completeness.

\\Intersection{A B}: Matches the intersection of the sets of lexemes matched by A and B.

\\Union: Matches the union of the sets of lexemes matched by A and B.

\\Diff: Matches the difference of the sets of lexemes matched A without those matched by B.

\\SymDiff: Matches the set of those lexemes which are matched either by A or by B, but not by both.

Set Filtering¶

The following functions filter from the set of lexemes of a first DFA depending on their beginning, ending, or containing matching lexemes of a second DFA.

\\Begin{A B}: Results in a DFA that matches all lexemes of A which begin with a sublexeme that matches B.

\\End: Results in a DFA that matches all lexemes of A which end with a sublexeme that matches B.

\\In: Results in a DFA that matches all lexemes of A which contain a sublexeme that matches B.

\\NotBegin{A B}: Results in a DFA that matches all lexemes of A which do not begin with a sublexeme that matches B.

\\NotEnd: Results in a DFA that matches all lexemes of A which do not end with a sublexeme that matches B.

\\NotIn: Results in a DFA that matches all lexemes of A which not contain a sublexeme that matches B.

All of those filtering commands are, actually, not more than shorthands. As mentioned earlier R(\Universe), (\Universe)R, (\Universe)R(\Universe) express all sets of lexemes matching R are at the beginning, the end, or somewhere inside. All operations of the above are expressed as combinations of difference of the first pattern with one of these three expresssions. The Not group, applies the complement before intersecting.

Lexeme Pruning¶

Cut operations, cut paths out of a DFA.

\\CutBegin{A B}: Cuts the beginning of all paths out of A which match B. For example, \CutBegin{abc|123 ab} is equivalent to c|123.

\\CutEnd: Cuts the end of all paths out of A which match B. For example, \CutBegin{abc|123 bc} is equivalent to a|123.

\\LeaveBegin: Leaves only those paths from A which begin with something matching B. For example, \LeaveBegin{abc|123cd [0-9]+} is equivalent to 123.

\\LeaveEnd: Leaves only those paths from A which end with something matching B. For example, \LeaveBegin{abc|123cd [a-z]+} is equivalent to abc|cd.

The difference between cutting operations on lexemes of this subsection and the set filtering operations of the previous section seems subtle. With the \lexemes command in a define section, this can be nicely explored.:

define {
   \lexemes \Begin{otto|friedhelm fried}
   \lexemes \NotBegin{otto|friedhelm fried}
   \lexemes \CutBegin{otto|friedhelm fried}
   \lexemes \LeaveBegin{otto|friedhelm fried}
}

The output of this demonstrates

``\Begin{otto|friedhelm fried}``

generates a DFA that matches only those lexemes that start with fried. It matches:

friedhelm

``\NotBegin{otto|friedhelm fried}``

generates a DFA that matches only those lexemes that do not start with fried. Therefore, it matches:

otto

``\CutBegin{otto|friedhelm fried}``

generates a DFA by cutting from all lexemes the beginning what matches fried. The result matches the two lexemes:

otto
helm

``\LeaveBegin{otto|friedhelm fried}``

generates a DFA by cutting from all lexemes the beginning what does not match fried. The result matches:

fried