Binary Graph Operations¶
Set Algebraic Operations¶
The following commands are related to the set of lexemes matched by DFAs. That is, the resulting DFA matches a set of lexemes which results from the operation applied on the set of lexemes matched by its operands. While the operations for union and intersection have been mentioned before, they are listed here again for the sake of completeness.
- \\Intersection{A B}
Matches the intersection of the sets of lexemes matched by
A
andB
.
- \\Union
Matches the union of the sets of lexemes matched by
A
andB
.
- \\Diff
Matches the difference of the sets of lexemes matched
A
without those matched byB
.
- \\SymDiff
Matches the set of those lexemes which are matched either by
A
or byB
, but not by both.
Set Filtering¶
The following functions filter from the set of lexemes of a first DFA depending on their beginning, ending, or containing matching lexemes of a second DFA.
- \\Begin{A B}
Results in a DFA that matches all lexemes of
A
which begin with a sublexeme that matchesB
.
- \\End
Results in a DFA that matches all lexemes of
A
which end with a sublexeme that matchesB
.
- \\In
Results in a DFA that matches all lexemes of
A
which contain a sublexeme that matchesB
.
- \\NotBegin{A B}
Results in a DFA that matches all lexemes of
A
which do not begin with a sublexeme that matchesB
.
- \\NotEnd
Results in a DFA that matches all lexemes of
A
which do not end with a sublexeme that matchesB
.
- \\NotIn
Results in a DFA that matches all lexemes of
A
which not contain a sublexeme that matchesB
.
All of those filtering commands are, actually, not more than shorthands. As
mentioned earlier R(\Universe)
, (\Universe)R
,
(\Universe)R(\Universe)
express all sets of lexemes matching
R
are at the beginning, the end, or somewhere inside. All operations
of the above are expressed as combinations of difference of the first
pattern with one of these three expresssions. The Not
group, applies the
complement before intersecting.
Lexeme Pruning¶
Cut operations, cut paths out of a DFA.
- \\CutBegin{A B}
Cuts the beginning of all paths out of
A
which matchB
. For example,\CutBegin{abc|123 ab}
is equivalent toc|123
.
- \\CutEnd
Cuts the end of all paths out of
A
which matchB
. For example,\CutBegin{abc|123 bc}
is equivalent toa|123
.
- \\LeaveBegin
Leaves only those paths from
A
which begin with something matchingB
. For example,\LeaveBegin{abc|123cd [0-9]+}
is equivalent to123
.
- \\LeaveEnd
Leaves only those paths from
A
which end with something matchingB
. For example,\LeaveBegin{abc|123cd [a-z]+}
is equivalent toabc|cd
.
The difference between cutting operations on lexemes of this subsection and the
set filtering operations of the previous section seems subtle. With the
\lexemes
command in a define
section, this can be nicely explored.:
define {
\lexemes \Begin{otto|friedhelm fried}
\lexemes \NotBegin{otto|friedhelm fried}
\lexemes \CutBegin{otto|friedhelm fried}
\lexemes \LeaveBegin{otto|friedhelm fried}
}
The output of this demonstrates
- ``\Begin{otto|friedhelm fried}``
generates a DFA that matches only those lexemes that start with
fried
. It matches:friedhelm
- ``\NotBegin{otto|friedhelm fried}``
generates a DFA that matches only those lexemes that do not start with
fried
. Therefore, it matches:otto
- ``\CutBegin{otto|friedhelm fried}``
generates a DFA by cutting from all lexemes the beginning what matches
fried
. The result matches the two lexemes:otto helm
- ``\LeaveBegin{otto|friedhelm fried}``
generates a DFA by cutting from all lexemes the beginning what does not match
fried
. The result matches:fried