Quex Lexical Analyzer Generator
0.67.5

Lexeme Related Variables

Actions related to matches have access to a set of variables that describe the matched lexeme, as shown in Fig. 26. LexemeBegin points to the first lexatom of the current lexeme. LexemeEnd points to the first lexatom after the current lexeme. Lexeme points to the same location as LexemeBegin. However, when the keyword Lexeme is used in an action, Quex places a temporary terminating zero at the location pointed to by LexemeEnd. With this setup, Lexeme is a string that is compatible with traditional string functions such as strlen() and strcpy(). LexemeL carries the length of the current lexeme. LexemeNull is a pointer to an immutable lexeme of zero length (consisting only of a terminating zero). The definitions of all lexeme-related variables are listed in Table 9.

Fig. 26 Variables related to the current match’s lexeme.

Table 9 Implicit variables and their types.

===========  ==================  ==========================================================
name         type                meaning
===========  ==================  ==========================================================
Lexeme       pointer to LEXATOM  pointer to the beginning of the currently matched lexeme, which is zero-terminated.
LexemeBegin  pointer to LEXATOM  pointer to the beginning of the currently matched lexeme (without terminating zero).
LexemeEnd    pointer to LEXATOM  pointer to the first lexatom after the currently matched lexeme.
LexemeL      SIZE                length of the current lexeme in number of lexatoms; shorthand for LexemeEnd - LexemeBegin.
LexemeNull   pointer to LEXATOM  global constant pointing to a lexeme of length zero. This pointer is valid at any time.
===========  ==================  ==========================================================
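The relationships listed in Table 9 can be sketched in plain C. The following is a minimal model, not Quex's generated code: the names lexatom_t, match_t, match_length, match_terminate, and match_restore are hypothetical, and a lexatom is assumed to be a char for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strlen(), strcmp() for callers of this sketch */

/* Hypothetical stand-in for the lexatom type (assumed to be char here). */
typedef char lexatom_t;

/* Hypothetical record of one match inside a buffer; not a Quex structure. */
typedef struct {
    lexatom_t *begin;  /* plays the role of LexemeBegin */
    lexatom_t *end;    /* plays the role of LexemeEnd */
    lexatom_t  saved;  /* lexatom overwritten by the temporary zero */
} match_t;

/* Plays the role of LexemeL: LexemeEnd - LexemeBegin. */
static size_t match_length(const match_t *m)
{
    assert(m->end >= m->begin);
    return (size_t)(m->end - m->begin);
}

/* Plays the role of Lexeme: write a temporary zero at 'end' and
 * return the pointer to the beginning of the match. */
static lexatom_t *match_terminate(match_t *m)
{
    m->saved = *m->end;
    *m->end  = 0;
    return m->begin;
}

/* Undo the temporary zero, modeling what happens before the next
 * analysis step. */
static void match_restore(match_t *m)
{
    *m->end = m->saved;
}
```

With a buffer holding "let x = 1" and a match covering the first three lexatoms, match_terminate() yields a string that strlen() and strcmp() can process as "let", while match_restore() puts the overwritten lexatom back so that the buffer content is unchanged afterwards.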

Warning

Lexeme, LexemeBegin, and LexemeEnd point into the lexer’s buffer. They can only be assumed to point to the correct positions until the buffer content changes. Lexeme can only be assumed to be zero-terminated inside the action of the corresponding match. The terminating zero is removed at the beginning of the next analysis step.
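One way to stay clear of this restriction is to copy the lexeme into storage owned by the caller while the action is still running. The helper below is a sketch under assumptions: lexeme_copy and lexatom_t are hypothetical names, a lexatom is assumed to be a char, and inside an action the arguments would presumably be LexemeBegin and LexemeL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the lexatom type (assumed to be char here). */
typedef char lexatom_t;

/* Copy 'length' lexatoms starting at 'begin' into freshly allocated,
 * zero-terminated storage. The copy does not depend on the lexer's
 * buffer. Returns NULL if the allocation fails; the caller must free()
 * the result. */
static lexatom_t *lexeme_copy(const lexatom_t *begin, size_t length)
{
    lexatom_t *copy;

    assert(begin != NULL);
    copy = malloc((length + 1) * sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, begin, length * sizeof *copy);
    copy[length] = 0;  /* own terminating zero, independent of the buffer */
    return copy;
}
```

Because the copy carries its own terminating zero, it remains a valid string after the buffer has been reloaded or the next analysis step has started.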

The generated engine only assigns values to LexemeBegin, LexemeEnd, or Lexeme if they are mentioned explicitly by name in the corresponding action. Consider the following counterexample.

header {
#  define CAMOUFLAGE Lexeme
}
mode EXAMPLE {
   [a-z]+ { self.send_string(CAMOUFLAGE); }
}

The keyword Lexeme is not visible in the code fragment for the pattern [a-z]+. It is hidden behind the expansion of CAMOUFLAGE. Since Quex does not find the name Lexeme in the code fragment, it assumes there is no reason to spend the computational effort of preparing it. As a consequence, the use of Lexeme in the code fragment does not work.
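The effect can be illustrated with a small C sketch of a name check on the unexpanded action text. How Quex actually detects the name is not described here; the sketch assumes a plain whole-identifier search that runs before any macro expansion, and mentions_identifier and is_ident_char are hypothetical helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if 'c' can be part of a C identifier. */
static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return 1 if 'name' occurs in 'code' as a whole identifier, else 0.
 * The search works on the raw text, so names that only appear after
 * macro expansion are not seen (assumed behavior, for illustration). */
static int mentions_identifier(const char *code, const char *name)
{
    size_t      n = strlen(name);
    const char *p = code;

    assert(n > 0);
    while ((p = strstr(p, name)) != NULL) {
        int left_ok  = (p == code) || !is_ident_char(p[-1]);
        int right_ok = !is_ident_char(p[n]);
        if (left_ok && right_ok) {
            return 1;
        }
        p += 1;  /* continue searching after this partial hit */
    }
    return 0;
}
```

Under this assumption, the text "self.send_string(Lexeme);" mentions Lexeme, whereas "self.send_string(CAMOUFLAGE);" does not, which matches the behavior described above. Writing Lexeme directly in the action avoids the problem.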


© Copyright 2005-2023, Frank-Rene Schaefer; Licensing of Documentation: Creative Commons BY-NC-ND.
