The Buffer

A buffer is a fixed-size chunk of adjacent memory cells that is filled with a sequence of lexatoms for fast access by the CPU. The lexer reads the lexatoms from the buffer cells one by one from front to rear, each one triggering a state transition in its DFA. Sentinel lexatoms are placed at the borders of the buffer’s content. When the lexer hits a sentinel while moving forward, new content must be loaded from the input source. When analyzing a pre-context, the lexer may hit a sentinel while moving backward. In this case, the lexer needs to fill the buffer with previous content. Both procedures are shown in fig-a and fig-b.

FW: Some content is copied backwards and new content is loaded into the buffer.

BW: Some content is moved forwards and new content is loaded into the buffer.
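The forward case (FW) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the generated lexer's actual code: the names `Buffer`, `SENTINEL`, `buffer_load_forward`, `count_lexatoms`, and the `keep` parameter are hypothetical, a lexatom is modeled as a single `char`, and the input source is an in-memory string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_SIZE 8      /* total cells, including room for two sentinels */
#define SENTINEL    '\0'   /* assumed sentinel value; must not occur in content */

typedef struct {
    char        cell[BUFFER_SIZE]; /* cell[0] and the cell after the content hold sentinels */
    size_t      content_n;         /* number of content lexatoms currently stored */
    const char* source;            /* hypothetical input source: the remaining text */
} Buffer;

/* Forward load: keep the last 'keep' lexatoms by copying them to the front,
 * then fill the remaining cells with new content from the source.
 * Returns the number of newly loaded lexatoms.                             */
size_t buffer_load_forward(Buffer* b, size_t keep)
{
    const size_t capacity = BUFFER_SIZE - 2;  /* cells between the sentinels */
    size_t       free_n, load_n;

    assert(keep <= b->content_n && keep < capacity);

    /* Copy the tail of the old content backwards to the front.
     * memmove is used because the two regions may overlap.     */
    memmove(&b->cell[1], &b->cell[1 + b->content_n - keep], keep);

    free_n = capacity - keep;
    load_n = strlen(b->source);
    if (load_n > free_n) load_n = free_n;

    memcpy(&b->cell[1 + keep], b->source, load_n);
    b->source   += load_n;
    b->content_n = keep + load_n;

    /* Re-establish the sentinels at both borders of the content. */
    b->cell[0]                = SENTINEL;
    b->cell[1 + b->content_n] = SENTINEL;
    return load_n;
}

/* Walk the lexatoms from front to rear; on a sentinel, request more content. */
size_t count_lexatoms(Buffer* b)
{
    size_t total = 0;
    size_t i     = 1;
    for (;;) {
        size_t keep;
        if (b->cell[i] != SENTINEL) { ++total; ++i; continue; }
        /* Sentinel hit in forward direction: reload, keeping one lexatom. */
        keep = b->content_n > 0 ? 1 : 0;
        if (buffer_load_forward(b, keep) == 0) return total;
        i = 1 + keep;   /* continue right behind the kept lexatoms */
    }
}
```

Starting from an empty buffer, for example `Buffer b = { {0}, 0, "hello world" };`, a call to `count_lexatoms(&b)` walks the whole input in several loads. The kept lexatoms stand in for the content a real lexer would retain, for instance to support backward analysis of a pre-context.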

The sequence diagram in fig:loading_process displays the buffer loading process. When the lexer hits the border of a buffer’s content, it requests more lexatoms from the lexatom loader. The lexatom loader then requests more input from the byte loader. The byte loader interacts with the input source to receive more bytes. The raw bytes are passed back to the lexatom loader, which converts them into lexatoms and fills them into the buffer.

Fig. 30 The buffer loading process established with lexatom loader and byte loader.
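The chain of requests described above might look as follows in C. This is a sketch only: the types `ByteLoader` and `LexatomLoader`, the functions `byte_loader_load` and `lexatom_loader_load`, and the 32-bit `lexatom_t` are assumptions made for illustration. The input source is a block of memory, and the conversion simply widens each byte, which is only adequate for Latin-1 style input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t lexatom_t;   /* assumed lexatom type: one value per buffer cell */

/* Byte loader: interacts with the input source (here: a block of memory). */
typedef struct {
    const uint8_t* data;
    size_t         size;
    size_t         read_pos;
} ByteLoader;

/* Copies up to max_n raw bytes into 'out'; returns the number delivered. */
size_t byte_loader_load(ByteLoader* me, uint8_t* out, size_t max_n)
{
    size_t n = me->size - me->read_pos;
    if (n > max_n) n = max_n;
    memcpy(out, me->data + me->read_pos, n);
    me->read_pos += n;
    return n;
}

/* Lexatom loader: requests raw bytes and prepares lexatoms out of them. */
typedef struct {
    ByteLoader* byte_loader;
} LexatomLoader;

/* Fills up to max_n lexatoms into 'out'; returns the number delivered.
 * The conversion widens each byte; a real converter would decode the
 * stream's actual encoding instead.                                    */
size_t lexatom_loader_load(LexatomLoader* me, lexatom_t* out, size_t max_n)
{
    uint8_t raw[64];
    size_t  total = 0;

    assert(me->byte_loader != NULL);
    while (total < max_n) {
        size_t want = max_n - total;
        size_t got, i;
        if (want > sizeof raw) want = sizeof raw;
        got = byte_loader_load(me->byte_loader, raw, want);
        if (got == 0) break;   /* input source exhausted */
        for (i = 0; i < got; ++i) out[total + i] = (lexatom_t)raw[i];
        total += got;
    }
    return total;
}
```

The lexer side only ever calls `lexatom_loader_load`; where the bytes come from and how they are encoded stays hidden behind the two loaders.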

Lexemes inside the buffer remain at the same position as long as none of the following events occurs:

  • Reload forward or backward.

  • Inclusion where the buffer is shared with an included stream.

  • Destruction.

As mentioned in section sec-event-handlers, the handler on_buffer_before_change may be defined to react to these events.
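The practical consequence is that a pointer to a lexeme inside the buffer is only valid until one of these events occurs. The C sketch below illustrates the idea of saving a lexeme just before the content changes. It does not reproduce the real handler's signature: `before_change_fn`, `NotifyingBuffer`, `buffer_replace_content`, `LexemeRef`, and `save_lexeme` are hypothetical names, and `buffer_replace_content` merely stands in for a reload, inclusion, or destruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: called before buffer content moves or vanishes. */
typedef void (*before_change_fn)(const char* begin, const char* end, void* user);

typedef struct {
    char             cell[16];
    size_t           content_n;
    before_change_fn on_before_change;
    void*            user;
} NotifyingBuffer;

/* Replace the buffer content. The callback runs first, while the old
 * content is still intact.                                           */
void buffer_replace_content(NotifyingBuffer* b, const char* text)
{
    size_t n = strlen(text);
    assert(n <= sizeof b->cell);
    if (b->on_before_change != NULL)
        b->on_before_change(b->cell, b->cell + b->content_n, b->user);
    memcpy(b->cell, text, n);
    b->content_n = n;
}

/* A reference to a lexeme inside the buffer, plus room for a safe copy. */
typedef struct {
    const char* begin;      /* points into the buffer; NULL once saved */
    size_t      length;     /* must not exceed sizeof saved */
    char        saved[16];
} LexemeRef;

/* Handler: copy the lexeme out of the buffer before its cells change. */
void save_lexeme(const char* begin, const char* end, void* user)
{
    LexemeRef* ref = (LexemeRef*)user;
    if (ref->begin != NULL && ref->begin >= begin
        && ref->begin + ref->length <= end) {
        memcpy(ref->saved, ref->begin, ref->length);
        ref->begin = NULL;   /* the pointer into the buffer is stale from now on */
    }
}
```

Code that never keeps lexeme pointers across token boundaries does not need such a handler; it matters only when references into the buffer outlive the next reload.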

The relation between a buffer’s content and the input stream is given by the stream position of the first lexatom in the buffer, called first_lexatom_position here. Given the offset delta of a lexatom from the beginning of the buffer, its position in the input stream is determined by the equation below:

position = first_lexatom_position + delta
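As a small worked example, the equation can be written as a C function. The struct `PositionedBuffer` and the function `stream_position` are hypothetical names chosen for this sketch, and the sentinel cells are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONTENT_SIZE 8

typedef struct {
    uint32_t cell[CONTENT_SIZE];       /* content cells (sentinels omitted) */
    uint64_t first_lexatom_position;   /* stream position of cell[0] */
} PositionedBuffer;

/* position = first_lexatom_position + delta,
 * where delta is the offset of 'p' from the beginning of the buffer. */
uint64_t stream_position(const PositionedBuffer* b, const uint32_t* p)
{
    ptrdiff_t delta = p - b->cell;
    assert(delta >= 0 && delta < CONTENT_SIZE);
    return b->first_lexatom_position + (uint64_t)delta;
}
```

If the buffer currently starts at stream position 1000, the lexatom in cell 5 is at stream position 1005. After a reload, only first_lexatom_position changes; the equation itself stays the same.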

During stream navigation, this position may be translated between different notions of stream position. These translations happen behind the scenes, inside byte loading and lexatom loading.