Make It!
The most direct way to set up an input configuration is to pass a file name to the constructor of a lexer. In C, the constructor is called explicitly as a function, as shown below.
...
myLexer lexer;
myLexer_from_file_name(&lexer, "example.txt", /* Converter */NULL);
...
In C++, the same thing is achieved by passing the arguments to the constructor at the time of initialization.
...
myLexer lexer("example.txt", /* Converter */nullptr);
...
This sets up the lexer to read its input from the file example.txt in the
current working directory. Here, the lexer uses a byte loader based on the
Standard C or C++ I/O library functions. All related file handles and stream
objects are owned by the lexer, and their allocation and deallocation happen
implicitly. If the lexer is to run on converted input, a converter may be
provided. In C, when using the ICU library, this looks as follows:
#include <myLexer/lib/quex/converter/icu/Converter_ICU>
#include <myLexer/lib/quex/converter/icu/Converter_ICU.i>
...
const size_t LexatomSize_bit = sizeof(myLexer_lexatom_t)<<3;
myLexer lexer;
myLexer_from_file_name(&lexer, "example.txt",
quex_Converter_ICU_new(LexatomSize_bit, "UTF8", NULL));
The #include-s of Converter_ICU and Converter_ICU.i paste the declaration and
implementation of the ICU converter API into the current file. The setup above
reads content from the file example.txt, decodes it from UTF-8 to Unicode, and
stores the Unicode characters in the lexer’s buffer. In C++, only the
constructor call differs.
const size_t LexatomSize_bit = sizeof(myLexer_lexatom_t)<<3;
myLexer lexer("example.txt",
quex::Converter_ICU_new(LexatomSize_bit, "UTF8", nullptr));
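To make concrete what running on "converted input" means, the following is a minimal, self-contained sketch of decoding UTF-8 bytes into fixed-width Unicode code points, which is the kind of work a converter performs before content reaches the lexer's buffer. It is not the implementation used by Quex or ICU. The names lexatom_t and decode_utf8 are hypothetical, and the sketch makes simplifying assumptions: malformed sequences are replaced by U+FFFD, and overlong encodings or surrogate code points are not rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the lexer's buffer element type (32 bits wide).
using lexatom_t = std::uint32_t;

// Decode a UTF-8 byte sequence into Unicode code points.
// Assumption of this sketch: malformed input yields U+FFFD instead of an error.
std::vector<lexatom_t> decode_utf8(const std::string& bytes)
{
    const lexatom_t Replacement = 0xFFFD;
    std::vector<lexatom_t> out;
    std::size_t i = 0;

    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        lexatom_t   cp    = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if      (lead < 0x80)           { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(Replacement); ++i; continue; }  // invalid lead byte

        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= bytes.size()) { ok = false; break; }         // truncated input
            const unsigned char c = static_cast<unsigned char>(bytes[i]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }        // not a continuation
            cp = (cp << 6) | (c & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : Replacement);
    }
    return out;
}
```

With this sketch, the bytes 0x41 0xC3 0xA9 0xE2 0x82 0xAC decode to the three code points U+0041, U+00E9, and U+20AC. The element width in bits, sizeof(lexatom_t) << 3, is 32 here, which corresponds to the LexatomSize_bit value passed to the converter in the examples above.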
The functions with names ending in _new allocate and initialize a converter.
Further, they set a flag which allows the lexer to delete the converter when
it is no longer needed.
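The ownership hand-over described here can be illustrated with a small stand-alone sketch. None of the names below belong to Quex: Converter, Converter_new, Lexer, and the owned_by_lexer flag are hypothetical, and the instance counter exists only so that the effect can be observed.

```cpp
#include <cassert>

// Hypothetical converter type carrying an ownership flag.
struct Converter {
    bool       owned_by_lexer;  // if true, the lexer deletes the converter
    static int live_count;      // number of live instances, for observation only

    explicit Converter(bool owned) : owned_by_lexer(owned) { ++live_count; }
    ~Converter() { --live_count; }
};
int Converter::live_count = 0;

// Hypothetical '_new' function: allocates on the heap and marks the object
// as owned by whichever lexer receives it.
Converter* Converter_new() { return new Converter(true); }

// Hypothetical lexer that releases the converter only if the flag is set.
class Lexer {
public:
    explicit Lexer(Converter* converter) : converter_(converter) {}
    ~Lexer() {
        if (converter_ != nullptr && converter_->owned_by_lexer) delete converter_;
    }
    Lexer(const Lexer&)            = delete;  // avoid double deletion via copies
    Lexer& operator=(const Lexer&) = delete;

private:
    Converter* converter_;
};

// Returns the number of live converters after a lexer that received a
// converter from Converter_new() has gone out of scope.
int live_converters_after_lexer_scope() {
    {
        Lexer lexer(Converter_new());
        assert(Converter::live_count == 1);
    }
    return Converter::live_count;
}
```

In this sketch, a converter created with the flag cleared (for example, one that lives on the caller's stack) is left untouched when the lexer is destroyed, whereas one created through the _new function is deleted together with the lexer.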
When the standard I/O library is not suitable, an appropriate byte loader may be provided. The following sets up a byte loader for socket communication via POSIX file descriptors, shown here in C++:
int connected_fd = accept(listen_fd, (struct sockaddr*)NULL, NULL);
quex::ByteLoader* byte_loader = quex::ByteLoader_POSIX_new(connected_fd);
myLexer lexer(byte_loader, /* Converter */nullptr);
Similarly to the converters, the _new functions allocate and initialize a byte
loader and let the lexer take over ownership. Further, a byte loader is greedy
in the sense that it takes ownership of any resource passed to its
constructor, such as a file descriptor or an input stream. Passing a pointer
to a local variable would, therefore, be a particularly bad idea.
The lexer’s constructor that accepts a byte loader can also accept a converter.
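What it means for a byte loader to be "greedy" about its resource can be sketched as follows. This is an illustration under stated assumptions, not Quex's byte loader: the class PosixByteLoader and the helper fd_is_open are hypothetical, and only the POSIX calls read, close, and fcntl are real. The loader closes the file descriptor it was given once it is destroyed, so the caller must not use or close that descriptor afterwards.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical byte loader that takes ownership of a POSIX file descriptor.
class PosixByteLoader {
public:
    explicit PosixByteLoader(int fd) : fd_(fd) {}

    // The loader owns the descriptor, so it closes it on destruction.
    ~PosixByteLoader() { if (fd_ >= 0) ::close(fd_); }

    PosixByteLoader(const PosixByteLoader&)            = delete;
    PosixByteLoader& operator=(const PosixByteLoader&) = delete;

    // Reads up to 'n' bytes into 'buffer'. Returns the number of bytes read,
    // 0 at end of input, or -1 on error. Interrupted reads are retried.
    ssize_t load(void* buffer, std::size_t n) {
        ssize_t result;
        do {
            result = ::read(fd_, buffer, n);
        } while (result < 0 && errno == EINTR);
        return result;
    }

private:
    int fd_;
};

// Helper for observation only: true if 'fd' refers to an open descriptor.
bool fd_is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

A descriptor obtained from accept, as in the example above, or from pipe or open can be handed to such a loader. After the loader goes out of scope, fd_is_open reports the descriptor as closed, which is why the same descriptor should not also be managed elsewhere.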
Enough information is now provided to set up any configuration that relies on the provided byte loaders (Standard I/O, POSIX, OSAL, …) and converters (ICU, IConv). The two subsequent sections provide more detail for readers who want the comfort of knowing what happens behind the curtains, or who want to provide customized solutions. Quex’s design facilitates the implementation of such customized instances, and the knowledge required to do so is given there.