Quex supports character encodings by means of converters that are plugged into buffer fillers. The converters currently supported directly by Quex are IBM’s ICU library and the GNU IConv library. The latter is present by default on most Unix machines, the former on Windows systems and the like. The lexical analyzer engine is not aware of the conversion. It commands the buffer filler to fill the buffer memory and then iterates over the content; it does not know what happens in the background. The buffer filler, however, is internally adapted to the library on which it relies. If you want to use character set conversion, for example to parse a file encoded in ISO-8859-3, you need to have one of the supported libraries installed.
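The separation between engine, buffer filler, and converter can be illustrated with a minimal sketch. All names in it (Converter, Latin1Converter, BufferFiller, read_all) are hypothetical and chosen for illustration only; they are not Quex's actual classes, and the sketch uses a hand-written ISO-8859-1 decoder instead of ICU or IConv so that it stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical converter interface (NOT the Quex API): turns raw bytes
// of some encoding into Unicode code points.
struct Converter {
    virtual ~Converter() = default;
    virtual void convert(const std::string& raw,
                         std::vector<char32_t>& out) const = 0;
};

// ISO-8859-1 is used here because each byte value equals its code point.
// A real converter would delegate to a library such as ICU or IConv.
struct Latin1Converter : Converter {
    void convert(const std::string& raw,
                 std::vector<char32_t>& out) const override {
        for (unsigned char byte : raw) {
            out.push_back(static_cast<char32_t>(byte));
        }
    }
};

// Hypothetical buffer filler: owns the raw input and hides the conversion.
// Chunking by bytes is only safe for single-byte encodings; a multi-byte
// encoding would need to carry incomplete sequences over to the next fill.
class BufferFiller {
public:
    BufferFiller(std::string raw_input, const Converter& converter)
        : raw_(std::move(raw_input)), converter_(converter) {}

    // Replaces the buffer content with the next converted chunk and
    // returns the number of code points loaded (0 at end of input).
    std::size_t fill(std::vector<char32_t>& buffer, std::size_t max_bytes) {
        assert(max_bytes > 0);
        buffer.clear();
        const std::string chunk = raw_.substr(position_, max_bytes);
        position_ += chunk.size();
        converter_.convert(chunk, buffer);
        return buffer.size();
    }

private:
    std::string raw_;
    const Converter& converter_;
    std::size_t position_ = 0;
};

// Stand-in for the analyzer engine: it asks the filler for content and
// iterates over code points without knowing which encoding was used.
std::vector<char32_t> read_all(BufferFiller& filler) {
    std::vector<char32_t> buffer;
    std::vector<char32_t> result;
    while (filler.fill(buffer, 4) > 0) {
        result.insert(result.end(), buffer.begin(), buffer.end());
    }
    return result;
}
```

In this sketch, supporting another encoding only means passing a different Converter to the BufferFiller; read_all stays unchanged. That mirrors the isolation described above, although the real buffer filler is considerably more involved.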
Quex can set up the buffer filler for the converter of your choice by means of command line arguments. Those are
If you want to use IBM’s ICU library for conversion.
If you want to use GNU IConv for conversion.
The decision of which one to choose should not depend on a judgement about the clarity of the APIs of those libraries, because the user is completely isolated from those details. Questions of concern might be
In order to use the converter, you need to pass the name of the encoding to the constructor of the lexical analyzer as shown below. For details, see the class definition of the generated lexical analyzer class.
quex::tiny_lexer qlex("example.txt", "UTF-8");