[Ocaml-pxp-users] lex, ulex, wlex, UTF-8

Alain Frisch Alain.Frisch at inria.fr
Sat Dec 17 09:52:50 PST 2005


Richard Jones wrote:
> Can someone tell me what lex, ulex and wlex are?  What is the
> difference between them?  Which one should I be using?

lex, wlex and ulex are lexers generated by ocamllex, wlex and ulex,
respectively.

The automata of wlex and ulex lexers work on a stream of Unicode code
points (not bytes, as with ocamllex). As a consequence, the transition
tables are much smaller and independent of the Unicode encoding (the
extraction of code points from the stream of bytes is done by another
layer). Executables are thus more compact but somewhat slower.
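
To illustrate the layering, here is a minimal sketch of such a decoding
layer in plain OCaml. It is not code from PXP, wlex or ulex: the names
decode_utf8 and decode_latin1 are hypothetical, and validation is
deliberately minimal (overlong forms and surrogates are not rejected).
The point is only that two different byte encodings of the same
character yield the same code point, so an automaton fed with code
points never needs to know the encoding.

```ocaml
(* Hypothetical helpers for illustration; not part of PXP, wlex or ulex. *)

(* Decode a UTF-8 string into a list of Unicode code points.
   Assumption: minimal validation only. *)
let decode_utf8 (s : string) : int list =
  let n = String.length s in
  (* Return the low 6 bits of the continuation byte at index i. *)
  let cont i =
    if i >= n then failwith "truncated UTF-8 sequence";
    let b = Char.code s.[i] in
    if (b land 0xC0) <> 0x80 then failwith "invalid continuation byte";
    b land 0x3F
  in
  let rec loop i acc =
    if i >= n then List.rev acc
    else
      let b = Char.code s.[i] in
      if b < 0x80 then
        (* 1-byte sequence: ASCII *)
        loop (i + 1) (b :: acc)
      else if (b land 0xE0) = 0xC0 then
        (* 2-byte sequence *)
        let cp = ((b land 0x1F) lsl 6) lor (cont (i + 1)) in
        loop (i + 2) (cp :: acc)
      else if (b land 0xF0) = 0xE0 then
        (* 3-byte sequence *)
        let cp =
          ((b land 0x0F) lsl 12)
          lor ((cont (i + 1)) lsl 6)
          lor (cont (i + 2))
        in
        loop (i + 3) (cp :: acc)
      else if (b land 0xF8) = 0xF0 then
        (* 4-byte sequence *)
        let cp =
          ((b land 0x07) lsl 18)
          lor ((cont (i + 1)) lsl 12)
          lor ((cont (i + 2)) lsl 6)
          lor (cont (i + 3))
        in
        loop (i + 4) (cp :: acc)
      else failwith "invalid UTF-8 lead byte"
  in
  loop 0 []

(* Decode an ISO-8859-1 string: each byte is its own code point. *)
let decode_latin1 (s : string) : int list =
  List.init (String.length s) (fun i -> Char.code s.[i])

let () =
  (* U+00E9 is two bytes in UTF-8 and one byte in ISO-8859-1;
     both decoders hand the same code point to the lexer. *)
  let from_utf8 = decode_utf8 "\xC3\xA9" in
  let from_latin1 = decode_latin1 "\xE9" in
  Printf.printf "same code points: %b\n" (from_utf8 = from_latin1)
```

A byte-level lexer, by contrast, needs separate transitions for each
byte sequence of each supported encoding, which is where the larger
tables come from.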

wlex (the tool, not the wlex-specification in PXP) is no longer
maintained, so I'd discourage using it. ulex is maintained. I think it
is a good option if performance is not critical (otherwise, consider
using e.g. ocaml-expat instead of PXP).

Gerd will be able to tell more.


-- Alain


