[Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...

Yamagata Yoriyuki yoriyuki at mbg.ocn.ne.jp
Thu Jun 3 04:36:41 PDT 2004


From: Richard Jones <rich at annexia.org>
Subject: [Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...
Date: Thu, 27 May 2004 15:59:44 +0100

> On Thu, May 27, 2004 at 03:53:41PM +0100, Richard Jones wrote:
> > > > ** ExtChar (or perhaps better in UChar):
> > > >
> > > > is_space, is_alnum, is_digit, is_xdigit, etc.  It's inexplicable why
> > > > these were left out of the standard OCaml library.
> > > 
> > > you're welcome to send a full featured ExtChar module.
> > 
> > OK, will look at this.  Do you think it should be ExtChar or UChar
> > though?  Since so much of the code I now write uses UTF-8 exclusively
> > I'm loath to contribute any more 8-bit-char-specific code to the
> > world ...
> 
> Actually I can answer my own question here.  We could define the
> ExtChar.is_* functions to only work correctly on 7-bit ASCII.  They
> would return false on any character codes >= 128.  This way they
> should do the Right Thing when presented with UTF-8 strings too.
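
To make the quoted proposal concrete, here is a minimal sketch of
ASCII-only predicates.  The function names follow the proposal; the
bodies and the exists_byte helper are my own assumptions, not code from
any released ExtLib module.  Every byte of a multi-byte UTF-8 sequence
is >= 128, so each predicate returns false on it.

```ocaml
(* Hypothetical ASCII-only predicates; not taken from ExtLib. *)
let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_alnum c = is_alpha c || is_digit c
let is_xdigit c =
  is_digit c || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')

(* The six whitespace characters of the C locale.  OCaml's '\ddd'
   escapes are decimal: 11 is vertical tab, 12 is form feed. *)
let is_space c =
  match c with
  | ' ' | '\t' | '\n' | '\011' | '\012' | '\r' -> true
  | _ -> false

(* Hypothetical helper: true if any byte of [s] satisfies [p]. *)
let exists_byte p s =
  let rec go i = i < String.length s && (p s.[i] || go (i + 1)) in
  go 0

let () =
  (* "\xc3\xa9" is the UTF-8 encoding of U+00E9; both bytes are >= 128. *)
  let e_acute = "\xc3\xa9" in
  assert (not (exists_byte is_alnum e_acute));
  assert (not (exists_byte is_space e_acute));
  assert (is_xdigit 'F' && not (is_xdigit 'g'))
```

Note that under this scheme a non-ASCII letter such as U+00E9 is never
classified as alphabetic; the predicates are safe on UTF-8 input, but
they only say anything useful about the ASCII subset.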

I think the general consensus among I18N experts is that the ISO C
character classes are not enough.  The Unicode standard defines more
elaborate character properties
(http://camomile.sourceforge.net/dochtml/UCharInfo.html).  You can
still define the ISO C character classes in terms of these properties,
though (glibc actually does this).
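
As a rough sketch of what defining character classes from Unicode
properties could look like, the code below derives a few predicates
from a general category.  The category type, the tiny lookup table and
the mapping to classes are all assumptions made for illustration; this
is neither the UCharInfo API nor the mapping glibc uses.

```ocaml
(* A hand-written subset of the Unicode general categories. *)
type category = Lu | Ll | Zs | Other

(* Hypothetical stand-in for a real property lookup.  It covers only
   a handful of code points, chosen for illustration. *)
let general_category (cp : int) : category =
  if cp >= 0x41 && cp <= 0x5A then Lu           (* A-Z *)
  else if cp >= 0x61 && cp <= 0x7A then Ll      (* a-z *)
  else if cp = 0xE9 then Ll                     (* e with acute accent *)
  else if cp = 0x20 || cp = 0x3000 then Zs      (* space, ideographic space *)
  else Other

(* Character classes expressed in terms of the category. *)
let is_upper cp = general_category cp = Lu
let is_lower cp = general_category cp = Ll
let is_alpha cp = is_upper cp || is_lower cp

(* Space separators plus the ASCII control whitespace U+0009..U+000D. *)
let is_space cp = general_category cp = Zs || (cp >= 0x09 && cp <= 0x0D)

let () =
  assert (is_alpha 0xE9);         (* a non-ASCII letter is alphabetic *)
  assert (is_space 0x3000);       (* a non-ASCII space is whitespace *)
  assert (not (is_alpha 0x30))    (* the digit 0 is not alphabetic *)
```

A real implementation would replace general_category with a lookup
into the full Unicode character database.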

--
Yamagata Yoriyuki



