[ocaml-i18n] Native UTF-8 strings

Yamagata Yoriyuki yoriyuki at mbg.ocn.ne.jp
Tue Nov 25 18:18:18 PST 2003


From: Jun.Furuse at inria.fr
Subject: Re: [ocaml-i18n] Native UTF-8 strings
Date: Mon, 24 Nov 2003 16:05:29 +0100

> 
>   * string manipulation functions for each encoding like UTF-8, 
>     and how to cleanly provide them to users. (Basically I agree with
>     Yoriyuki's idea of using the open directive.) 
>   
>   * how to write i18n string/char constants in Caml programs, and
>     to print them on screen.
>   
>   * and how to provide non-european identifiers and module names
> 
> Are there anything missing? (or is my classement totally nonsense?)

We also need a way to interact with the environment: converting input
strings to Unicode (as pointed out by Rich), getting the locale, etc.
This seems to be one of the hardest parts of all, since 1) there is no
portable way to do it, and 2) there is a tension between tight
integration with the environment and portability across different
OSes.  Even the interpretation of a specific encoding can differ
across platforms (as in the case of SJIS).  On a desktop, a user would
expect an OCaml program to interpret SJIS codes the same way other,
non-OCaml programs do.  For a server program, one would expect SJIS to
be decoded in a platform-independent way.  Locale names, collation,
etc. have similar (and worse) problems.
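
To make the SJIS point concrete, here is a small sketch.  It is only
an illustration: the function names are made up, and it looks at the
single-byte range 0x00-0x7F only.  It relies on the well-known
difference that JIS X 0201 assigns 0x5C to the yen sign and 0x7E to
the overline, while Microsoft's code page 932 keeps the ASCII
backslash and tilde there.

```ocaml
(* Hypothetical decoders for the single-byte (0x00-0x7F) range of SJIS.
   Both names are invented for this sketch. *)

(* JIS X 0201 Roman reading: 0x5C and 0x7E differ from ASCII. *)
let decode_jis_roman b =
  match b with
  | 0x5C -> 0x00A5 (* YEN SIGN *)
  | 0x7E -> 0x203E (* OVERLINE *)
  | b -> b

(* CP932 reading: the whole range is plain ASCII. *)
let decode_cp932 b = b

(* Bytes on which the two readings disagree. *)
let differing =
  List.filter
    (fun b -> decode_jis_roman b <> decode_cp932 b)
    (List.init 0x80 (fun i -> i))
```

So the same byte in a file path or a price tag can come out as a
backslash on one platform and a yen sign on another, and a library has
to decide which reading it follows.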

But maybe we can factor out the OS-dependent part with a functor, which
would let developers supply their favorite i18n functions to the
system and override the default via the open directive.
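
A rough sketch of what I have in mind follows.  Everything in it is an
assumption for the sake of illustration: the signature ENV, the
functor Make and the two environments are invented names, not part of
any existing library, and the UTF-8 decoder does no validation.

```ocaml
(* Hypothetical signature for the OS-dependent part. *)
module type ENV = sig
  val locale : unit -> string
  (* Convert an input string to a list of Unicode code points. *)
  val to_unicode : string -> int list
end

(* The portable part, written once against ENV. *)
module Make (E : ENV) = struct
  let locale = E.locale
  (* Length in code points, as seen by the chosen environment. *)
  let length s = List.length (E.to_unicode s)
end

(* A platform-independent default: every byte is one Latin-1 character. *)
module Latin1_env : ENV = struct
  let locale () = "C"
  let to_unicode s =
    List.init (String.length s) (fun i -> Char.code s.[i])
end

(* A developer-supplied replacement: minimal UTF-8 decoding, no validation. *)
module Utf8_env : ENV = struct
  let locale () = "en_US.UTF-8"
  let to_unicode s =
    let n = String.length s in
    let rec go i acc =
      if i >= n then List.rev acc
      else
        let b = Char.code s.[i] in
        (* Number of continuation bytes and payload bits of the lead byte. *)
        let extra, init =
          if b < 0x80 then (0, b)
          else if b < 0xE0 then (1, b land 0x1F)
          else if b < 0xF0 then (2, b land 0x0F)
          else (3, b land 0x07)
        in
        let rec cont k cp =
          if k > extra || i + k >= n then cp
          else cont (k + 1) ((cp lsl 6) lor (Char.code s.[i + k] land 0x3F))
        in
        go (i + extra + 1) (cont 1 init :: acc)
    in
    go 0 []
end

module Default = Make (Latin1_env)
module Custom = Make (Utf8_env)

(* Opening Custom after Default shadows the default functions. *)
open Default
let n_default = length "\xC3\xA9" (* 2: two Latin-1 bytes *)
open Custom
let n_custom = length "\xC3\xA9"  (* 1: one code point, U+00E9 *)
```

User code that only says "length" does not change; which
implementation it gets depends on which module was opened last.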

--
Yamagata Yoriyuki


