From yoriyuki at mbg.ocn.ne.jp Thu Jun 3 04:36:41 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Thu, 03 Jun 2004 20:36:41 +0900 (JST)
Subject: [Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...
In-Reply-To: <20040527145944.GA17156@redhat.com>
References: <0c2c01c443f7$0cbdb980$ef01a8c0@warp> <20040527145341.GH9313@redhat.com> <20040527145944.GA17156@redhat.com>
Message-ID: <20040603.203641.115905018.yoriyuki@mbg.ocn.ne.jp>

From: Richard Jones
Subject: [Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...
Date: Thu, 27 May 2004 15:59:44 +0100

> On Thu, May 27, 2004 at 03:53:41PM +0100, Richard Jones wrote:
> > > > ** ExtChar (or perhaps better in UChar):
> > > >
> > > > is_space, is_alnum, is_digit, is_xdigit, etc. It's inexplicable why
> > > > these were left out of the standard OCaml library.
> > >
> > > you're welcome to send a full featured ExtChar module.
> >
> > OK, will look at this. Do you think it should be ExtChar or UChar
> > though? Since so much of the code I now write uses UTF-8 exclusively
> > I'm loath to contribute any more 8-bit-char-specific code to the
> > world ...
>
> Actually I can answer my own question here. We could define the
> ExtChar.is_* functions to only work correctly on 7-bit ASCII. They
> would return false on any character codes >= 128. This way they
> should do the Right Thing when presented with UTF-8 strings too.

I think the general consensus (of I18N experts) is that ISO-C char.
classes are not enough. The Unicode standard defines elaborate character
properties (http://camomile.sourceforge.net/dochtml/UCharInfo.html).
You can define ISO-C char. classes from these properties, though.
(glibc actually does this).
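For reference, the ASCII-only proposal quoted above can be sketched in OCaml as follows. This is only a minimal illustration, assuming predicates that mirror the ISO C character classes; the function names follow the is_* names mentioned in the thread but are hypothetical and are not the actual API of ExtLib or Camomile. Since every byte of a multi-byte UTF-8 sequence is >= 128, each predicate returns false on such bytes.

```ocaml
(* Hypothetical ASCII-only predicates; the names mirror the is_* functions
   proposed in the thread and are not taken from any real library. *)

(* Decimal digits '0'..'9'. *)
let is_digit c = c >= '0' && c <= '9'

(* Hexadecimal digits, both cases. *)
let is_xdigit c =
  is_digit c || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')

(* ASCII letters only; accented Latin-1 letters are deliberately excluded. *)
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

let is_alnum c = is_alpha c || is_digit c

(* Space plus the control characters \t \n \v \f \r (codes 9 to 13),
   matching isspace in the C locale. *)
let is_space c = c = ' ' || (c >= '\t' && c <= '\r')
```

Applied byte by byte to a UTF-8 string, these predicates classify the ASCII characters and report false for every byte of a non-ASCII character, which is the behaviour the quoted proposal asks for. They do not provide the Unicode character properties referred to in the reply.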
-- 
Yamagata Yoriyuki

From yoriyuki at mbg.ocn.ne.jp Thu Jun 10 13:55:40 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Fri, 11 Jun 2004 05:55:40 +0900 (JST)
Subject: [Ocaml-i18n] TimeZone in OCamlI18N
Message-ID: <20040611.055540.68164695.yoriyuki@mbg.ocn.ne.jp>

OCamlI18N has a TimeZone type, but no concrete TimeZone is provided.
Is there a plan to add them? I have looked at tzdata, but it defines
rather complex rules, and I am not sure OCamlI18N can work with them.

If there is no plan, I will use Shawn Wagner's annextlib, though Unix
date functions are using (in my understanding) environmental variables
for TZ and hence need some care for a threaded application.

What I want is parsing Last-Modified header of HTTP, which can contain
3-letter TZ name, for my personal tool.

By the way, what do you think about adding an absolute time type
(independent from calendars and timezone), like Unix time_t. Since
OCaml has bignum, we do not have the year 2038 problem.

-- 
Yamagata Yoriyuki

From mattam at mattam.org Fri Jun 11 08:48:01 2004
From: mattam at mattam.org (Matthieu Sozeau)
Date: Fri, 11 Jun 2004 17:48:01 +0200
Subject: [Ocaml-i18n] Re: TimeZone in OCamlI18N
In-Reply-To: <20040611.055540.68164695.yoriyuki@mbg.ocn.ne.jp>
References: <20040611.055540.68164695.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <200406111748.11431.mattam@mattam.org>

On Thursday 10 June 2004 22:55, Yamagata Yoriyuki wrote:
> OCamlI18N has a TimeZone type, but no concrete TimeZone is provided.
> Is there a plan to add them? I have looked at tzdata, but it defines
> rather complex rules, and I am not sure OCamlI18N can work with them.

There is the default timezone (UTC) which you can access with
TimeZone.get (). Others should be added by parsing some timezone
definitions. AFAICS, LDML suggests using Olson naming and tzdata, so I
suppose it is the way to go.

> If there is no plan,

Nope, no plan

> I will use Shawn Wagner's annextlib, though Unix
> date functions are using (in my understanding) environmental variables
> for TZ and hence need some care for a threaded application.

As usual...

> What I want is parsing Last-Modified header of HTTP, which can contain
> 3-letter TZ name, for my personal tool.
>
> By the way, what do you think about adding an absolute time type
> (independent from calendars and timezone), like Unix time_t. Since
> OCaml has bignum, we do not have the year 2038 problem.

time_t is relative to the Epoch (00:00:00 UTC, January 1, 1970)... What would
you want to do with it that you can't do with Bigints ?

-- 
BOFH Excuse #173:
Recursive traversal of loopback mount points
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: signature
Url : /pipermail/ocaml-i18n/attachments/20040611/108d0549/attachment.pgp

From rich at annexia.org Fri Jun 11 11:48:48 2004
From: rich at annexia.org (Richard Jones)
Date: Fri, 11 Jun 2004 19:48:48 +0100
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp>
References: <20040416.012529.21602566.yoriyuki@mbg.ocn.ne.jp> <1082049719.20677.1262.camel@pelican> <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <20040611184848.GA10384@redhat.com>

On Fri, Apr 16, 2004 at 11:32:18PM +0900, Yamagata Yoriyuki wrote:
> As for OCaml on the whole, I think the best strategy would depend on
> application. Desktop applications would be required to be consistent
> with other C applications, so using OS functions (for example Glib
> one, or Win API) would be best. For a network application, on the
> other hand, platform-independence would be desirable. For such a
> case, Camomile like approach (everything implemented by OCaml) would
> be better.

Possibly dumb question:

In server applications which I've written, it's often important to
change the language while running. For example, on
http://www.postmaster.co.uk/ which is a multi-lingual UTF-8 mail
service, when a request arrives at the webserver and is handled by
mod_perl[1], we know from the cookie who the logged-in user is, and
what their language preference is. And we set the LANG variable from
this information. As a result, gettext results, collation, templates,
etc. all change.

Is this possible for Camomile?

Rich.

[1] mod_perl in that instance, but same will apply to mod_caml
services which I'm writing at the moment.

-- 
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
'There is a joke about American engineers and French engineers. The
American team brings a prototype to the French team. The French team's
response is: "Well, it works fine in practice; but how will it hold up
in theory?"'

From yoriyuki at mbg.ocn.ne.jp Fri Jun 11 16:10:55 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Sat, 12 Jun 2004 08:10:55 +0900 (JST)
Subject: [Ocaml-i18n] Re: TimeZone in OCamlI18N
In-Reply-To: <200406111748.11431.mattam@mattam.org>
References: <20040611.055540.68164695.yoriyuki@mbg.ocn.ne.jp> <200406111748.11431.mattam@mattam.org>
Message-ID: <20040612.081055.91310606.yoriyuki@mbg.ocn.ne.jp>

From: Matthieu Sozeau
Subject: [Ocaml-i18n] Re: TimeZone in OCamlI18N
Date: Fri, 11 Jun 2004 17:48:01 +0200

> > What I want is parsing Last-Modified header of HTTP, which can contain
> > 3-letter TZ name, for my personal tool.
> >
> > By the way, what do you think about adding an absolute time type
> > (independent from calendars and timezone), like Unix time_t. Since
> > OCaml has bignum, we do not have the year 2038 problem.
>
> time_t is relative to the Epoch (00:00:00 UTC, January 1, 1970)... What would
> you want to do with it that you can't do with Bigints ?

I think that my previous post was not clear about what I want. Let me
try again.

I want a type which represents time by, for example, nanoseconds from
the Epoch, and the ability to convert it from/to calendar types. Since
the math required for such a conversion is different in each calendar
(for example, Gregorian and Julian), the converter should belong to
each calendar.

I'm interested in your opinion.

-- 
Yamagata Yoriyuki

From yoriyuki at mbg.ocn.ne.jp Fri Jun 11 16:27:16 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Sat, 12 Jun 2004 08:27:16 +0900 (JST)
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040611184848.GA10384@redhat.com>
References: <1082049719.20677.1262.camel@pelican> <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp> <20040611184848.GA10384@redhat.com>
Message-ID: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp>

From: Richard Jones
Subject: Re: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
Date: Fri, 11 Jun 2004 19:48:48 +0100

> Possibly dumb question:
>
> In server applications which I've written, it's often important to
> change the language while running. For example, on
> http://www.postmaster.co.uk/ which is a multi-lingual UTF-8 mail
> service, when a request arrives at the webserver and is handled by
> mod_perl[1], we know from the cookie who the logged-in user is, and
> what their language preference is. And we set the LANG variable from
> this information. As a result, gettext results, collation, templates,
> etc. all change.

Functions in the UTF8Byte and UTF8Unicode modules (Unfortunately, these
modules are currently undocumented. Meanwhile, see section 2.3 in
README.txt http://camomile.sourceforge.net/README.txt) respect the LANG
variable. For other functions, you have to explicitly pass the locale
as an argument.

There is a tension between purely functional style and the style using
implicit states. I am not sure about which style is better. I am
interested to hear the other ocaml-i18n participants' opinion.
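The two styles mentioned above (a locale taken implicitly from LANG versus a locale passed explicitly) can be contrasted in a small OCaml sketch. All names below (locale, default_locale, greeting_for, greeting) are hypothetical and chosen for illustration; none of them comes from Camomile, and a locale is modelled as a plain string for brevity.

```ocaml
(* Hypothetical types and helpers; not Camomile's API. *)
type locale = string

(* Read the process-wide default from LANG, falling back to "C"
   when the variable is unset or empty. *)
let default_locale () : locale =
  match Sys.getenv_opt "LANG" with
  | Some l when l <> "" -> l
  | _ -> "C"

(* Explicit style: the caller always supplies the locale. *)
let greeting_for (loc : locale) : string =
  if String.length loc >= 2 && String.sub loc 0 2 = "fr" then "Bonjour"
  else "Hello"

(* Implicit style with an escape hatch: the environment is consulted
   only when no locale is given. *)
let greeting ?locale () : string =
  let loc =
    match locale with
    | Some l -> l
    | None -> default_locale ()
  in
  greeting_for loc
```

Here greeting () depends on hidden state (the environment), whereas greeting_for "fr_FR" and greeting ~locale:"fr_FR" () are determined by their arguments alone. The optional argument is one possible way to offer both styles from a single function.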

-- 
Yamagata Yoriyuki

From mattam at altern.org Fri Jun 11 17:16:11 2004
From: mattam at altern.org (Matthieu Sozeau)
Date: Sat, 12 Jun 2004 02:16:11 +0200
Subject: [Ocaml-i18n] Re: TimeZone in OCamlI18N
In-Reply-To: <20040612.081055.91310606.yoriyuki@mbg.ocn.ne.jp>
References: <20040611.055540.68164695.yoriyuki@mbg.ocn.ne.jp> <200406111748.11431.mattam@mattam.org> <20040612.081055.91310606.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <200406120216.18488.mattam@altern.org>

On Saturday 12 June 2004 01:10, Yamagata Yoriyuki wrote:
> From: Matthieu Sozeau
> Subject: [Ocaml-i18n] Re: TimeZone in OCamlI18N
> Date: Fri, 11 Jun 2004 17:48:01 +0200
>
> > > What I want is parsing Last-Modified header of HTTP, which can contain
> > > 3-letter TZ name, for my personal tool.
> > >
> > > By the way, what do you think about adding an absolute time type
> > > (independent from calendars and timezone), like Unix time_t. Since
> > > OCaml has bignum, we do not have the year 2038 problem.
> >
> > time_t is relative to the Epoch (00:00:00 UTC, January 1, 1970)... What
> > would you want to do with it that you can't do with Bigints ?
>
> I think that my previous post was not clear about what I want. Let me
> try again.
>
> I want a type which represents time by, for example, nanoseconds from
> the Epoch, and the ability to convert it from/to calendar types. Since
> the math required for such a conversion is different in each calendar
> (for example, Gregorian and Julian), the converter should belong to
> each calendar.

Ok, that would certainly be useful, and not too difficult to implement :)

-- 
"We have art to save ourselves from the truth." - Friedrich Nietzsche
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: signature
Url : /pipermail/ocaml-i18n/attachments/20040612/5d670dce/attachment.pgp

From skaller at users.sourceforge.net Fri Jun 11 19:16:38 2004
From: skaller at users.sourceforge.net (skaller)
Date: 12 Jun 2004 12:16:38 +1000
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp>
References: <1082049719.20677.1262.camel@pelican> <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp> <20040611184848.GA10384@redhat.com> <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <1087006597.16811.1255.camel@pelican.wigram>

On Sat, 2004-06-12 at 09:27, Yamagata Yoriyuki wrote:

> There is a tension between purely functional style and the style using
> implicit states. I am not sure about which style is better.

I am quite sure. Implicit state is VERY BAD especially for i18n.
You *will* need to provide hooks to obtain locale, language, etc
objects from current locale, environment variables, command line,
config files, and other common places.

But no function that actually does any I18n work should access
these directly. Always allow all the I18n information to be passed
explicitly.

It is quite possible for an application to need to simultaneously
process multiple languages, multiple locales, etc, often concurrently.

As mentioned in a note before, Australia's Motor Traffic Authority
uses a computerised theory test to licence car drivers in
a selectable language (obviously it's a server with multiple terminals).

My Interscript literate programming tool can be commanded:

  iscr --language=en --language=es --language=fr .. --weaver=latex --weaver=html sample.pak

which produces latex and html documents in English, Spanish,
and French simultaneously.

You should note the C++ Standard Library which has locale
objects which are used to 'imbue' various things that need them,
such as IOstreams.
[They default to the current C locale I believe]

You might argue that a default could be used. I could even argue
against that: we should not encourage people to ignore i18n.
That default argument may need to be 'undefaulted' later down
the track in development, and suddenly the calling function
cannot pass the information because *it* was not passed it.
So the whole of the program may have to be edited to pass the
object from the mainline, or, the programmer will cheat and
use a global variable .. which again makes their code dysfunctional.

-- 
John Skaller, mailto:skaller at users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

From rich at annexia.org Sat Jun 12 01:51:16 2004
From: rich at annexia.org (Richard Jones)
Date: Sat, 12 Jun 2004 09:51:16 +0100
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <1087006597.16811.1255.camel@pelican.wigram>
References: <1082049719.20677.1262.camel@pelican> <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp> <20040611184848.GA10384@redhat.com> <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp> <1087006597.16811.1255.camel@pelican.wigram>
Message-ID: <20040612085116.GD9548@redhat.com>

Well I don't care for the silly philosophical arguments about whether
implicit state is bad or not. What matters is that I can do something
to change the language before handling each request. Each Apache
process handles hundreds of requests (not simultaneously, however),
and so needs to change the language hundreds of times.

Not having to pass around a variable helps. eg. It's easier to write:

  let string = gettext "This string will be translated" ...

Note that GNU gettext doesn't get this completely right. After
setting LANG or LANGUAGE you need to do:

  #ifdef __GLIBC__
    {
      extern int _nl_msg_cat_cntr;
      _nl_msg_cat_cntr++;
    }
  #endif

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd.
http://www.merjis.com/ - improving website return on investment
C2LIB is a library of basic Perl/STL-like types for C. Vectors, hashes,
trees, string funcs, pool allocator: http://www.annexia.org/freeware/c2lib/

From ben at socialtools.net Sat Jun 12 02:29:55 2004
From: ben at socialtools.net (Benjamin Geer)
Date: Sat, 12 Jun 2004 10:29:55 +0100
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp>
References: <1082049719.20677.1262.camel@pelican> <20040416.233217.48813959.yoriyuki@mbg.ocn.ne.jp> <20040611184848.GA10384@redhat.com> <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <40CACD13.9030407@socialtools.net>

Yamagata Yoriyuki wrote:
> There is a tension between purely functional style and the style using
> implicit states. I am not sure about which style is better.

I think having a default locale based on the environment isn't bad. It
keeps things simple for beginners who aren't ready to think about
locales yet. Being able to change the default at run time wouldn't be
bad either, and might handle Richard's wish not to have to specify
locales lots of times in his program.

However, I think it's important to be able to use several different
locales simultaneously. Sometimes the right locale depends not on who
the user is, but on what they're doing. I often have to work with
multilingual data. At the moment I'm working on a financial application
(not in Caml) in which we're supposed to allow users to specify that
they want dates displayed in a certain locale, and currencies displayed
in a different locale. A multi-threaded web application would also need
to handle different locales at the same time.

An approach that would seem appealing to me would be if you could
create a locale object for any locale you want, and from that object,
get other reusable objects that do the real work (formatting dates,
translating strings, etc.).
That might also be OK for Richard: when the process starts to handle an
HTTP request, it could get the appropriate locale object, then from
that get date formatters and so on, and use those over and over without
ever having to specify the locale again. If it's not multi-threaded, it
could even keep them in global variables (although I know John Skaller
wouldn't approve :) .

Ben

From yoriyuki at mbg.ocn.ne.jp Sat Jun 12 07:55:21 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Sat, 12 Jun 2004 23:55:21 +0900 (JST)
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040612085116.GD9548@redhat.com>
References: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp> <1087006597.16811.1255.camel@pelican.wigram> <20040612085116.GD9548@redhat.com>
Message-ID: <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp>

From: Richard Jones
Subject: Re: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
Date: Sat, 12 Jun 2004 09:51:16 +0100

> Well I don't care for the silly philosophical arguments about whether
> implicit state is bad or not. What matters is that I can do something
> to change the language before handling each request. Each Apache
> process handles hundreds of requests (not simultaneously, however),
> and so needs to change the language hundreds of times.

For functions in UTF8Byte and UTF8Unicode, you can change their locale
by either

1) setting LC_* environmental variables by Locale.setlocale
2) passing locale as an optional argument, which overwrites locale
   obtained from LC_* variables.

For other functions, you have to explicitly pass the locale. However,
you could use partial application, e.g.

  let my_col = compare_col locale in
  ...
  my_col s1 s2
  ...

Most functionality of Camomile is accessible from UTF8Byte and
UTF8Unicode.
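The partial-application idea above can be made concrete with a self-contained sketch. Note that compare_col here is a hypothetical stand-in written for this example, not Camomile's collation function: it merely compares case-insensitively for every locale other than "C", which is assumed behaviour chosen so that the effect of the locale is visible.

```ocaml
(* Hypothetical stand-in for a locale-aware comparison; NOT Camomile's API.
   Locale "C" uses plain byte order, any other locale ignores ASCII case. *)
let compare_col (locale : string) (s1 : string) (s2 : string) : int =
  if locale = "C" then String.compare s1 s2
  else String.compare (String.lowercase_ascii s1) (String.lowercase_ascii s2)

(* Fix the locale once; my_col is then an ordinary string comparison
   that can be passed around without mentioning the locale again. *)
let my_col = compare_col "en_GB"

(* The partially applied function plugs straight into List.sort. *)
let sorted = List.sort my_col ["cherry"; "Banana"; "apple"]
```

With the locale fixed to "en_GB", sorted is ["apple"; "Banana"; "cherry"], whereas sorting the same list with compare_col "C" puts "Banana" first because uppercase letters precede lowercase ones in byte order. A request handler could build such closures once per request and reuse them, which is close to the locale-object approach suggested earlier in the thread.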

-- 
Yamagata Yoriyuki

From ben at socialtools.net Sat Jun 12 08:40:02 2004
From: ben at socialtools.net (Benjamin Geer)
Date: Sat, 12 Jun 2004 16:40:02 +0100
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp>
References: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp> <1087006597.16811.1255.camel@pelican.wigram> <20040612085116.GD9548@redhat.com> <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <40CB23D2.5030805@socialtools.net>

Yamagata Yoriyuki wrote:
> For other functions, you have to explicitly pass the locale. However,
> you could use partial application, e.g.

I think partial application is probably just as good as what I was
suggesting doing with objects.

Ben

From skaller at users.sourceforge.net Sat Jun 12 10:35:04 2004
From: skaller at users.sourceforge.net (skaller)
Date: 13 Jun 2004 03:35:04 +1000
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp>
References: <20040612.082716.124547080.yoriyuki@mbg.ocn.ne.jp> <1087006597.16811.1255.camel@pelican.wigram> <20040612085116.GD9548@redhat.com> <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <1087061704.16811.1295.camel@pelican.wigram>

On Sun, 2004-06-13 at 00:55, Yamagata Yoriyuki wrote:

> 1) setting LC_* environmental variables by Locale.setlocale
> 2) passing locale as an optional argument, which overwrites locale
> obtained from LC_* variables.

overwrites or overrides?

-- 
John Skaller, mailto:skaller at users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

From yoriyuki at mbg.ocn.ne.jp Sat Jun 12 16:54:48 2004
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Sun, 13 Jun 2004 08:54:48 +0900 (JST)
Subject: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
In-Reply-To: <1087061704.16811.1295.camel@pelican.wigram>
References: <20040612085116.GD9548@redhat.com> <20040612.235521.91447046.yoriyuki@mbg.ocn.ne.jp> <1087061704.16811.1295.camel@pelican.wigram>
Message-ID: <20040613.085448.35500655.yoriyuki@mbg.ocn.ne.jp>

From: skaller
Subject: Re: [Ocaml-i18n] Re: [Caml-list] Camomile-0.5.0
Date: 13 Jun 2004 03:35:04 +1000

> On Sun, 2004-06-13 at 00:55, Yamagata Yoriyuki wrote:
> > 1) setting LC_* environmental variables by Locale.setlocale
> > 2) passing locale as an optional argument, which overwrites locale
> > obtained from LC_* variables.
>
> overwrites or overrides?

overrides

-- 
Yamagata Yoriyuki