From ben at socialtools.net Tue Dec 2 10:53:03 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Tue, 02 Dec 2003 18:53:03 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
Message-ID: <3FCCDF8F.4060404@socialtools.net>

I should introduce myself briefly:

I'm a programmer with a background mostly in Java, and I started programming in Caml this year. I have an M.A. in linguistics; I speak English and French, and I'm learning Italian and Arabic.

I've been thinking about writing a message catalogue system for Caml. My motivation is that I need it for a web application, but I'm keen to have it be suitable for Caml programs in general. I emailed some ideas to Matthieu Sozeau (author of OCamlI18n); he suggested that we continue the discussion on this list.

He pointed me to an article about Perl's Maketext:

http://www.icewalkers.com/Perl/5.8.0/lib/Locale/Maketext/TPJ13.html

I think it makes a lot of good points. I think its main insight is that a translation in a message catalogue can be thought of as a function that returns a string in a particular language, often including some data that was passed to it.

Since it's a function, the next question is: what language should it be written in? Locale::Maketext provides two languages: a simple 'bracket notation', and Perl itself.

In Java, java.text.MessageFormat provides a bracket notation, with no ability to fall back to anything more powerful. This looks to me like a serious limitation. For example, its approach to plurals (in java.text.ChoiceFormat) only allows you to specify different forms for absolute ranges of quantities; this doesn't seem to be able to handle Slavic-style plurals (see the Polish example in the GNU gettext manual: http://www.gnu.org/software/gettext/manual/html_chapter/gettext_10.html#SEC150).
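
To make the "translation as a function" idea concrete, here is a rough OCaml sketch of a Polish "n files" message. The names polish_form and polish_files are made up for illustration, and the rule is the Polish plural rule as I understand it from the gettext manual; treat it as an assumption, not as a reference implementation. It assumes n is non-negative.

```ocaml
(* Illustrative only: selects one of three Polish plural forms.
   0 = singular, 1 = "few" form, 2 = "many" form. *)
let polish_form n =
  if n = 1 then 0
  else if n mod 10 >= 2 && n mod 10 <= 4
          && (n mod 100 < 10 || n mod 100 >= 20) then 1
  else 2

(* The "translation" itself: a function from data to a string. *)
let polish_files n =
  let word =
    match polish_form n with
    | 0 -> "plik"
    | 1 -> "pliki"
    | _ -> "plików"
  in
  Printf.sprintf "%d %s" n word
```

A range-based scheme cannot express this, because 22 takes the same form as 2 while 12 does not.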

It seems to me that bracket notations are appealing because they allow you to express the translation function as an *exemplar*, which is simple and intuitive:

    Your search returned {0} files in {1} directories.

However, the bracket notations provided by java.text.MessageFormat and Locale::Maketext are not powerful enough to handle the more complex logic that is necessary in order to generate plurals in some natural languages. The solution to this problem in Maketext is to fall back to using a different programming language entirely, in this case Perl. But then the translator needs to learn two syntaxes; one of these is too simple for the task at hand, and the other one is too complex.

Wouldn't it be nice if the translator could learn just one syntax, which allowed him to express the translation as an exemplar, and which was also powerful enough to handle the logic for plurals?

It seems to me that what's needed here is a template language. I've written one, called CamlTemplate (http://saucecode.org/camltemplate). For simple exemplars, it's as easy to use as bracket notation, e.g.:

    Cannot open file ${a}.

Getting back to the issue of plurals, suppose your message just has to say "n files", where n is a number. The English template could be:

    #if (a == 1)
    1 file
    #else
    ${a} files
    #end

The Polish one could be:

    #if (a == 1)
    1 plik
    #elseif (a >= 5 && a <= 21)
    ${a} plików
    #elseif (a % 10 >= 2 && a % 10 <= 4)
    ${a} pliki
    #else
    ${a} plików
    #end

The article on Maketext suggests writing Perl functions to generate plurals, and calling these functions from the translations. This could be done in CamlTemplate as well, e.g. for English:

    #macro quant(num, word)
    ${num}
    #if (num == 1)
    ${word}
    #else
    ${word}s
    #end
    #end

The English "n files" template would then become:

    #quant(a, "file")

Of course you could expand this to handle the common irregular forms.
Since a CamlTemplate template can call Caml functions, simple string-matching functions could be provided to do things like this:

    #macro quant(num, word)
    ${num}
    #if (num == 1)
    ${word}
    #elseif (endsWith(word, "y"))
    ${stripSuffix(word, "y")}ies
    #else
    ${word}s
    #end
    #end

The next question is: how do you access a translation from a program? What do you use for a message key? Gettext uses the message itself in some natural language (the one the programmer used); it reads messages directly from program source code. I think this has several drawbacks:

1. If the same message is used several times in the program, when it changes, it must be changed in several places.

2. Representing the message as an exemplar might be complex in itself, as in the examples above, thus complicating the program.

3. The programmer might not be the person who writes the messages; having them in the source code is therefore an inconvenience, particularly if they need to be written before the programmer starts coding.

The alternative is to use some arbitrary string as a key; this is the approach taken in Java, where java.util.ResourceBundle looks up by key the patterns that java.text.MessageFormat then formats. I think it's a more maintainable approach, because messages can be changed without changing program source code.

Another question is: how do we store message catalogues? The problem of character encoding comes up right away. I suggest that we store them in XML files, because XML has built-in support for dealing with encodings. So I propose something like this:

    #macro quant(num, word)
    ${num}
    #if (num == 1)
    ${word}
    #elseif (endsWith(word, "y"))
    ${stripSuffix(word, "y")}ies
    #else
    ${word}s
    #end
    #end

    Disk ${a} is full.

    There are #quant(a, "file") in #quant(a1, "directory").

This could just be the default way of storing them; there could also be an interface allowing catalogues to be loaded from any other source.
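
For what it's worth, the string-matching helpers that the #quant macro above relies on are easy to write on the Caml side. The following is only a sketch: ends_with, strip_suffix and quant are hypothetical names chosen to mirror endsWith and stripSuffix in the macro, not functions that CamlTemplate is known to provide.

```ocaml
(* True if [s] ends with [suffix]. *)
let ends_with ~suffix s =
  let n = String.length s and m = String.length suffix in
  n >= m && String.sub s (n - m) m = suffix

(* Removes [suffix] from the end of [s] if present; otherwise returns [s]. *)
let strip_suffix ~suffix s =
  if ends_with ~suffix s
  then String.sub s 0 (String.length s - String.length suffix)
  else s

(* Same logic as the #quant macro: naive English pluralisation. *)
let quant num word =
  let form =
    if num = 1 then word
    else if ends_with ~suffix:"y" word
    then strip_suffix ~suffix:"y" word ^ "ies"
    else word ^ "s"
  in
  Printf.sprintf "%d %s" num form
```

Like the macro, this is deliberately naive: a word such as "day" would come out wrong, which is exactly the kind of irregular case a translator would want to be able to add.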

So to get a translation in a Caml program, I'm proposing a function like this:

    val msg : key:string -> args:string list -> string

You could use it like this:

    msg ~key:"files_in_dirs" ~args:[ file_count; directory_count ]

If you were using CamlTemplate in a web application, you could also call this function in a template:

    ${msg("files_in_dirs", fileCount, directoryCount)}

The article about Maketext points out that you'll want to share some functions between languages, or at least between different variants of the same language. Currently in CamlTemplate all macros are global, i.e. can be used by all templates. I'm thinking about adding a simple namespace facility so that a template could be in, say, the "en.UK" namespace; when it used the #quant macro, the template engine would look for that macro in the "en.UK" namespace; if it didn't find it there, it would look in the "en" namespace, and then in the default namespace.

I think that would take care of all the issues raised by the Maketext article. I'd love to hear some reactions to this proposal.

Ben

From ben at socialtools.net Tue Dec 2 15:29:43 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Tue, 02 Dec 2003 23:29:43 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031202230008.GA8381@grand>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand>
Message-ID: <3FCD2067.10008@socialtools.net>

Sylvain LE GALL wrote:
> Well, i won't quote your whole message. So i just give some idea :
> - using a text key which is the original sentence which need to be
>   translated is GOOD. Because most of the time translation is a feature
>   not the key of the program. So it should not be blocking for the rest
>   of the APP ( ie if a single translation does not exist, it must not
>   issue an arbitrary key, nor raise an exception ).

The program can always have a default language, so if a translation for a key doesn't exist, it can use the equivalent text in the default language.
This has worked fine on projects I've worked on. It's also what gettext does, isn't it? If you use symbolic keys, the only difference is that the text for the default language isn't in the source code; it's in a separate file.

Program text is often written by usability specialists or marketing people, not by programmers. Having text in files that those people can edit is an advantage, isn't it?

Still, there's nothing in my proposal to stop you from using the original sentence as a key; as far as the message catalogue facility is concerned, keys are just strings.

> - if you really want to use anything else as a key, why don't you use
>   "KEY_1" as a key ( ie string as key ). I think it is not good but...

Using strings as keys is exactly what I proposed, but why not use meaningful keys ("files_in_directories") instead of arbitrary ones ("KEY_1")?

> - using more than one function ( or brackets or anything else ) is
>   getting inefficient when you have already complex function ( ie i
>   don't think anyone want to have a big source code just because of
>   translation ).

I think it's unwise to say that something is inefficient before you've tested it. It only needs to be efficient *enough*. And have you looked at the code that a translator has to write in order to handle complex plurals in gettext? Here's an example for Slavic languages, taken from the gettext manual:

    Plural-Forms: nplurals=3; \
        plural=n%10==1 && n%100!=11 ? 0 : \
               n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;

Complexity is inherent in the problem, because languages are complex. However, I think the syntax above is horrible, and I think the syntax I proposed would be much easier for a translator to handle.

> I recommend to use gettext. I think it is the most powerful tool for
> translation. You can extract text...

Did you read the article I was discussing? I think it makes a pretty strong case that gettext is inadequate.
Here's the link again:

http://www.icewalkers.com/Perl/5.8.0/lib/Locale/Maketext/TPJ13.html

Ben

From sylvain.le-gall at polytechnique.org Tue Dec 2 15:00:08 2003
From: sylvain.le-gall at polytechnique.org (Sylvain LE GALL)
Date: Wed, 3 Dec 2003 00:00:08 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCCDF8F.4060404@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net>
Message-ID: <20031202230008.GA8381@grand>

Hello,

Well, i won't quote your whole message. So i just give some idea :
- using a text key which is the original sentence which need to be
  translated is GOOD. Because most of the time translation is a feature
  not the key of the program. So it should not be blocking for the rest
  of the APP ( ie if a single translation does not exist, it must not
  issue an arbitrary key, nor raise an exception ).
- if you really want to use anything else as a key, why don't you use
  "KEY_1" as a key ( ie string as key ). I think it is not good but...
- question of notation is hard...
- using more than one function ( or brackets or anything else ) is
  getting inefficient when you have already complex function ( ie i
  don't think anyone want to have a big source code just because of
  translation ).

I recommend to use gettext. I think it is the most powerful tool for translation. You can extract text... Moreover there is already a binding of gettext ( either in the hump or at http://www.gallu.homelinux.org/download/ ). In fact, i am working on a full ocaml program that reads gettext files for translation.

Regards
Sylvain LE GALL

On Tue, Dec 02, 2003 at 06:53:03PM +0000, Benjamin Geer wrote:
> I should introduce myself briefly:
>
> I'm a programmer with a background mostly in Java, and I started
> programming in Caml this year. I have an M.A. in linguistics; I speak
> English and French, and I'm learning Italian and Arabic.
>
>

From sylvain.le-gall at polytechnique.org Tue Dec 2 22:59:40 2003
From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org)
Date: Wed, 3 Dec 2003 07:59:40 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCD2067.10008@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net>
Message-ID: <20031203065940.GA564@gallu.homelinux.org>

Hello,

On Tue, Dec 02, 2003 at 11:29:43PM +0000, Benjamin Geer wrote:
> Sylvain LE GALL wrote:
> >Well, i won't quote your whole message. So i just give some idea :
> >- using a text key which is the original sentence which need to be
> >  translated is GOOD. Because most of the time translation is a feature
> >  not the key of the program. So it should not be blocking for the rest
> >  of the APP ( ie if a single translation does not exist, it must not
> >  issue an arbitrary key, nor raise an exception ).
>
> The program can always have a default language, so if a translation for
> a key doesn't exist, it can use the equivalent text in the default
> language. This has worked fine on projects I've worked on. It's also
> what gettext does, isn't it? If you use symbolic keys, the only
> difference is that the text for the default language isn't in the source
> code; it's in a separate file.
>
> Program text is often written by usability specialists or marketing
> people, not by programmers. Having text in files that those people can
> edit is an advantage, isn't it?
>

But in gettext, you have also separate file for translation ! I think it is .po ( and .POT ) files. You have only one language in the source code ( or symbolic name ).

> Still, there's nothing in my proposal to stop you from using the
> original sentence as a key; as far as the message catalogue facility is
> concerned, keys are just strings.
>
> >- if you really want to use anything else as a key, why don't you use
> >  "KEY_1" as a key ( ie string as key ).
> >  I think it is not good but...
>
> Using strings as keys is exactly what I proposed, but why not use
> meaningful keys ("files_in_directories") instead of arbitrary ones
> ("KEY_1")?
>

Of course, it was just an example, but it is also a problem. I always recommend keeping the relevant text as near as possible to the source code. Ie there should be at least a default case which means something and is human readable, for programmer sake and users sake. I have seen many programs which display COLUMN_TEXT in the first column of a text, because it cannot find his own catalog...

> >- using more than one function ( or brackets or anything else ) is
> >  getting inefficient when you have already complex function ( ie i
> >  don't think anyone want to have a big source code just because of
> >  translation ).
>
> I think it's unwise to say that something is inefficient before you've
> tested it. It only needs to be efficient *enough*. And have you looked
> at the code that a translator has to write in order to handle complex
> plurals in gettext? Here's an example for Slavic languages, taken from
> the gettext manual:
>

Sorry, it was just to create a reaction.

> Plural-Forms: nplurals=3; \
>     plural=n%10==1 && n%100!=11 ? 0 : \
>            n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
>
> Complexity is inherent in the problem, because languages are complex.
> However, I think the syntax above is horrible, and I think the syntax I
> proposed would be much easier for a translator to handle.
>

Well, as i understand, there is a very complex form for slavic languages. I don't think gettext is perfect, but i think it is efficient, ie it tries to solve most of the translation problem.

> >I recommend to use gettext. I think it is the most powerful tool for
> >translation. You can extract text...
>
> Did you read the article I was discussing? I think it makes a pretty
> strong case that gettext is inadequate.
> Here's the link again:
>
> http://www.icewalkers.com/Perl/5.8.0/lib/Locale/Maketext/TPJ13.html
>
> Ben
>

Right now, i have no time for this... I promise to have a look at it tonight.

Kind regards
Sylvain LE GALL

From ben at socialtools.net Wed Dec 3 02:50:35 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Wed, 03 Dec 2003 10:50:35 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031203065940.GA564@gallu.homelinux.org>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org>
Message-ID: <3FCDBFFB.4050605@socialtools.net>

sylvain.le-gall at polytechnique.org wrote:
> But in gettext, you have also separate file for translation ! I think it
> is .po ( and .POT ) files. You have only one language in the source code

If the source code is in English, why should it be easy for the marketing department to edit the French translation (which is in a .po file), but difficult for them to edit the original English? Instead of typing text in the source code, wouldn't it be better if the programmer typed it in a separate file while coding? It just takes a few more seconds to do so.

> I have seen many
> programs which display COLUMN_TEXT in the first column of a text,
> because it cannot find his own catalog...

Gettext wouldn't stop this from happening, since the programmer can always put vague or meaningless text in the source code. Which the marketing department then has to correct.

>>Plural-Forms: nplurals=3; \
>>    plural=n%10==1 && n%100!=11 ? 0 : \
>>           n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
>>
>>Complexity is inherent in the problem, because languages are complex.
>>However, I think the syntax above is horrible, and I think the syntax I
>>proposed would be much easier for a translator to handle.
>
> Well, as i understand, there is a very complex form for slavic
> languages.

If translators could use a tool whose syntax was easier for them, maybe more software would be localised in Slavic languages.

As I understand it, a formula like the one above just allows the translator to know which of 3 types of plural forms is needed; but he still needs to write the word in each of those forms, in every translation. Especially because the correct form also depends on the grammatical case of the noun, and gettext doesn't help with that. The article on Maketext suggests that it would make translators' lives easier if they could automate this to some extent, by writing their own functions that produced words with the correct plural forms.

But for that, they would need a programming language in which they could write those functions. What I'm proposing is to give them such a language, and to have it be the *same* language that they use to write trivial substitutions, so they only have to learn one syntax.

> I promise to have a look at it tonight.

OK, thanks.

Ben

From ben at socialtools.net Wed Dec 3 03:13:03 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Wed, 03 Dec 2003 11:13:03 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031203065940.GA564@gallu.homelinux.org>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org>
Message-ID: <3FCDC53F.9030005@socialtools.net>

Another problem with gettext, which I forgot to mention in my last message, is that as far as I can tell, there's no way to use it in a web application, because it can only extract messages from C source code. (And maybe Caml if someone writes a tool to do that.)

I'm writing a web application in which the pages are generated from templates (using CamlTemplate). I want to write templates that look like this:

${msg("please_log_in")}

${msg("username")} ${msg("password")}

It seems that in order to use gettext, I would need a tool that extracted messages from templates and generated .po files. I really don't want to write such a tool; it would be much easier to write the system I'm proposing.

Ben

From sylvain.le-gall at polytechnique.org Wed Dec 3 11:11:46 2003
From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org)
Date: Wed, 3 Dec 2003 20:11:46 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCDC53F.9030005@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net>
Message-ID: <20031203191146.GA968@gallu.homelinux.org>

Hello,

On Wed, Dec 03, 2003 at 11:13:03AM +0000, Benjamin Geer wrote:
> Another problem with gettext, which I forgot to mention in my last
> message, is that as far as I can tell, there's no way to use it in a web
> application, because it can only extract messages from C source code.
> (And maybe Caml if someone writes a tool to do that.)
>

Well, no... ;-> You can extract sh, perl, python, ada, C++, java... and ocaml ( since it is me who has written it ). But for ocaml, it is only a patch i produced some time ago against gettext ( i didn't submit it to gettext, because i want to use camlp4 or something like that to extract strings ). There is also PHP... And if you need you can create a tool to handle any other language...

> I'm writing a web application in which the pages are generated from
> templates (using CamlTemplate). I want to write templates that look
> like this:
>

> ${msg("please_log_in")}
>

> ${msg("username")}
> ${msg("password")}
>
> It seems that in order to use gettext, I would need a tool that
> extracted messages from templates and generated .po files. I really
> don't want to write such a tool; it would be much easier to write the
> system I'm proposing.
>

Regards
Sylvain LE GALL

From sylvain.le-gall at polytechnique.org Wed Dec 3 11:14:47 2003
From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org)
Date: Wed, 3 Dec 2003 20:14:47 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCDBFFB.4050605@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDBFFB.4050605@socialtools.net>
Message-ID: <20031203191447.GB968@gallu.homelinux.org>

On Wed, Dec 03, 2003 at 10:50:35AM +0000, Benjamin Geer wrote:
> sylvain.le-gall at polytechnique.org wrote:
> >But in gettext, you have also separate file for translation ! I think it
> >is .po ( and .POT ) files. You have only one language in the source code
>
> If the source code is in English, why should it be easy for the
> marketing department to edit the French translation (which is in a .po
> file), but difficult for them to edit the original English? Instead of
> typing text in the source code, wouldn't it be better if the programmer
> typed it in a separate file while coding? It just takes a few more
> seconds to do so.
>

Well, you should be right, but programmers are really lazy and most of the time don't write such a catalog ( i had some experiences building such a catalog, but with time the catalog get less and less relevant regarding the source code ).

> > I have seen many
> >programs which display COLUMN_TEXT in the first column of a text,
> >because it cannot find his own catalog...
>
> Gettext wouldn't stop this from happening, since the programmer can
> always put vague or meaningless text in the source code. Which the
> marketing department then has to correct.
>

Well, why do not use en.po ? ( yes you can create a file to translate the own language string of the program... ).

> >>Plural-Forms: nplurals=3; \
> >>    plural=n%10==1 && n%100!=11 ? 0 : \
> >>           n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
> >>
> >>Complexity is inherent in the problem, because languages are complex.
> >>However, I think the syntax above is horrible, and I think the syntax I
> >>proposed would be much easier for a translator to handle.
> >
> >Well, as i understand, there is a very complex form for slavic
> >languages.
>
> If translators could use a tool whose syntax was easier for them, maybe
> more software would be localised in Slavic languages.
>
> As I understand it, a formula like the one above just allows the
> translator to know which of 3 types of plural forms is needed; but he
> still needs to write the word in each of those forms, in every
> translation. Especially because the correct form also depends on the
> grammatical case of the noun, and gettext doesn't help with that. The
> article on Maketext suggests that it would make translators' lives
> easier if they could automate this to some extent, by writing their own
> functions that produced words with the correct plural forms.
>
> But for that, they would need a programming language in which they could
> write those functions. What I'm proposing is to give them such a
> language, and to have it be the *same* language that they use to write
> trivial substitutions, so they only have to learn one syntax.
>
> > I promise to have a look at it tonight.
>
> OK, thanks.
>
> Ben
>

I haven't read the article yet. I will, i promise. Just to say : i am not against something new... ( i could even help you in such a task only if it is written in ocaml ;-> ).

Kind regards
Sylvain LE GALL

From ben at socialtools.net Thu Dec 4 00:56:48 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Thu, 04 Dec 2003 08:56:48 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031203191447.GB968@gallu.homelinux.org>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDBFFB.4050605@socialtools.net> <20031203191447.GB968@gallu.homelinux.org>
Message-ID: <3FCEF6D0.8080700@socialtools.net>

sylvain.le-gall at polytechnique.org wrote:
> Well, you should be right, but programmers are really lazy and most of
> the time don't write such a catalog ( i had some experiences building
> such a catalog, but with time the catalog get less and less relevant
> regarding the source code ).

Yes, it has to be maintained along with the source code. I think the advantages are significant, though. For example, you can't spell-check the string literals in your source code, but you can spell-check the text in a separate file.

> Well, why do not use en.po ? ( yes you can create a file to translate
> the own language string of the program... ).

Hmm, maybe that could be a good approach. Of course, you would still have to maintain the catalog as above.

> Just to say : i am not against something new... ( i could even help you
> in such a task only if it is written in ocaml ;-> ).

Naturally.
:)

Ben

From ben at socialtools.net Thu Dec 4 01:22:43 2003
From: ben at socialtools.net (Benjamin Geer)
Date: Thu, 04 Dec 2003 09:22:43 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCEF6D0.8080700@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDBFFB.4050605@socialtools.net> <20031203191447.GB968@gallu.homelinux.org> <3FCEF6D0.8080700@socialtools.net>
Message-ID: <3FCEFCE3.6000604@socialtools.net>

Benjamin Geer wrote:
>> Well, why do not use en.po ? ( yes you can create a file to translate
>> the own language string of the program... ).
>
> Hmm, maybe that could be a good approach. Of course, you would still
> have to maintain the catalog as above.

Actually I think it could only be a good approach if there was a tool to extract messages from design documents instead of source code. Some organisations might want to plan their user interface (including the text) before coding it. So you'd write a design document saying things like: 'When the search is complete, the program will display a message saying: _("Your search returned ${a} documents in ${a1} files.").' You could then extract the strings surrounded by _() from that document, and generate the message catalog before the programmers had started coding.

If I understand gettext correctly, it seems as if the example sentence above shows another problem with gettext: since the programmer still uses printf, there's no way for the translator to change the order of the numbers, is there? The programmer writes:

    printf ("Your search returned %d documents in %d files.");

What if the translation needs to have a different word order, so that the number of files comes first? Is there a way to do that with gettext?

Ben

From yoriyuki at mbg.ocn.ne.jp Thu Dec 4 06:52:27 2003
From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki)
Date: Thu, 04 Dec 2003 23:52:27 +0900 (JST)
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031202230008.GA8381@grand>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand>
Message-ID: <20031204.235227.97295945.yoriyuki@mbg.ocn.ne.jp>

From: "Sylvain LE GALL"
Subject: Re: [Ocaml-i18n] proposal: message catalogue system
Date: Wed, 3 Dec 2003 00:00:08 +0100

> I recommend to use gettext. I think it is the most powerful tool for
> translation. You can extract text...

For handling multiple locales, setdomain of gettext (and any other approach relying on state) is not the best way to do so. A message generator should have a locale as an optional argument.

Btw, does gettext work on Windows and Mac?

--
Yamagata Yoriyuki

From sylvain.le-gall at polytechnique.org Thu Dec 4 15:24:00 2003
From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org)
Date: Fri, 5 Dec 2003 00:24:00 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031204.235227.97295945.yoriyuki@mbg.ocn.ne.jp>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031204.235227.97295945.yoriyuki@mbg.ocn.ne.jp>
Message-ID: <20031204232400.GA360@gallu.homelinux.org>

On Thu, Dec 04, 2003 at 11:52:27PM +0900, Yamagata Yoriyuki wrote:
> From: "Sylvain LE GALL"
> Subject: Re: [Ocaml-i18n] proposal: message catalogue system
> Date: Wed, 3 Dec 2003 00:00:08 +0100
>
> > I recommend to use gettext. I think it is the most powerful tool for
> > translation. You can extract text...
>
> For handling multiple locales, setdomain of gettext (and any other
> approach relying on state) is not the best way to do so. A message
> generator should have a locale as an optional argument.
>
> Btw, does gettext work on Windows and Mac?
>

Hello,

Gettext ( in C ) should normally work on mac and windows ( however i never tried it ). For the ocaml binding, i should say that it is working, but i need people who work on these OSes to test it.

Regards
Sylvain LE GALL

ps : i agree that it should take an optional parameter for the localization but it does so by using some functions ( don't remember the name but you pass a locale parameter )

From sylvain.le-gall at polytechnique.org Thu Dec 4 15:27:18 2003
From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org)
Date: Fri, 5 Dec 2003 00:27:18 +0100
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <3FCEFCE3.6000604@socialtools.net>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDBFFB.4050605@socialtools.net> <20031203191447.GB968@gallu.homelinux.org> <3FCEF6D0.8080700@socialtools.net> <3FCEFCE3.6000604@socialtools.net>
Message-ID: <20031204232718.GB360@gallu.homelinux.org>

On Thu, Dec 04, 2003 at 09:22:43AM +0000, Benjamin Geer wrote:
> Benjamin Geer wrote:
> >>Well, why do not use en.po ? ( yes you can create a file to translate
> >>the own language string of the program... ).
> >
> >Hmm, maybe that could be a good approach. Of course, you would still
> >have to maintain the catalog as above.
>
> Actually I think it could only be a good approach if there was a tool to
> extract messages from design documents instead of source code. Some
> organisations might want to plan their user interface (including the
> text) before coding it. So you'd write a design document saying things
> like: 'When the search is complete, the program will display a message
> saying: _("Your search returned ${a} documents in ${a1} files.").' You
> could then extract the strings surrounded by _() from that document, and
> generate the message catalog before the programmers had started coding.
>
> If I understand gettext correctly, it seems as if the example sentence
> above shows another problem with gettext: since the programmer still
> uses printf, there's no way for the translator to change the order of
> the numbers, is there? The programmer writes:
>
> printf ("Your search returned %d documents in %d files.");
>

Oops... you are unfortunately right. There is no numbered parameter like in C. In C's printf, you can write %2$s %1$s to say which parameter goes in which position... But not in caml

> What if the translation needs to have a different word order, so that
> the number of files comes first? Is there a way to do that with gettext?
>

I have begun to read the article. I think you are right... It seems very interesting. I will try to see how it could be implemented.

Kind regards
Sylvain LE GALL

From rich at annexia.org Fri Dec 5 05:57:20 2003
From: rich at annexia.org (Richard Jones)
Date: Fri, 5 Dec 2003 13:57:20 +0000
Subject: [Ocaml-i18n] proposal: message catalogue system
In-Reply-To: <20031202230008.GA8381@grand>
References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand>
Message-ID: <20031205135720.GA21946@redhat.com>

On Wed, Dec 03, 2003 at 12:00:08AM +0100, Sylvain LE GALL wrote:
> Hello,
>
> Well, i won't quote your whole message. So i just give some idea :
> - using a text key which is the original sentence which need to be
>   translated is GOOD. Because most of the time translation is a feature
>   not the key of the program. So it should not be blocking for the rest
>   of the APP ( ie if a single translation does not exist, it must not
>   issue an arbitrary key, nor raise an exception ).
> - if you really want to use anything else as a key, why don't you use
>   "KEY_1" as a key ( ie string as key ). I think it is not good but...
> - question of notation is hard...
> - using more than one function ( or brackets or anything else ) is > getting inefficient when you have already complex function ( ie i > don't think anyone want to have a big source code just because of > translation ). > > I recommend to use gettext. I think it is the most powerful tool for > translation. You can extract text... I fully agree with these recommendations. On one very large website that I worked on (in Perl), we translated the whole site using just gettext into 7 languages (including Thai and Arabic [right-to-left]). It turns out that whole singular/plural issue can be worked around fairly easily. For example, we would write something like: Number of messages in your inbox: 10 instead of: You have 10 messages in your inbox. Using 'gettext' inline, and using standard xgettext to extract the messages made it very easy for the programmers, and meant that every last part of the application didn't have to be translated and kept up to date all at once. Conversely, I also worked on a Java project where they had rolled their own translation system using keys into an external XML file. That was a really bad approach, because the application kept crashing every time a translation was missing. Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment 'There is a joke about American engineers and French engineers. The American team brings a prototype to the French team. 
The French team's response is: "Well, it works fine in practice; but how will it hold up in theory?"' From rich at annexia.org Fri Dec 5 06:02:17 2003 From: rich at annexia.org (Richard Jones) Date: Fri, 5 Dec 2003 14:02:17 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031205135720.GA21946@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> Message-ID: <20031205140217.GB21946@redhat.com> On Fri, Dec 05, 2003 at 01:57:20PM +0000, Richard Jones wrote: > On one very large website that I worked on (in Perl), we translated > the whole site using just gettext into 7 languages (including Thai and > Arabic [right-to-left]). I forget to mention that we did modify gettext to support placeholders. Thus you could write (in Perl): _("Number of messages in ::folder:: : ::n::", folder => $foldername, n => $nr_messages) or something similar. The problem with this was that the external translators would end up translating ::folder::, so we started to use single letters like ::f:: ! Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment "One serious obstacle to the adoption of good programming languages is the notion that everything has to be sacrificed for speed. In computer languages as in life, speed kills." 
-- Mike Vanier From rich at annexia.org Fri Dec 5 06:04:24 2003 From: rich at annexia.org (Richard Jones) Date: Fri, 5 Dec 2003 14:04:24 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <3FCDC53F.9030005@socialtools.net> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net> Message-ID: <20031205140424.GC21946@redhat.com> On Wed, Dec 03, 2003 at 11:13:03AM +0000, Benjamin Geer wrote: > Another problem with gettext, which I forgot to mention in my last > message, is that as far as I can tell, there's no way to use it in a web > application, because it can only extract messages from C source code. > (And maybe Caml if someone writes a tool to do that.) > > I'm writing a web application in which the pages are generated from > templates (using CamlTemplate). I want to write templates that look > like this: > > > >

${msg("please_log_in")} > >

> ${msg("username")} > ${msg("password")} >
> > > > It seems that in order to use gettext, I would need a tool that > extracted messages from templates and generated .po files. I really > don't want to write such a tool; it would be much easier to write the > system I'm proposing. The trick (which we used on the aforementioned large Perl project) was to have separate HTML templates for each language. The translators actually had pretty advanced tools which would keep the HTML templates in synch with each other. I think they generated the .html.fr, .html.it etc. from the "source" .html.en file. I was quite impressed, particularly since they were doing all this on Windows :-) Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment 'There is a joke about American engineers and French engineers. The American team brings a prototype to the French team. The French team's response is: "Well, it works fine in practice; but how will it hold up in theory?"' From ben at socialtools.net Sat Dec 6 03:45:08 2003 From: ben at socialtools.net (Benjamin Geer) Date: Sat, 06 Dec 2003 11:45:08 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031205140424.GC21946@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net> <20031205140424.GC21946@redhat.com> Message-ID: <3FD1C144.1070705@socialtools.net> Richard Jones wrote: > The trick (which we used on the aforementioned large Perl project) was > to have separate HTML templates for each language. Suppose the page has to display a list of error messages. Of course, it doesn't know in advance what these will be. Moreover, suppose these same error messages need to be displayed in a short form on some pages, and in a long form on other pages, depending on how much space the graphic designer has. 
Ben From ben at socialtools.net Sat Dec 6 03:50:54 2003 From: ben at socialtools.net (Benjamin Geer) Date: Sat, 06 Dec 2003 11:50:54 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031205140217.GB21946@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> <20031205140217.GB21946@redhat.com> Message-ID: <3FD1C29E.203@socialtools.net> Richard Jones wrote: > _("Number of messages in ::folder:: : ::n::", folder => $foldername, > n => $nr_messages) For a sentence like Your search returned x files in y directories. that would get very awkward. The usability department might not be happy with Number of directories containing matching files: y. Number of matching files in those directories: x. If I saw that on a web site, I'd think, 'Why couldn't they just write one simple sentence?' And indeed there's no reason why it can't be done, if you have adequate tools. I really strongly suggest you read the article about Maketext: http://www.perldoc.com/perl5.8.0/lib/Locale/Maketext/TPJ13.html Ben From ben at socialtools.net Sat Dec 6 03:52:18 2003 From: ben at socialtools.net (Benjamin Geer) Date: Sat, 06 Dec 2003 11:52:18 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031205135720.GA21946@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> Message-ID: <3FD1C2F2.8@socialtools.net> Richard Jones wrote: > Conversely, I also worked on a Java project where they had rolled > their own translation system using keys into an external XML > file. That was a really bad approach, because the application kept > crashing every time a translation was missing. That shows that their system was poorly implemented, but not that it was a bad approach. There's no reason for a key-based system to crash because of missing translations. 
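To make the point about missing translations concrete, here is a minimal OCaml sketch of a key-based lookup that degrades gracefully. It is only an illustration under stated assumptions: the Catalogue module, catalogue_of_list, translate and the two sample catalogues are hypothetical names invented for this example, and are not part of CamlTemplate, gettext or OCamlI18n. A missing key falls back to a default catalogue and then to the key itself, so no exception reaches the caller.

```ocaml
(* Hypothetical key-based catalogue. All names here are illustrative
   and are not taken from any library discussed in this thread. *)

module Catalogue = Map.Make (String)

(* A catalogue maps message keys to translated text. *)
type catalogue = string Catalogue.t

let catalogue_of_list (entries : (string * string) list) : catalogue =
  List.fold_left
    (fun acc (key, text) -> Catalogue.add key text acc)
    Catalogue.empty entries

(* Look the key up in the locale's catalogue, then in a default
   catalogue, and finally fall back to the key itself. find_opt
   returns an option, so a missing translation never raises. *)
let translate ~(locale : catalogue) ~(default : catalogue) (key : string)
  : string =
  match Catalogue.find_opt key locale with
  | Some text -> text
  | None ->
    (match Catalogue.find_opt key default with
     | Some text -> text
     | None -> key)

(* Sample data (assumed): the French catalogue is deliberately incomplete. *)
let english =
  catalogue_of_list [ ("username", "User name"); ("password", "Password") ]

let french = catalogue_of_list [ ("username", "Nom d'utilisateur") ]

let () =
  List.iter
    (fun key -> print_endline (translate ~locale:french ~default:english key))
    [ "username"; "password"; "please_log_in" ]
```

With the sample data above, "username" comes from the French catalogue, "password" falls back to the English default, and "please_log_in" is returned unchanged because neither catalogue defines it. Whether showing the bare key is acceptable to end users is a separate design question, which is the one the thread goes on to argue about.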
Ben From rich at annexia.org Sat Dec 6 07:09:04 2003 From: rich at annexia.org (Richard Jones) Date: Sat, 6 Dec 2003 15:09:04 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <3FD1C144.1070705@socialtools.net> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net> <20031205140424.GC21946@redhat.com> <3FD1C144.1070705@socialtools.net> Message-ID: <20031206150904.GA1021@redhat.com> On Sat, Dec 06, 2003 at 11:45:08AM +0000, Benjamin Geer wrote: > Richard Jones wrote: > >The trick (which we used on the aforementioned large Perl project) was > >to have separate HTML templates for each language. > > Suppose the page has to display a list of error messages. Of course, it > doesn't know in advance what these will be. Moreover, suppose these > same error messages need to be displayed in a short form on some pages, > and in a long form on other pages, depending on how much space the > graphic designer has. I'm not quite sure if this is a real requirement or not. I know that we did many 1000s of pages using the gettext approach and didn't come across this sort of problem. Perhaps we were thinking of it differently. In particular, when we needed to display a list of something, we would use the list feature of our templating library (which, BTW, is extremely similar to the templating library included in mod_caml). Thus:
::list(errors):: ::error_code:: ::error_message:: ::end::
(the ::list:: ... ::end:: feature generates multiple rows from an array or list passed to the templating library by the program). The list of error messages would be translated in the program, e.g.: @errors = ( { error_code => 500, error_message => gettext ("Internal Server Error") }, { error_code => 404, error_message => gettext ("Page Not Found") } ); At runtime the correct translation was chosen by gettext based on the LANG environment variable, which was set on a request-by-request basis, based on Accept: headers and the language preference for the current user from the database. I guess if you have short or long errors, then you can extend the above easily enough so you also have a ::short_error_message:: field which the graphic designer may substitute where necessary. Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment NET::FTPSERVER is a full-featured, secure, configurable, database-backed FTP server written in Perl: http://www.annexia.org/freeware/netftpserver/ From rich at annexia.org Sat Dec 6 07:12:59 2003 From: rich at annexia.org (Richard Jones) Date: Sat, 6 Dec 2003 15:12:59 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <3FD1C2F2.8@socialtools.net> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> <3FD1C2F2.8@socialtools.net> Message-ID: <20031206151259.GB1021@redhat.com> On Sat, Dec 06, 2003 at 11:52:18AM +0000, Benjamin Geer wrote: > Richard Jones wrote: > > Conversely, I also worked on a Java project where they had rolled > > their own translation system using keys into an external XML > > file. That was a really bad approach, because the application kept > > crashing every time a translation was missing. > > That shows that their system was poorly implemented, but not that it was > a bad approach. 
There's no reason for a key-based system to crash > because of missing translations. Well indeed. In fact the whole system was more badly designed and poorly implemented than I think you can possibly imagine. (Red Hat's CCM nee Ars Digita ACS Java, in case you're wondering what I was working on). The fact remains, however, that it's very hard for a key-based system to do the right thing when a translation is missing. The best thing it can do is display the key. Hopefully the key will be something meaningful to the end user, such as THIS_IS_THE_ENGLISH_MESSAGE. Ah, but now you might as well make the key *be* the English message! Looks like we've actually arrived at the gettext approach after all :-) Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment MAKE+ is a sane replacement for GNU autoconf/automake. One script compiles, RPMs, pkgs etc. Linux, BSD, Solaris. http://www.annexia.org/freeware/makeplus/ From rich at annexia.org Sat Dec 6 07:16:57 2003 From: rich at annexia.org (Richard Jones) Date: Sat, 6 Dec 2003 15:16:57 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <3FD1C29E.203@socialtools.net> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> <20031205140217.GB21946@redhat.com> <3FD1C29E.203@socialtools.net> Message-ID: <20031206151657.GC1021@redhat.com> On Sat, Dec 06, 2003 at 11:50:54AM +0000, Benjamin Geer wrote: > Richard Jones wrote: > > _("Number of messages in ::folder:: : ::n::", folder => $foldername, > > n => $nr_messages) > > For a sentence like > > Your search returned x files in y directories. > > that would get very awkward. The usability department might not be > happy with > > Number of directories containing matching files: y. > Number of matching files in those directories: x. You're dead right. 
I didn't claim that gettext was ideal, just that it does solve a good 95% of the problem, without undue complication. > http://www.perldoc.com/perl5.8.0/lib/Locale/Maketext/TPJ13.html I looked at this paper, and this does seem to be a good approach, although they could do with reducing the overhead of what the programmer needs to type, and they should make the language implicit (eg. in a global or environment variable) since it rarely changes and shouldn't have to be constantly passed around by the programmer. Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment 'There is a joke about American engineers and French engineers. The American team brings a prototype to the French team. The French team's response is: "Well, it works fine in practice; but how will it hold up in theory?"' From mattam at altern.org Sat Dec 6 08:19:17 2003 From: mattam at altern.org (Matthieu Sozeau) Date: Sat, 06 Dec 2003 17:19:17 +0100 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031206151259.GB1021@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031205135720.GA21946@redhat.com> <3FD1C2F2.8@socialtools.net> <20031206151259.GB1021@redhat.com> Message-ID: <87n0a5gbqy.wl%mattam@altern.org> >>>>> On Sat, 6 Dec 2003 15:12:59 +0000, Richard Jones said: > The fact remains, however, that it's very hard for a key-based system > to do the right thing when a translation is missing. The best thing it > can do is display the key. Hopefully the key will be something > meaningful to the end user, such as THIS_IS_THE_ENGLISH_MESSAGE. > Ah, but now you might as well make the key *be* the English message! > Looks like we've actually arrived at the gettext approach after all > :-) You can also have a sensible default catalogue, or even custom behavior when messages are not found with the Maketext approach. 
Also, any text should be accepted as a key so you can implement the behavior too if you prefer. That's only more flexibility. -- Matthieu Sozeau http://mattam.org From ben at socialtools.net Sat Dec 6 12:03:09 2003 From: ben at socialtools.net (Benjamin Geer) Date: Sat, 06 Dec 2003 20:03:09 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031206150904.GA1021@redhat.com> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net> <20031205140424.GC21946@redhat.com> <3FD1C144.1070705@socialtools.net> <20031206150904.GA1021@redhat.com> Message-ID: <3FD235FD.5080208@socialtools.net> Richard Jones wrote: > I'm not quite sure if this is a real requirement of not. I know that > we did many 1000s of pages using the gettext approach and didn't come > across this sort of problem. Perhaps we were thinking of it > differently. Making several versions of each page for different languages could be OK if you're really presenting things differently for different languages, but if the application does exactly the same thing in every language (I'm thinking of something like Amazon.com), wouldn't it be a pain to have to change 7 pages just to change the layout on 1 page? Wouldn't you worry that they'd get out of sync? After all, the purpose of templates is to allow you to keep layout separate from content. Ben From yoriyuki at mbg.ocn.ne.jp Sat Dec 6 13:29:48 2003 From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki) Date: Sun, 07 Dec 2003 06:29:48 +0900 (JST) Subject: [Ocaml-i18n] [ANNOUNCE] camommile-0.4.2 Message-ID: <20031207.062948.74566487.yoriyuki@mbg.ocn.ne.jp> camomile-0.4.2 is available from http://prdownloads.sourceforge.net/camomile/camomile-0.4.2.tar.bz2 This release is a bug-fix release. Changes are * Add bigarray to the dependency in META. * Fix a bug in input_line function of ULine. 
* Fix a XArray.add_array bug, which affects XString.add_text, UText.Buf.add_string. * Make the collator a bit fast. -- Yamagata Yoriyuki From yoriyuki at mbg.ocn.ne.jp Sat Dec 6 14:10:40 2003 From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki) Date: Sun, 07 Dec 2003 07:10:40 +0900 (JST) Subject: [Ocaml-i18n] Roadmap to camomile-0.5.0 Message-ID: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> The development of Camomile is stalled recently, but I'd like to ask here about the future of camomile, so that I can motivate myself. Here are things in my mind for the future development of camomile. The order is the priority given to them in my mind. 1) Provide a replacement of Pervasive and String. I'd like to implement the idea of using the open directive. 2) Language support of Unicode. What I think about are, * The ocaml code is interpreted by the encoding specified in the current locale. * Counting characters as Unicode. * A Unicode literal in a string/character literal and pattern matching. * Non Latin-1 identifiers. 3) Text search using collation. 4) Better regexp, maybe Perl6-like one. 5) Integration of LDML. (locale description mark-up language) Also, I'd like to see more packages (GODI, Windows binary, fink...) Naturally, these are huge works. So, I'd like to know the things you need most, the things I missed completely and your opinion that everything in camomile is totally crap. 
-- Yamagata Yoriyuki From sylvain.le-gall at polytechnique.org Sat Dec 6 14:20:45 2003 From: sylvain.le-gall at polytechnique.org (Sylvain LE GALL) Date: Sat, 6 Dec 2003 23:20:45 +0100 Subject: [Ocaml-i18n] Roadmap to camomile-0.5.0 In-Reply-To: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> References: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> Message-ID: <20031206222045.GA2017@grand> Hello, On Sun, Dec 07, 2003 at 07:10:40AM +0900, Yamagata Yoriyuki wrote: > The development of Camomile is stalled recently, but I'd like to ask > here about the future of camomile, so that I can motivate myself. > > Here are things in my mind for the future development of camomile. > The order is the priority given to them in my mind. > > 1) Provide a replacement of Pervasive and String. I'd like to > implement the idea of using the open directive. > > 2) Language support of Unicode. What I think about are, > * The ocaml code is interpreted by the encoding specified in the > current locale. > * Counting characters as Unicode. > * A Unicode literal in a string/character literal and pattern matching. > * Non Latin-1 identifiers. > > 3) Text search using collation. > > 4) Better regexp, maybe Perl6-like one. > > 5) Integration of LDML. (locale description mark-up language) > > Also, I'd like to see more packages (GODI, Windows binary, fink...) > > Naturally, these are huge works. So, I'd like to know the things you > need most, the things I missed completely and your opinion that > everything in camomile is totally crap. > > -- > Yamagata Yoriyuki All this seems very good to me... But ( sorry ), what i want to see the most is that camomile is DFSG ( in the sense of debian ).... There is a conflict of licence about the UnicodeData.txt file... Which is problematic. So what i would like to see is an inverstigation of the fredom of the library... It is totally out of the coding scope but it is also needed to be in Debian... 
Kind regard Sylvain LE GALL From peter at jollys.org Sat Dec 6 15:37:11 2003 From: peter at jollys.org (Peter Jolly) Date: Sat, 06 Dec 2003 23:37:11 +0000 Subject: [Ocaml-i18n] Roadmap to camomile-0.5.0 In-Reply-To: <20031206222045.GA2017@grand> References: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> <20031206222045.GA2017@grand> Message-ID: <6.0.0.22.0.20031206232750.028981f0@mail.purplecloud.net> >But ( sorry ), what i want to see the most is that camomile is DFSG ( in >the sense of debian ).... There is a conflict of licence about the >UnicodeData.txt file... Which is problematic. > >So what i would like to see is an inverstigation of the fredom of the >library... > >It is totally out of the coding scope but it is also needed to be in >Debian... The Unicode data issue has been discussed several times on Debian-legal, for example in this thread from July: http://lists.debian.org/debian-legal/2003/debian-legal-200307/msg00056.html. It mentions a plan to release an unambiguously Free-as-in-GNU version of the data in an updated miscfiles package, but that seems not to have happened yet. How are other Debian Unicode packages managing? From kwkmmsn at tcn-catv.ne.jp Sat Dec 6 22:45:58 2003 From: kwkmmsn at tcn-catv.ne.jp (KAWAKAMI Shigenobu) Date: Sun, 7 Dec 2003 15:45:58 +0900 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <20031204232400.GA360@gallu.homelinux.org> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <20031204.235227.97295945.yoriyuki@mbg.ocn.ne.jp> <20031204232400.GA360@gallu.homelinux.org> Message-ID: <20031207154558392287.GyazMail.kwkmmsn@tcn-catv.ne.jp> Hi, sylvain.le-gall at polytechnique.org ha dicho en "Re: [Ocaml-i18n] proposal: message catalogue system": >> Btw, does gettext work on Windows and Mac? >> > > Hello, > > Gettext ( in C ) should normally works on mac and windows ( however i > never tried it ). It works on Mac OSX. 
The GUI layer of Mac OS X relies on a different system than gettext for selecting languages and, understandably, the OS itself doesn't come with gettext. But you can install it and see it work. > For the ocaml binding, i should say that it is > working, but i need people who works on this OS to test it. Well, it works but I had some trouble in building it. I'll tell you my experience in a separate mail. ---- KAWAKAMI Shigenobu kwkmmsn at tcn-catv.ne.jp From sylvain.le-gall at polytechnique.org Sun Dec 7 01:16:27 2003 From: sylvain.le-gall at polytechnique.org (sylvain.le-gall at polytechnique.org) Date: Sun, 7 Dec 2003 10:16:27 +0100 Subject: [Ocaml-i18n] Roadmap to camomile-0.5.0 In-Reply-To: <6.0.0.22.0.20031206232750.028981f0@mail.purplecloud.net> References: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> <20031206222045.GA2017@grand> <6.0.0.22.0.20031206232750.028981f0@mail.purplecloud.net> Message-ID: <20031207091627.GA1588@gallu.homelinux.org> On Sat, Dec 06, 2003 at 11:37:11PM +0000, Peter Jolly wrote: > > >But ( sorry ), what i want to see the most is that camomile is DFSG ( in > >the sense of debian ).... There is a conflict of licence about the > >UnicodeData.txt file... Which is problematic. > > > >So what i would like to see is an inverstigation of the fredom of the > >library... > > > >It is totally out of the coding scope but it is also needed to be in > >Debian... > > The Unicode data issue has been discussed several times on Debian-legal, > for example in this thread from July: > http://lists.debian.org/debian-legal/2003/debian-legal-200307/msg00056.html. > It mentions a plan to release an unambiguously Free-as-in-GNU version of > the data in an updated miscfiles package, but that seems not to have > happened yet. How are other Debian Unicode packages managing? > > Hello, Just to take one example : gucharmap seems to have a RC bugs concerning this files... I need to investigate, but debian-legal seems not to have any solution. 
I really wonder what to do about that ! Kind regard Sylvain LE GALL From rich at annexia.org Sun Dec 7 03:30:33 2003 From: rich at annexia.org (Richard Jones) Date: Sun, 7 Dec 2003 11:30:33 +0000 Subject: [Ocaml-i18n] proposal: message catalogue system In-Reply-To: <3FD235FD.5080208@socialtools.net> References: <3FCCDF8F.4060404@socialtools.net> <20031202230008.GA8381@grand> <3FCD2067.10008@socialtools.net> <20031203065940.GA564@gallu.homelinux.org> <3FCDC53F.9030005@socialtools.net> <20031205140424.GC21946@redhat.com> <3FD1C144.1070705@socialtools.net> <20031206150904.GA1021@redhat.com> <3FD235FD.5080208@socialtools.net> Message-ID: <20031207113033.GA9543@redhat.com> On Sat, Dec 06, 2003 at 08:03:09PM +0000, Benjamin Geer wrote: > Richard Jones wrote: > >I'm not quite sure if this is a real requirement of not. I know that > >we did many 1000s of pages using the gettext approach and didn't come > >across this sort of problem. Perhaps we were thinking of it > >differently. > > Making several versions of each page for different languages could be OK > if you're really presenting things differently for different languages, > but if the application does exactly the same thing in every language > (I'm thinking of something like Amazon.com), wouldn't it be a pain to > have to change 7 pages just to change the layout on 1 page? Wouldn't > you worry that they'd get out of sync? After all, the purpose of > templates is to allow you to keep layout separate from content. I don't really want to continue this argument much futher ... but suffice to say that although there were 7 different versions of the page in the system, the non-English ones were generated automatically using a Windows-based commercial translation memory system. Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment C2LIB is a library of basic Perl/STL-like types for C. 
Vectors, hashes, trees, string funcs, pool allocator: http://www.annexia.org/freeware/c2lib/ From yoriyuki at mbg.ocn.ne.jp Sun Dec 7 06:48:19 2003 From: yoriyuki at mbg.ocn.ne.jp (Yamagata Yoriyuki) Date: Sun, 07 Dec 2003 23:48:19 +0900 (JST) Subject: [Ocaml-i18n] Roadmap to camomile-0.5.0 In-Reply-To: <20031206222045.GA2017@grand> References: <20031207.071040.85680964.yoriyuki@mbg.ocn.ne.jp> <20031206222045.GA2017@grand> Message-ID: <20031207.234819.04771585.yoriyuki@mbg.ocn.ne.jp> From: "Sylvain LE GALL" Subject: Re: [Ocaml-i18n] Roadmap to camomile-0.5.0 Date: Sat, 6 Dec 2003 23:20:45 +0100 > But ( sorry ), what i want to see the most is that camomile is DFSG ( in > the sense of debian ).... There is a conflict of licence about the > UnicodeData.txt file... Which is problematic. I considered GNU misc files, but their unicode file looks like an identical copy of UnicodeData.txt. its legitimacy is problematic and Camomile needs another data from UCD, and Technical Reports. I think the best solution is that, someone with influence will talk with Unicode Consortium. IBM has an open source project (http://oss.software.ibm.com/icu/) using UCD, so they might help such a move. Sorry, but there is not much I can do. -- Yamagata Yoriyuki