This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: Request for inclusion of sr_CS and sr_CS@Latn locales


Hi Petter,
Thanks for your great comments. 

Petter Reinholdtsen <pere@hungry.com> writes:
>
> Here are some comments.  Are you aware of my
> <URL:http://www.student.uit.no/~pere/linux/glibc/>?  I'll make sure
> your new locales are listed there as well.
>

Yes, I'm aware of that, but it has apparently been updated since my
last visit: it's much more informative this time. Great work Petter,
thanks.

>
> I thought there was some controversy regarding the new country
> keeping the code of the old one.  Keld, care to explain the current
> status of this?
>

Apparently nothing that will change soon (it's already been another
five or six months since the assignment, and it took 5 months of
debates to put anything out in the first place). So this seems a
pretty safe choice, though I'm not really an insider on this.


>> Choice of "Latn" was already discussed, and there was no clear
>> consensus (AFAICT) whether it's a good choice. "@latin" is not used
>> anywhere, so I don't see the merit of choosing that one over the
>> "@Latn" (and this second one is used for around 200 files in Gnome
>> CVS, and several other places).
>
> Ulrich seemed to have some strong opinions here.  I have no
> well-founded opinion.

Everything else aside, this also comes down to the (in)convenience
caused. There are already hundreds of translations in the form of
sr@Latn.po files, and it would be a major inconvenience for every
translator to change that (of course, I didn't want to halt the
translation process before this is resolved).

Also, my recollection is that Ulrich was opposed to accepting a
*strategy* of using ISO 15924 script identifiers at all times, even
more so since "@cyrillic" has already been used in GNU libc. There's
no "@latin" or anything similar in any locale of current libc CVS, so
there's nothing against "@Latn" that I know of. Actually, there are
only four distinct modifiers among locale definitions, with "@euro"
being the only widely repeated one, and "@cyrillic" having two
occurrences since Uzbek locales were added not long ago -- this number
of different modifiers is not something I would base a conclusion on.
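
For readers unfamiliar with how a modifier fits into a locale name,
here is a minimal sketch in Python (not part of the submitted locale
files). It assumes the common language[_TERRITORY][.codeset][@modifier]
layout; the function name split_locale_name is hypothetical and this is
not how GNU libc itself parses names.

```python
import re

# Assumed layout: language[_TERRITORY][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_locale_name(name):
    """Split a locale name into its parts (hypothetical helper).

    Parts that are absent from the name come back as None.
    """
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised locale name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    # The modifier is whatever follows "@", e.g. "Latn" or "cyrillic".
    for name in ("sr_CS", "sr_CS@Latn", "sr@Latn"):
        print(name, split_locale_name(name))
```

The point of the sketch is only that "@Latn" and "@cyrillic" occupy the
same slot in the name, so the choice between them is a naming
convention rather than a technical constraint.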

Another argument I may make is that there are already at least a
couple dozen users of this locale name, and what's better than having
it tested in practice ;)

Of course, all of this is not relevant from the strict GNU libc point
of view, but I still hope it has merit, and that no unnecessary
work will be put on people contributing translations and maintainers
of many software packages.

Ulrich, could you please give your opinion on this issue? Is there any
chance of including the Serbian Latin locale under the name sr_CS@Latn?

>> LC_IDENTIFICATION
>
> The order of the sections does not follow my recommended order:

I've fixed these and am attaching new revisions. I don't want to
subject every recipient of the libc-alpha list to more changes, so I
can send further updates to you personally, if that's OK with you,
Petter.

> I believe these should list a reference to the standard they conform
> to, not the name of the locale and the year they were made.  This
> value is currently ignored, so there is no official list of values
> yet.  I suggest replacing sr_CS:2003 with i18n:1997 or posix:2001,
> depending on which standard you intended these fields to follow.

I've set this as well in my local copies, and I used i18n:1997.
However, I have followed a draft of ISO 14652, which uses
LC_VERSIONS instead of LC_IDENTIFICATION, and uses "i18n:1998"
instead. I'm not really sure what the correct way to put it is.

> I notice that LC_MONETARY lists 3;3 as the grouping, while LC_NUMERIC
> does not use grouping.  Is this intended?

Yes, that's intended. I don't have any official recommendations in
the case of regular numerical values, so I just stuck with what was
in sr_YU previously.
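
To illustrate the difference between a 3;3 grouping and no grouping,
here is a small Python sketch. The function group_digits is
hypothetical, and the "." separator is only an assumption for the
example, since the separator used by the locale is not stated in this
message.

```python
def group_digits(digits, grouping, separator):
    """Insert separator into a string of integer digits.

    grouping lists group sizes counted from the right, as in a
    locale's grouping field; the last size repeats. An empty list
    means no grouping at all.
    """
    if not grouping:
        return digits
    groups = []
    rest = digits
    i = 0
    while True:
        # Once the list is exhausted, keep reusing its last size.
        size = grouping[min(i, len(grouping) - 1)]
        if size <= 0 or len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
        i += 1
    groups.append(rest)
    return separator.join(reversed(groups))


if __name__ == "__main__":
    # "." is an assumed separator, used here only for illustration.
    print(group_digits("1234567", [3, 3], "."))  # monetary-style 3;3
    print(group_digits("1234567", [], "."))      # no grouping
```

With 3;3 the digits are split into threes from the right, while an
empty grouping leaves the digit string untouched.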

> Be advised that the formatting of international currency has been
> fixed lately, so you might want to check the output again to verify
> that you get the format you expect.  What should 1.23, -1.23 and
> 123.45 look like when formatted as local and international currency?
> It would be useful to add test code to make sure the formatting is
> correct.

I didn't actually install the latest GNU libc from CVS, so I didn't
have a chance to test recent changes to strfmon().

Anyhow, for the mentioned examples, expected output is (using UTF-8
for Cyrillic letters) "1,23 ДИН", "-1,23 ДИН" and "123,45 ДИН". For
international currency, you need only replace "ДИН" with "YUM".
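
As a starting point for the test code Petter asks about, the expected
strings can be written down as a small reference check. This Python
sketch does not call strfmon(); it only encodes the international
format given above (comma as decimal point, symbol after the amount,
separated by a space). The helper name format_money is hypothetical,
and the local currency symbol is left as a parameter.

```python
from decimal import Decimal


def format_money(amount, symbol, decimal_point=","):
    """Render amount the way this message expects it to look.

    amount is passed as a string to avoid binary float rounding.
    """
    value = Decimal(amount).quantize(Decimal("0.01"))
    sign = "-" if value < 0 else ""
    whole, frac = f"{abs(value):.2f}".split(".")
    return f"{sign}{whole}{decimal_point}{frac} {symbol}"


# Expected international-format results, taken from the message.
EXPECTED = {
    "1.23": "1,23 YUM",
    "-1.23": "-1,23 YUM",
    "123.45": "123,45 YUM",
}

if __name__ == "__main__":
    for amount, expected in EXPECTED.items():
        result = format_money(amount, "YUM")
        print(amount, "->", result, "ok" if result == expected else "MISMATCH")
```

A real regression test would compare these strings against what
strfmon() returns under the installed locale.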

With proper testing, I've discovered several omissions in the
LC_MONETARY definition, and I've fixed these in the new revision.

>
> It would also be useful to provide sorting examples with the correct
> sorting order.  There are scripts to test the sorting of a given
> locale, and it is useful to detect if the ordering is correct or not.

I have some non-extensive tests, but they're far from being able
to detect everything.  Generally though, both ISO 14651 and the
Unicode Collation Algorithm (TR10, I think) do the job well for
Serbian (in Cyrillic script, of course).  Collation for the Serbian
Latin transcription is more problematic and more difficult to do,
and I currently lack the time to do it properly through locale
data. Actually, if you're aiming for full correctness, it cannot be
done programmatically at all without accounting for exceptions, but
it can be made to work in most cases.
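
To show where the difficulty with the Latin transcription comes from,
here is a rough Python sketch of a sort key that treats the digraphs
dž, lj and nj as single letters. It is only an illustration: the name
latin_sort_key is hypothetical, input is assumed to be precomposed
(NFC) text, and this is not how the locale's LC_COLLATE data works.

```python
# Serbian Latin alphabet order, with digraphs as single letters.
ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f",
    "g", "h", "i", "j", "k", "l", "lj", "m", "n", "nj",
    "o", "p", "r", "s", "š", "t", "u", "v", "z", "ž",
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}
DIGRAPHS = ("dž", "lj", "nj")


def latin_sort_key(word):
    """Return a list of letter ranks, greedily matching digraphs."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            key.append(RANK[pair])
            i += 2
        else:
            ch = word[i]
            # Characters outside the alphabet sort after all letters.
            key.append(RANK.get(ch, len(ALPHABET) + ord(ch)))
            i += 1
    return key


if __name__ == "__main__":
    words = ["njiva", "nos", "nada", "ljubav", "luk"]
    print(sorted(words, key=latin_sort_key))
```

The greedy match is exactly the part that cannot be fully correct: in
a word such as "injekcija" the n and j are two separate letters, yet
the sketch still reads them as the digraph nj. Handling such words
needs an exception list.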

Thanks again for your great input,
Danilo

Attachment: sr_CS
Description: Updated Serbian locale

Attachment: txt00004.txt
Description: Updated Serbian Latin locale

