This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.



Re: [nls 00248] gencat


Hi,

> We also realized that gencat could not handle multi-byte characters
> appropriately if the characters include \x5c (backslash), which is the case
> when the input source files are in Shift-JIS, for example. (As you know
> better than we do, many Japanese characters in SJIS have \x5c as the second
> byte.)

Yeah, this is a problem.  I don't know yet how to solve it.  Something
like the proposed patch seems to be required, but it is really
unfortunate since it means it will no longer be possible to compile all
message catalogs without major preparations.  Currently programs like
rpm simply compile all catalogs in a loop.  This will be impossible
since a specific locale has to be set for each catalog.
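
To make the underlying problem concrete, here is a minimal sketch (not
gencat's actual code) of why a byte-wise scan misreads Shift-JIS input.
The function names are hypothetical, and the lead-byte ranges
0x81..0x9f and 0xe0..0xfc are an assumption that covers common
Shift-JIS variants.

```c
#include <stddef.h>

/* Count bytes equal to 0x5c, treating the input as single-byte text.
   This is the naive, encoding-unaware view of the data.  */
size_t
count_backslashes_bytewise (const unsigned char *s, size_t len)
{
  size_t n = 0;
  for (size_t i = 0; i < len; ++i)
    if (s[i] == 0x5c)
      ++n;
  return n;
}

/* Return nonzero if C starts a two-byte Shift-JIS character.
   Assumed ranges: 0x81..0x9f and 0xe0..0xfc.  */
static int
sjis_lead_byte (unsigned char c)
{
  return (c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc);
}

/* Count only stand-alone 0x5c bytes.  The trail byte of a two-byte
   character is skipped, even when its value happens to be 0x5c.  */
size_t
count_backslashes_sjis (const unsigned char *s, size_t len)
{
  size_t n = 0;
  for (size_t i = 0; i < len; ++i)
    {
      if (sjis_lead_byte (s[i]) && i + 1 < len)
        ++i;                    /* Skip the trail byte.  */
      else if (s[i] == 0x5c)
        ++n;
    }
  return n;
}
```

For the byte sequence 0x83 0x5c (a single katakana character in
Shift-JIS) the first function reports one backslash and the second
reports none, which is the misinterpretation described above.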

One alternative would be to require the catalogs with known problems
(like the SJIS encoded catalogs) to contain a comment line specifying
the charset.  E.g., require the first line to contain

$ charset=SJIS

If no such line is found the C locale data is used.  Otherwise we'll
convert the whole input file using iconv() to UCS-4, work on it, and
convert back before writing it out.
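
The detection step of this proposal could look roughly like the
following.  This is only a sketch under assumptions: the helper name
parse_charset_line is hypothetical, and the exact syntax accepted
(a literal "$ charset=" prefix, name ended by whitespace) is one
possible reading of the example line above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if FIRST_LINE has the form "$ charset=NAME",
   copy NAME into BUF (of size BUFLEN) and return 1.  Return 0 if the
   line has a different form, the name is empty, or it does not fit.  */
int
parse_charset_line (const char *first_line, char *buf, size_t buflen)
{
  static const char prefix[] = "$ charset=";
  size_t plen = sizeof prefix - 1;

  if (strncmp (first_line, prefix, plen) != 0)
    return 0;

  /* The name runs up to the first whitespace or end of line.  */
  const char *name = first_line + plen;
  size_t n = strcspn (name, " \t\r\n");
  if (n == 0 || n >= buflen)
    return 0;

  memcpy (buf, name, n);
  buf[n] = '\0';
  return 1;
}
```

When the helper returns 1, the name in BUF would be used as the source
encoding for iconv_open(); when it returns 0, the input would be
processed as before with the C locale data.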

Does this sound acceptable?

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------
