This is the mail archive of the
cygwin-developers
mailing list for the Cygwin project.
Re: More about charsets
On Mar 27 18:52, Andy Koppe wrote:
> On 27 March 2010 17:53, Corinna Vinschen:
> > I also intend to make GB2312 the default name, rather than GBK since
> > that's the default for these languages in Linux.
>
> You mean have nl_langinfo(CODESET) return GB2312 when something like
> "zh_CN.GBK" is selected? Not sure about that, because it might cause
No. What I mean is, if somebody chooses a language_TERRITORY code whose
default codepage is 936, then set the codeset to "GB2312". If somebody
explicitly chooses "GBK", stick to it. If somebody chooses "EUC-CN",
map it to GB2312. That reflects what Linux does. So this is what
happens:
setlocale (LC_CTYPE, "zh_CN");
printf ("%s\n", nl_langinfo (CODESET));
==> "GB2312"
setlocale (LC_CTYPE, "zh_CN.gbk");
printf ("%s\n", nl_langinfo (CODESET));
==> "GBK"
setlocale (LC_CTYPE, "zh_CN.eucCN");
printf ("%s\n", nl_langinfo (CODESET));
==> "GB2312"
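The three cases above can be sketched as a small lookup. This is only an
illustration of the rule described in this mail, not Cygwin's actual code:
the function name pick_codeset is hypothetical, and "zh_CN" merely stands in
for any language_TERRITORY code whose default codepage is 936.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical helper, not Cygwin's implementation: return the codeset
   name for a locale string, or NULL if this sketch does not cover it. */
const char *
pick_codeset (const char *locale)
{
  const char *dot = strchr (locale, '.');
  char charset[32];
  size_t len;

  if (dot == NULL)
    {
      /* No explicit charset.  Assumption: zh_CN is the only locale with
         default codepage 936 that this sketch knows about. */
      len = strcspn (locale, "@");
      if (len == 5 && strncmp (locale, "zh_CN", 5) == 0)
        return "GB2312";
      return NULL;
    }

  /* Explicit charset: copy it, dropping any "@modifier" suffix. */
  len = strcspn (dot + 1, "@");
  if (len >= sizeof charset)
    return NULL;
  memcpy (charset, dot + 1, len);
  charset[len] = '\0';

  /* An explicitly chosen GBK is kept as it is. */
  if (strcasecmp (charset, "GBK") == 0)
    return "GBK";

  /* EUC-CN (in either spelling) and GB2312 itself map to GB2312. */
  if (strcasecmp (charset, "eucCN") == 0
      || strcasecmp (charset, "EUC-CN") == 0
      || strcasecmp (charset, "GB2312") == 0)
    return "GB2312";

  return NULL;
}
```

Under these assumptions, pick_codeset ("zh_CN") and
pick_codeset ("zh_CN.eucCN") both give "GB2312", while
pick_codeset ("zh_CN.gbk") gives "GBK". The charset comparison is
case-insensitive, so "gbk" and "GBK" are treated alike. For brevity the
sketch does not check the language part when a charset is given explicitly.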
> > Btw., apart from EUC-TW, what's missing as well is BIG5-HKSCS. I read
> > http://en.wikipedia.org/wiki/HKSCS and the Windows specific section,
> > but I'm still puzzled how this is supposed to work. Does Vista's
> > codepage 950 contain the HKSCS elements or not?!?
>
> Nope, it doesn't. For XP there's an installable package that turns
> codepage 950 into BIG5-HKSCS. As far as I understand it, in Vista MS
> gave up on the idea of extending BIG5, and instead interpreted the
> HKSCS spec as a requirement for fonts and programs to support the
> Unicode codepoints needed for Cantonese. Here's Michael Kaplan
> sounding off on codepage "951":
> http://blogs.msdn.com/michkap/archive/2007/05/12/2561904.aspx
Too bad. I hope it's not overly critical. Right now standard Big5 is
our default for zh_HK as well as for zh_TW.
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat