This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: New GB18030 gconv module for glibc (from ThizLinux Laboratory)
- From: Anthony Fok <anthony at thizlinux dot com>
- To: Ulrich Drepper <drepper at redhat dot com>
- Cc: Markus Scherer <markus dot scherer at us dot ibm dot com>, fai at thizlinux dot com, Bruno Haible <haible at ilog dot fr>, kevin at thizlinux dot com, libc-alpha at sources dot redhat dot com, sunnygu at thizgroup dot com, suzhe at gnuchina dot org, Yu Shao <yshao at redhat dot com>
- Date: Fri, 18 Jan 2002 10:07:36 +0800
- Subject: Re: New GB18030 gconv module for glibc (from ThizLinux Laboratory)
- References: <OFA37FFE2D.E724FA8B-ON88256B44.006732B8@raleigh.ibm.com> <m3zo3cbqos.fsf@myware.mynet>
On Thu, Jan 17, 2002 at 01:41:55PM -0800, Ulrich Drepper wrote:
> "Markus Scherer" <markus.scherer@us.ibm.com> writes:
>
> > I agree with what Anthony said about mapping code points: Even if they do
> > not have assigned characters,
>
> It is completely irrelevant what you think. The converters convert
> from the external charset to the internal private charset. The latter
> is defined in a way which disallows any non-Unicode position.
But U+33FF _is_ a valid and legal Unicode position. Again, remember that
"unassigned" and "illegal" are two different entities.
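To make the distinction concrete, here is a minimal Python sketch. The function name classify() and the three labels are hypothetical, chosen only for illustration. It treats surrogates and anything outside U+0000..U+10FFFF as "illegal" for conversion purposes, and uses the general category Cn from the interpreter's own Unicode database to spot unassigned positions. Note that Cn also covers noncharacters, and that the result for a given position depends on the Unicode version the interpreter ships with: U+33FF was unassigned at the time of this thread but was assigned in Unicode 4.0.

```python
import unicodedata


def classify(cp: int) -> str:
    """Classify a code position as 'illegal', 'unassigned' or 'assigned'.

    Hypothetical helper for illustration only.
    """
    # Outside the Unicode code space, or a surrogate: not convertible at all.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return "illegal"
    # General category Cn: no character assigned in this Unicode version
    # (this category also includes noncharacters such as U+FFFF).
    if unicodedata.category(chr(cp)) == "Cn":
        return "unassigned"
    return "assigned"


if __name__ == "__main__":
    for cp in (0x0041, 0x33FF, 0xD800, 0x110000):
        print(f"U+{cp:04X}: {classify(cp)}")
```

The point is that "unassigned" is a property that changes between Unicode versions, while "illegal" does not, so a converter that rejects unassigned positions goes stale with every new release of the standard.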
> What you do with your own code I don't care; but stay out of discussions
> like this when they are related to glibc.
libc-alpha is a public list, and I personally Cc'ed him and invited him to
join the discussion because Markus Scherer and Dirk Meyer are two of the
most authoritative sources on the GB18030 standard. Dirk translated the
GB18030 Standard into an English summary so that the rest of the world
(most of whom can't read Chinese) can learn about GB18030. Markus works
with the Chinese standards agency and the Unicode Consortium to provide
_official_ mapping tables for Unicode<->GB18030 conversion.
It is not what Markus thinks; it is _the_ GB18030 and Unicode standards!
And, as I said before, since GB18030 is in reality the UTF for Mainland
China (to retain GBK compatibility), what works for "-f ucs4 -t utf8" should
also work for "-f ucs4 -t gb18030". Either both work for all of
U+0000..U+D7FF, U+E000..U+10FFFF, or neither does. Otherwise, glibc is just
inconsistent, which implies a lack of understanding of the letter and spirit
of the GB18030 Standard.
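The consistency requirement above can be checked with a short script. This is only a sketch: it uses Python's built-in "utf-8" and "gb18030" codecs as a stand-in for the glibc gconv modules, so it says nothing about what glibc itself does, and the helper names round_trips() and consistent() as well as the sample list are hypothetical.

```python
# Boundary and sample positions from the ranges U+0000..U+D7FF and
# U+E000..U+10FFFF, including the U+33FF position discussed in this thread.
SAMPLES = [0x0000, 0x33FF, 0xD7FF, 0xE000, 0x10FFFF]


def round_trips(cp: int, encoding: str) -> bool:
    """Return True if chr(cp) survives encode followed by decode unchanged."""
    ch = chr(cp)
    try:
        return ch.encode(encoding).decode(encoding) == ch
    except UnicodeError:
        # The codec refused the position (e.g. a surrogate) in one direction.
        return False


def consistent(cp: int) -> bool:
    """Return True if both codecs agree on whether cp is convertible."""
    return round_trips(cp, "utf-8") == round_trips(cp, "gb18030")


if __name__ == "__main__":
    for cp in SAMPLES:
        print(f"U+{cp:04X}: utf-8={round_trips(cp, 'utf-8')} "
              f"gb18030={round_trips(cp, 'gb18030')}")
```

Replacing SAMPLES with the full ranges turns this into an exhaustive check; the same comparison could be run against iconv to test the actual glibc converters.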
I couldn't care less whether you use Yu Shao's or my code as long as
glibc does the _right_ thing, because the entire GNU system (XFree86,
GNOME, Gtk, etc. etc.) depends on glibc to do the right thing.
Best regards,
Anthony
--
Anthony Fok Tung-Ling
ThizLinux Laboratory <anthony@thizlinux.com> http://www.thizlinux.com/
Debian Chinese Project <foka@debian.org> http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp! http://www.olvc.ab.ca/