This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug libc/1386] New: iconv incorrectly convert bytes 1A, 1C and 7F for IBM943 and IBM942
- From: "grhoten at jtcsv dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: 28 Sep 2005 03:48:26 -0000
- Subject: [Bug libc/1386] New: iconv incorrectly convert bytes 1A, 1C and 7F for IBM943 and IBM942
- Reply-to: sourceware-bugzilla at sourceware dot org
The iconv conversion tables for IBM943 and IBM942 are incorrect. The byte
values 1A, 1C and 7F do not round-trip to Unicode (UTF-8) and back to these
Shift-JIS codepages. Normally, Unicode 1A round-trip maps to Shift-JIS 7F,
Unicode 7F round-trip maps to Shift-JIS 1C, and Unicode 1C round-trip maps to
Shift-JIS 1A. iconv does not behave this way. For example, in iconv Unicode 1F
converts to Shift-JIS 1C, while Shift-JIS 1C converts to Unicode 1A.
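
A minimal sketch of the round-trip check described above, written in Python
purely for illustration. The function name round_trip_failures and the
dictionaries are hypothetical and are not part of glibc or ICU; the table
values are only the three pairs stated in this report, and the "reported"
tables contain only the two iconv conversions mentioned in the example.

```python
# Expected round-trip pairs from the report: Unicode code point -> Shift-JIS byte.
EXPECTED_TO_SJIS = {0x1A: 0x7F, 0x7F: 0x1C, 0x1C: 0x1A}
# The reverse direction, derived from the same three pairs.
EXPECTED_TO_UNICODE = {sjis: uni for uni, sjis in EXPECTED_TO_SJIS.items()}


def round_trip_failures(to_sjis, to_unicode):
    """Return the code points that do not survive Unicode -> Shift-JIS -> Unicode."""
    return sorted(
        uni for uni, sjis in to_sjis.items() if to_unicode.get(sjis) != uni
    )


# The two conversions the report attributes to iconv (nothing else is assumed).
REPORTED_TO_SJIS = {0x1F: 0x1C}
REPORTED_TO_UNICODE = {0x1C: 0x1A}

if __name__ == "__main__":
    print("expected table failures:", round_trip_failures(EXPECTED_TO_SJIS, EXPECTED_TO_UNICODE))
    print("reported iconv failures:", round_trip_failures(REPORTED_TO_SJIS, REPORTED_TO_UNICODE))
```

With the expected pairs every code point maps back to itself, whereas the
reported iconv pair does not: 1F goes to 1C, and 1C comes back as 1A.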
If you would like mapping tables generated from IBM's official repository of
coded character sets, I recommend looking at these tables and using them as the
basis for iconv's tables.
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P12A-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P15A-2003.ucm
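
As a rough sketch of how round-trip entries could be pulled out of a .ucm
table like the ones above, assuming the usual ICU layout in which each CHARMAP
line looks like `<UXXXX> \xHH |N` and a flag of 0 marks a round-trip mapping.
The function name parse_round_trips is hypothetical, and the sample lines are
illustrative only: they restate the three pairs from this report and are not
copied from the linked files. Lines with several code points are ignored here.

```python
import re

# One CHARMAP entry: <UXXXX> \xHH[\xHH...] |N, where N is the precision flag.
ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)")
HEX_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def parse_round_trips(lines):
    """Map Unicode code point -> codepage bytes for entries flagged |0."""
    table = {}
    in_charmap = False
    for line in lines:
        line = line.strip()
        if line == "CHARMAP":
            in_charmap = True
            continue
        if line == "END CHARMAP":
            in_charmap = False
            continue
        if not in_charmap:
            continue
        match = ENTRY.match(line)
        if match and match.group(3) == "0":
            code_point = int(match.group(1), 16)
            raw = bytes(int(h, 16) for h in HEX_BYTE.findall(match.group(2)))
            table[code_point] = raw
    return table


# Illustrative sample (assumption): the three round-trip pairs from the report.
SAMPLE = [
    r"CHARMAP",
    r"<U001A> \x7F |0",
    r"<U001C> \x1A |0",
    r"<U007F> \x1C |0",
    r"END CHARMAP",
]

if __name__ == "__main__":
    for code_point, raw in sorted(parse_round_trips(SAMPLE).items()):
        print(f"U+{code_point:04X} <-> {raw.hex().upper()}")
```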
For reference, here are other tables that can be used for the same CCSID (coded
character set identifier).
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P120-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P12A-1998.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P130-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P14A-1998.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P14A-1999.ucm
(full disclosure) I work for IBM, and I am a part of the ICU project.
--
Summary: iconv incorrectly convert bytes 1A, 1C and 7F for IBM943
and IBM942
Product: glibc
Version: 2.3.3
Status: NEW
Severity: normal
Priority: P2
Component: libc
AssignedTo: gotom at debian dot or dot jp
ReportedBy: grhoten at jtcsv dot com
CC: glibc-bugs at sources dot redhat dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=1386