This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/1386] New: iconv incorrectly convert bytes 1A, 1C and 7F for IBM943 and IBM942


The iconv conversion tables for IBM943 and IBM942 are incorrect. The byte
values 1A, 1C and 7F do not round-trip to Unicode (UTF-8) and back to these
Shift-JIS codepages. Normally, Unicode 1A round-trips with Shift-JIS 7F,
Unicode 7F with Shift-JIS 1C, and Unicode 1C with Shift-JIS 1A.
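
The expected mapping above can be written down as a small table and checked for round-trip consistency. This is only a sketch in Python: the names EXPECTED_TO_SJIS, EXPECTED_TO_UNICODE and round_trips are made up for illustration, and the table holds only the three pairs listed in this report.

```python
# Expected round-trip pairs from this report:
# Unicode code point -> IBM943/IBM942 byte value.
EXPECTED_TO_SJIS = {0x1A: 0x7F, 0x7F: 0x1C, 0x1C: 0x1A}

# The reverse direction is the inverse of the table above.
EXPECTED_TO_UNICODE = {sjis: uni for uni, sjis in EXPECTED_TO_SJIS.items()}


def round_trips(to_sjis, to_unicode):
    """Return True if every code point maps to a byte that maps back to it."""
    return all(to_unicode.get(sjis) == uni for uni, sjis in to_sjis.items())


print(round_trips(EXPECTED_TO_SJIS, EXPECTED_TO_UNICODE))  # True
```

A table that sends two code points to the same byte, or maps a byte back to a different code point, would make round_trips return False.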

iconv does not behave this way. For example, it converts Unicode 1F to
Shift-JIS 1C, and Shift-JIS 1C to Unicode 1A.
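
One way to reproduce this on a given system is to push each of the three bytes through the codepage, to UTF-8, and back, and report any byte that comes back changed. The sketch below assumes the iconv command-line tool is installed and knows the IBM943 and IBM942 converters; the helper names iconv_cli and failed_round_trips are hypothetical, and the result depends on the installed glibc version.

```python
import shutil
import subprocess


def iconv_cli(data, src, dst):
    """Convert bytes with the system iconv command (assumed to be on PATH)."""
    result = subprocess.run(
        ["iconv", "-f", src, "-t", dst],
        input=data, capture_output=True, check=True,
    )
    return result.stdout


def failed_round_trips(convert, codepage, byte_values=(0x1A, 0x1C, 0x7F)):
    """Return (original, utf8, back) for bytes that do not survive
    codepage -> UTF-8 -> codepage."""
    failures = []
    for value in byte_values:
        original = bytes([value])
        as_utf8 = convert(original, codepage, "UTF-8")
        back = convert(as_utf8, "UTF-8", codepage)
        if back != original:
            failures.append((original, as_utf8, back))
    return failures


if __name__ == "__main__" and shutil.which("iconv"):
    for codepage in ("IBM943", "IBM942"):
        try:
            print(codepage, failed_round_trips(iconv_cli, codepage))
        except subprocess.CalledProcessError as err:
            # iconv exits non-zero if the converter is missing or rejects the input.
            print(codepage, "conversion failed:", err.stderr)
```

The converter is passed in as a function so the same check can be run against another implementation, such as a table-driven one, for comparison.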

If you would like mapping tables generated from IBM's official repository of
coded character sets, I recommend looking at the following tables and using
them as the basis for iconv's tables.

http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P12A-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P15A-2003.ucm
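
To compare such a table with iconv's, the round-trip entries of a .ucm file can be read into a dictionary. The following is a rough sketch that assumes the usual UCM line shape of `<UXXXX> \xNN |0`, where a trailing `|0` marks a round-trip mapping. The sample lines are illustrative only: they follow the mapping described in this report and are not copied from the files above, and parse_ucm_mappings is a hypothetical helper name.

```python
import re

# One mapping line: <UXXXX>, one or more \xNN bytes, then |<fallback indicator>.
UCM_LINE = re.compile(r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)")


def parse_ucm_mappings(text):
    """Return {code_point: encoded_bytes} for round-trip (|0) entries only."""
    table = {}
    for line in text.splitlines():
        match = UCM_LINE.match(line.strip())
        if match and match.group(3) == "0":
            code_point = int(match.group(1), 16)
            hex_pairs = re.findall(r"\\x([0-9A-Fa-f]{2})", match.group(2))
            table[code_point] = bytes(int(pair, 16) for pair in hex_pairs)
    return table


# Illustrative sample, not taken from the real files.
SAMPLE = r"""
<U001A> \x7F |0
<U001C> \x1A |0
<U007F> \x1C |0
"""

print(parse_ucm_mappings(SAMPLE))
```

Lines that do not match the pattern, such as headers and comments, are skipped.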

For reference, here are other tables that can be used for the same CCSIDs
(coded character set identifiers).
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P120-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-942_P12A-1998.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P130-1999.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P14A-1998.ucm
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/charset/data/ucm/ibm-943_P14A-1999.ucm
 
Full disclosure: I work for IBM, and I am part of the ICU project.

-- 
           Summary: iconv incorrectly convert bytes 1A, 1C and 7F for IBM943
                    and IBM942
           Product: glibc
           Version: 2.3.3
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: grhoten at jtcsv dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=1386


