This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
EUC-JP and the Yen sign
- To: libc-alpha at sources dot redhat dot com
- Subject: EUC-JP and the Yen sign
- From: Bruno Haible <haible at ilog dot fr>
- Date: Sun, 15 Oct 2000 14:00:11 +0200 (CEST)
The EUC-JP converter currently maps the half-width Yen sign (U+00A5) to a
backslash (0x5C). These are quite distinct glyphs, and it doesn't make sense
to conflate them.
Part of the confusion comes from the unicode.org SHIFTJIS.TXT file, which
maps 0x5C to U+00A5, but even Microsoft does not follow this table in their
CP932 encoding. (More precisely, they say 0x5C corresponds to U+005C, but
they display it as a Yen sign...)
Since on Unix we don't have the directory/filename separator problem,
we should map it according to the glyph: YEN to YEN.
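
To see which bytes a given iconv implementation actually emits for U+00A5,
a small check along the following lines may help. This is only a sketch:
utf8_to_eucjp is a hypothetical helper name, and it assumes the installed
iconv accepts the codeset names "EUC-JP" and "UTF-8".

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of glibc): convert UTF-8 input to EUC-JP
   with iconv(3).  Returns the number of bytes written to OUT, or -1 on any
   failure (converter unavailable, unconvertible input, output too small).  */
long
utf8_to_eucjp (const char *in, size_t inlen,
               unsigned char *out, size_t outsize)
{
  iconv_t cd = iconv_open ("EUC-JP", "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  /* iconv() takes char ** for the input but does not modify the bytes.  */
  char *inptr = (char *) in;
  char *outptr = (char *) out;
  size_t inleft = inlen;
  size_t outleft = outsize;

  size_t rc = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  iconv_close (cd);

  if (rc == (size_t) -1)
    return -1;
  return (long) (outsize - outleft);
}
```

Calling utf8_to_eucjp ("\xc2\xa5", 2, buf, sizeof buf) and printing the
returned bytes shows the mapping in effect: the current converter should
give the single byte 0x5c, while a YEN-to-YEN mapping gives a two-byte
sequence. Other iconv implementations may behave differently.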
Note also that the XFree86 "ja" and "ja.JIS" locales use the ISO8859-1:GL
(= US-ASCII) fonts as their primary font choice for the 0x00..0x7F range.
Therefore 0x5C in Japanese XFree86 environments will look like a
backslash, not like a Yen sign.
The patch below maps the half-width Yen sign to the full-width Yen sign
instead.
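
As a standalone illustration of the change (not the glibc code itself),
the old and new treatment of the Yen sign can be sketched as follows.
The function encode_yen and its patched flag are hypothetical names used
only for this example; the byte values are the ones that appear in the
patch and in EUC-JP.irreversible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: encode U+00A5 or U+FFE5 to EUC-JP.
   PATCHED == 0 models the old behaviour, PATCHED != 0 the new one.
   Writes the bytes to OUT and returns how many were written;
   returns 0 for any other code point, which this sketch ignores.  */
size_t
encode_yen (uint32_t ch, int patched, unsigned char out[2])
{
  if (ch != 0x00a5 && ch != 0xffe5)
    return 0;

  if (ch == 0x00a5 && !patched)
    {
      /* Old: half-width YEN sign => backslash.  */
      out[0] = 0x5c;
      return 1;
    }

  /* New: U+00A5 is treated like U+FFE5 (full-width YEN sign),
     whose EUC-JP encoding is \xa1\xef.  */
  out[0] = 0xa1;
  out[1] = 0xef;
  return 2;
}
```

With patched set to 0, U+00A5 comes out as the single byte 0x5c; with
patched nonzero, it comes out as the same two bytes as U+FFE5. Since two
code points then share one encoding, the U+00A5 direction is the
irreversible one.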

2000-10-14  Bruno Haible  <haible@clisp.cons.org>

	* iconvdata/euc-jp.c (BODY for TO_LOOP): Map U00A5 to \xa1\xef,
	not to \x5c.
	* iconvdata/EUC-JP.irreversible: Adjust accordingly.

*** glibc-20001010/iconvdata/euc-jp.c.bak Wed Jul 12 18:11:43 2000
--- glibc-20001010/iconvdata/euc-jp.c Sat Oct 14 17:26:09 2000
***************
*** 170,178 ****
      if (ch < 0x8e || (ch >= 0x90 && ch <= 0x9f)) \
        /* It's plain ASCII or C1. */ \
        *outptr++ = ch; \
-     else if (ch == 0xa5) \
-       /* YEN sign => backslash */ \
-       *outptr++ = 0x5c; \
      else if (ch == 0x203e) \
        /* overscore => asciitilde */ \
        *outptr++ = 0x7e; \
--- 170,175 ----
***************
*** 180,185 ****
--- 177,186 ----
        { \
          /* Try the JIS character sets. */ \
          size_t found; \
+ \
+         /* Map half-width YEN sign to full-width YEN sign. */ \
+         if (__builtin_expect (ch == 0x00a5, 0)) \
+           ch = 0xffe5; \
  \
          /* See whether we have room for at least two characters. */ \
          if (__builtin_expect (outptr + 1 >= outend, 0)) \
*** glibc-20001010/iconvdata/EUC-JP.irreversible.bak Tue Sep 5 03:47:42 2000
--- glibc-20001010/iconvdata/EUC-JP.irreversible Sat Oct 14 17:28:39 2000
***************
*** 1,6 ****
- 0x5C 0x00A5
  0x7E 0x203E
  0x8FA2B7 0x007E
  0x8FA2B7 0xFF5E
  0xA1C0 0x005C
  0xA1C0 0xFF3C
--- 1,6 ----
  0x7E 0x203E
  0x8FA2B7 0x007E
  0x8FA2B7 0xFF5E
  0xA1C0 0x005C
  0xA1C0 0xFF3C
+ 0xA1EF 0x00A5