This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



EUC-JP and the Yen sign



The EUC-JP converter currently maps the half-width Yen sign (U+00A5) to
backslash (0x5C). These are quite distinct glyphs, and it doesn't make
sense to confuse them.
Part of the confusion comes from the unicode.org SHIFTJIS.TXT file, which
maps 0x5C to U+00A5, but even Microsoft does not follow this table in their
CP932 encoding table. (More precisely, they say 0x5C corresponds to U+005C
but they display it as a Yen sign...)
Since Unix does not use backslash as the directory/filename separator, we
don't have that problem and should map according to the glyph: YEN to YEN.

Note also that the XFree86 "ja" and "ja.JIS" locales use the ISO8859-1:GL
(= US-ASCII) fonts as their primary font choice for the 0x00..0x7F range.
Therefore 0x5C in Japanese XFree86 environments will look like a
backslash, not like a Yen sign.

The patch below maps the half-width Yen sign (U+00A5) to the full-width
Yen sign (U+FFE5, EUC-JP 0xA1 0xEF) instead.
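
To make the intended behaviour concrete, here is a minimal standalone
sketch of the mapping, outside the iconv skeleton macros. The function
names sketch_to_eucjp and sketch_from_eucjp are hypothetical and do not
exist in glibc; the sketch covers only the special cases discussed in this
mail and assumes everything else is left to the JIS table lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not glibc code: encode the special cases discussed
   in this mail into EUC-JP.  Returns the number of bytes written to OUT
   (which must have room for 2), or 0 if CH is outside this sketch.  */
static size_t
sketch_to_eucjp (uint32_t ch, unsigned char *out)
{
  if (ch < 0x8e || (ch >= 0x90 && ch <= 0x9f))
    {
      /* Plain ASCII or C1: copied through unchanged.  */
      out[0] = (unsigned char) ch;
      return 1;
    }
  if (ch == 0x203e)
    {
      /* Overscore => asciitilde, unchanged by the patch.  */
      out[0] = 0x7e;
      return 1;
    }

  /* Map half-width YEN sign to full-width YEN sign.  */
  if (ch == 0x00a5)
    ch = 0xffe5;

  if (ch == 0xffe5)
    {
      /* Full-width YEN sign is 0xA1 0xEF in EUC-JP.  */
      out[0] = 0xa1;
      out[1] = 0xef;
      return 2;
    }

  /* Everything else would go through the JIS tables (not modelled).  */
  return 0;
}

/* Hypothetical reverse direction for the same cases.  Returns the UCS-4
   value, or 0 if the input is outside this sketch.  */
static uint32_t
sketch_from_eucjp (const unsigned char *in, size_t len)
{
  if (len == 1 && in[0] < 0x80)
    return in[0];		/* 0x5C stays U+005C.  */
  if (len == 2 && in[0] == 0xa1 && in[1] == 0xef)
    return 0xffe5;		/* Full-width YEN sign.  */
  return 0;
}
```

In this sketch U+00A5 encodes to 0xA1 0xEF, which decodes back to U+FFE5
rather than U+00A5. That is why the pair 0xA1EF / 0x00A5 has to be listed
in EUC-JP.irreversible.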


2000-10-14  Bruno Haible  <haible@clisp.cons.org>

	* iconvdata/euc-jp.c (BODY for TO_LOOP): Map U00A5 to \xa1\xef,
	not to \x5c.
	* iconvdata/EUC-JP.irreversible: Adjust accordingly.

*** glibc-20001010/iconvdata/euc-jp.c.bak	Wed Jul 12 18:11:43 2000
--- glibc-20001010/iconvdata/euc-jp.c	Sat Oct 14 17:26:09 2000
***************
*** 170,178 ****
      if (ch < 0x8e || (ch >= 0x90 && ch <= 0x9f))			      \
        /* It's plain ASCII or C1.  */					      \
        *outptr++ = ch;							      \
-     else if (ch == 0xa5)						      \
-       /* YEN sign => backslash  */					      \
-       *outptr++ = 0x5c;							      \
      else if (ch == 0x203e)						      \
        /* overscore => asciitilde */					      \
        *outptr++ = 0x7e;							      \
--- 170,175 ----
***************
*** 180,185 ****
--- 177,186 ----
        {									      \
  	/* Try the JIS character sets.  */				      \
  	size_t found;							      \
+ 									      \
+ 	/* Map half-width YEN sign to full-width YEN sign.  */		      \
+ 	if (__builtin_expect (ch == 0x00a5, 0))				      \
+ 	  ch = 0xffe5;							      \
  									      \
  	/* See whether we have room for at least two characters.  */	      \
  	if (__builtin_expect (outptr + 1 >= outend, 0))			      \
*** glibc-20001010/iconvdata/EUC-JP.irreversible.bak	Tue Sep  5 03:47:42 2000
--- glibc-20001010/iconvdata/EUC-JP.irreversible	Sat Oct 14 17:28:39 2000
***************
*** 1,6 ****
- 0x5C	0x00A5
  0x7E	0x203E
  0x8FA2B7	0x007E
  0x8FA2B7	0xFF5E
  0xA1C0	0x005C
  0xA1C0	0xFF3C
--- 1,6 ----
  0x7E	0x203E
  0x8FA2B7	0x007E
  0x8FA2B7	0xFF5E
  0xA1C0	0x005C
  0xA1C0	0xFF3C
+ 0xA1EF	0x00A5
