This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [PATCH/RFA] Add CJK ambiguous character handling dependent on language


2009/6/3 Corinna Vinschen <vinschen@redhat.com>:
> Hi,
>
> as discussed three weeks ago, I'll now propose the following patch.  It
> changes __wcwidth along the lines of Markus Kuhn's code and Iwamuro
> Motonori's proposal to use the language set via setlocale(1) to return
> different character widths for the "CJK Ambiguous Width" category of
> characters.  Tested on Cygwin.
>
> Ok to apply?

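It looks good, but I don't think the order of the tests is good for
performance: the CJK check and the search of the ambiguous table run
before the cheap range tests, so even plain ASCII characters go
through the __locale_cjk_lang () call first.
How about the following code?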

--- libc/string/wcwidth.c.ORIG	2009-06-04 02:00:48.015625000 +0900
+++ libc/string/wcwidth.c	2009-06-04 02:32:36.234375000 +0900
@@ -278,21 +278,28 @@
     { 0xE0100, 0xE01EF }
   };

-  /* binary search in table of ambiguous characters */
-  if (__locale_cjk_lang ()
-      && bisearch(ucs, ambiguous,
-		  sizeof(ambiguous) / sizeof(struct interval) - 1))
-    return 2;
-
-  /* test for 8-bit control characters */
+  /* Test for NUL character */
   if (ucs == 0)
     return 0;
-  if (ucs < 32 || (ucs >= 0x7f && ucs < 0xa0))
+
+  /* Test for printable ASCII characters */
+  if (ucs >= 0x20 && ucs < 0x7f)
+    return 1;
+
+  /* Test for control characters */
+  if (ucs < 0xa0)
     return -1;
+
   /* Test for surrogate pair values. */
   if (ucs >= 0xd800 && ucs <= 0xdfff)
     return -1;

+  /* binary search in table of ambiguous characters */
+  if (__locale_cjk_lang ()
+      && bisearch(ucs, ambiguous,
+		  sizeof(ambiguous) / sizeof(struct interval) - 1))
+    return 2;
+
   /* binary search in table of non-spacing characters */
   if (bisearch(ucs, combining,
 	       sizeof(combining) / sizeof(struct interval) - 1))
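
For readers who want to try the reordered tests outside of newlib, here is
a minimal standalone sketch.  It is not the newlib source: sketch_width,
sketch_ambiguous and the cjk_lang parameter are hypothetical stand-ins for
__wcwidth, the ambiguous table and __locale_cjk_lang (), the table is only
a three-entry excerpt, and the combining and wide-character tables are
left out, so everything that falls through is reported as width 1.

```c
#include <stddef.h>

/* Hypothetical stand-in for the interval type used by the patch. */
struct interval { long first; long last; };

/* Binary search over sorted, non-overlapping intervals.
   max is the index of the last table entry. */
static int bisearch(long ucs, const struct interval *table, int max)
{
  int min = 0;

  /* Cheap rejection when ucs lies outside the whole table. */
  if (ucs < table[0].first || ucs > table[max].last)
    return 0;
  while (max >= min) {
    int mid = min + (max - min) / 2;
    if (ucs > table[mid].last)
      min = mid + 1;
    else if (ucs < table[mid].first)
      max = mid - 1;
    else
      return 1;
  }
  return 0;
}

/* Illustrative excerpt only; the real table is much longer. */
static const struct interval sketch_ambiguous[] = {
  { 0x00A1, 0x00A1 }, { 0x00A4, 0x00A4 }, { 0x00A7, 0x00A8 }
};

/* cjk_lang is an assumed replacement for __locale_cjk_lang (). */
static int sketch_width(long ucs, int cjk_lang)
{
  /* NUL character */
  if (ucs == 0)
    return 0;

  /* Printable ASCII: the common case, answered without any lookup. */
  if (ucs >= 0x20 && ucs < 0x7f)
    return 1;

  /* What is left below 0xa0 is C0 controls, DEL and C1 controls. */
  if (ucs < 0xa0)
    return -1;

  /* Surrogate pair values */
  if (ucs >= 0xd800 && ucs <= 0xdfff)
    return -1;

  /* Only now consult the locale flag and the ambiguous table. */
  if (cjk_lang
      && bisearch(ucs, sketch_ambiguous,
                  sizeof(sketch_ambiguous) / sizeof(struct interval) - 1))
    return 2;

  /* Combining and wide tables omitted in this sketch. */
  return 1;
}
```

With this ordering, sketch_width('A', 1) returns 1 without touching the
table, while sketch_width(0xA1, 1) returns 2 and sketch_width(0xA1, 0)
returns 1.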
-- 
IWAMURO Motonori <http://vmi.jp/>

