This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] S/390: Disable two UTF conversion instructions


Hi,

unfortunately it turned out that the UTF-16 to UTF-32 and UTF-16 to
UTF-8 conversion instructions fail to recognize certain conditions if
the input stream is corrupted.

The attached patch disables these instructions and adjusts the
software implementation in order to take care of the error condition.
Hopefully we can re-enable them if a millicode fix becomes available.

Tested without regressions on s390x.

Please apply if you think it is ok.

Bye,

-Andreas-


2010-02-11  Andreas Krebbel  <Andreas.Krebbel@de.ibm.com>

	* sysdeps/s390/s390-64/utf8-utf16-z9.c: Disable hardware
	instructions cu21 and cu24.  Add well-formedness checking
	parameter and adjust the software implementation.
	* sysdeps/s390/s390-64/utf16-utf32-z9.c: Likewise.


Index: libc/sysdeps/s390/s390-64/utf8-utf16-z9.c
===================================================================
--- libc.orig/sysdeps/s390/s390-64/utf8-utf16-z9.c
+++ libc/sysdeps/s390/s390-64/utf8-utf16-z9.c
@@ -345,9 +345,12 @@ gconv_end (struct __gconv_step *data)
    Operation.  */
 #define BODY								\
   {									\
-    if (GLRO (dl_hwcap) & HWCAP_S390_ETF3EH)				\
+    /* The hardware instruction currently fails to report an error for	\
+       isolated low surrogates so we have to disable the instruction	\
+       until this gets resolved.  */					\
+    if (0) /* (GLRO (dl_hwcap) & HWCAP_S390_ETF3EH) */			\
       {									\
-	HARDWARE_CONVERT ("cu21 %0, %1");				\
+	HARDWARE_CONVERT ("cu21 %0, %1, 1");				\
 	if (inptr != inend)						\
 	  {								\
 	    /* Check if the third byte is				\
@@ -388,7 +391,7 @@ gconv_end (struct __gconv_step *data)
 									\
 	outptr += 2;							\
       }									\
-    else if (c >= 0x0800 && c <= 0xd7ff)				\
+    else if ((c >= 0x0800 && c <= 0xd7ff) || c > 0xdfff)		\
       {									\
 	/* Three byte UTF-8 char.  */					\
 									\
Index: libc/sysdeps/s390/s390-64/utf16-utf32-z9.c
===================================================================
--- libc.orig/sysdeps/s390/s390-64/utf16-utf32-z9.c
+++ libc/sysdeps/s390/s390-64/utf16-utf32-z9.c
@@ -203,7 +203,10 @@ gconv_end (struct __gconv_step *data)
    swapping).  */
 #define BODY								\
   {									\
-    if (GLRO (dl_hwcap) & HWCAP_S390_ETF3EH)				\
+    /* The hardware instruction currently fails to report an error for	\
+       isolated low surrogates so we have to disable the instruction	\
+       until this gets resolved.  */					\
+    if (0) /* (GLRO (dl_hwcap) & HWCAP_S390_ETF3EH) */			\
       {									\
 	HARDWARE_CONVERT ("cu24 %0, %1, 1");				\
 	if (inptr != inend)						\
@@ -229,6 +232,12 @@ gconv_end (struct __gconv_step *data)
       }									\
     else								\
       {									\
+        /* An isolated low-surrogate was found.  This has to be         \
+	   considered ill-formed.  */					\
+        if (__builtin_expect (u1 >= 0xdc00, 0))				\
+	  {								\
+	    STANDARD_FROM_LOOP_ERR_HANDLER (2);				\
+	  }								\
 	/* It's a surrogate character.  At least the first word says	\
 	   it is.  */							\
 	if (__builtin_expect (inptr + 4 > inend, 0))			\


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]