This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.

ARM Patch: correctly deal with big-endian string termination

From: Richard Earnshaw <Richard dot Earnshaw at buzzard dot freeserve dot co dot uk>
To: newlib at sourceware dot org
Date: Tue, 24 Mar 2009 23:47:34 +0000
Subject: ARM Patch: correctly deal with big-endian string termination

If a word loaded from the string being compared contains the byte-pair
0x0100 at any position within the word (i.e. 0xXXXX0100, 0xXX0100XX or
0x0100XXXX) then the syndrome value calculated by the zero-detection
algorithm will contain 0x8080 at the position of these two bytes.
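
To make the false positive concrete, here is a minimal C sketch of the
syndrome calculation, using the same expression as the C macro in the
patch with b1 taken to be 0x01010101.  The function name syndrome() and
the sample words are only illustrative; they are not taken from
strcmp.c.

```c
#include <stdint.h>

/* Zero-detection syndrome for one 32-bit word: bit 7 of a byte is set
   when the algorithm flags that byte as a possible nul.  A 0x01 byte
   that sits directly above a 0x00 byte is flagged as well, because the
   borrow out of the zero byte propagates into it. */
uint32_t syndrome(uint32_t w)
{
    const uint32_t b1 = 0x01010101u;      /* assumed value of b1 */
    return ((w - b1) & ~w) & (b1 << 7);   /* b1 << 7 == 0x80808080 */
}

/* Illustrative values:
     syndrome(0x41420100) == 0x00008080   0x01 flagged along with 0x00
     syndrome(0x41010042) == 0x00808000
     syndrome(0x01004142) == 0x80800000
     syndrome(0x41424300) == 0x00000080   only the real nul is flagged
     syndrome(0x41424344) == 0x00000000   no nul, nothing flagged */
```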

On a little-endian machine this makes no difference to the behaviour of
the algorithm because the interesting bytes (if any) are in the least
significant part of the word; but on big-endian machines the 0x01 byte
is the last byte in the string, so its false syndrome bit is seen before
that of the real terminator.  We therefore have to handle this case
by testing each of the last bytes explicitly.
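
The difference between the two byte orders can be sketched in C as
follows.  This is an illustration only, not the code in the patch: the
helper names are hypothetical, and the word is treated as a plain
integer whose most significant byte comes first in the string on
big-endian and last on little-endian.

```c
#include <stdint.h>

/* Same zero-detection syndrome as the algorithm uses. */
uint32_t syndrome(uint32_t w)
{
    return ((w - 0x01010101u) & ~w) & 0x80808080u;
}

/* Big-endian: string order runs from the most significant byte down.
   Returns the index (0..3) of the first byte the syndrome flags, or 4
   if no byte is flagged. */
int first_flagged_be(uint32_t w)
{
    uint32_t s = syndrome(w);
    for (int i = 0; i < 4; i++)
        if (s & (0x80000000u >> (8 * i)))
            return i;
    return 4;
}

/* Little-endian: string order runs from the least significant byte up. */
int first_flagged_le(uint32_t w)
{
    uint32_t s = syndrome(w);
    for (int i = 0; i < 4; i++)
        if (s & (0x80u << (8 * i)))
            return i;
    return 4;
}

/* Explicit test of each byte in big-endian string order, which is not
   fooled by the 0x01 0x00 pair.  Returns 4 if there is no nul. */
int first_nul_be(uint32_t w)
{
    for (int i = 0; i < 4; i++)
        if ((w & (0xff000000u >> (8 * i))) == 0)
            return i;
    return 4;
}
```

For the word 0x41420100, first_flagged_be() returns 2 (the 0x01 byte)
while first_nul_be() returns 3 (the real terminator); first_flagged_le()
returns 0, which is the real nul in little-endian string order.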

2009-03-24  Richard Earnshaw  <rearnsha@arm.com>

	* libc/machine/arm/strcmp.c (strcmp_unaligned): Correctly
	detect the nul-byte in a big-endian string.

Index: strcmp.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/machine/arm/strcmp.c,v
retrieving revision 1.3
diff -p -r1.3 strcmp.c
*** strcmp.c	23 Mar 2009 18:25:10 -0000	1.3
--- strcmp.c	24 Mar 2009 23:36:27 -0000
*************** strcmp_unaligned(const char* s1, const c
*** 192,197 ****
--- 192,198 ----
  	}								\
        if (__builtin_expect(((w1 - b1) & ~w1) & (b1 << 7), 0))		\
  	{								\
+ 	  /* See comment in assembler below re syndrome on big-endian */\
  	  if ((((w1 - b1) & ~w1) & (b1 << 7)) & mask)			\
  	    w2 RSHIFT= shift;						\
  	  else								\
*************** strcmp_unaligned(const char* s1, const c
*** 319,330 ****
        "b	8f\n"
  
   "5:\n\t"
!       "bics	r3, r3, #"MSB"\n\t"
        "bne	7f\n\t"
        "ldrb	w2, [wp2]\n\t"
        SHFT2LSB"	t1, w1, #24\n\t"
  #ifdef __ARMEB__
!       SHFT2LSB"	w2, w2, #24\n\t"
  #endif
        "b	8f\n"
  
--- 320,341 ----
        "b	8f\n"
  
   "5:\n\t"
! #ifdef __ARMEB__
!       /* The syndrome value may contain false ones if the string ends
! 	 with the bytes 0x01 0x00 */
!       "tst	w1, #0xff000000\n\t"
!       "itt	ne\n\t"
!       "tstne	w1, #0x00ff0000\n\t"
!       "tstne	w1, #0x0000ff00\n\t"
!       "beq	7f\n\t"
! #else
!       "bics	r3, r3, #0xff000000\n\t"
        "bne	7f\n\t"
+ #endif
        "ldrb	w2, [wp2]\n\t"
        SHFT2LSB"	t1, w1, #24\n\t"
  #ifdef __ARMEB__
!       "lsl	w2, w2, #24\n\t"
  #endif
        "b	8f\n"
  
*************** strcmp_unaligned(const char* s1, const c
*** 353,364 ****
        "b	2b\n"
  
   "5:\n\t"
!       SHFT2MSB"s	r3, r3, #16\n\t"
        "bne	7f\n\t"
        "ldrh	w2, [wp2]\n\t"
        SHFT2LSB"	t1, w1, #16\n\t"
  #ifdef __ARMEB__
!       SHFT2LSB"	w2, w2, #16\n\t"
  #endif
        "b	8f\n"
  
--- 364,384 ----
        "b	2b\n"
  
   "5:\n\t"
! #ifdef __ARMEB__
!       /* The syndrome value may contain false ones if the string ends
! 	 with the bytes 0x01 0x00 */
!       "tst	w1, #0xff000000\n\t"
!       "it	ne\n\t"
!       "tstne	w1, #0x00ff0000\n\t"
!       "beq	7f\n\t"
! #else
!       "lsls	r3, r3, #16\n\t"
        "bne	7f\n\t"
+ #endif
        "ldrh	w2, [wp2]\n\t"
        SHFT2LSB"	t1, w1, #16\n\t"
  #ifdef __ARMEB__
!       "lsl	w2, w2, #16\n\t"
  #endif
        "b	8f\n"
  
*************** strcmp_unaligned(const char* s1, const c
*** 390,397 ****
        SHFT2LSB"	w2, w2, #24\n\t"
        "b	8f\n"
   "5:\n\t"
!       "tst	r3, #128\n\t"
!       "bne	7f\n\t"
        "ldr	w2, [wp2], #4\n"
   "6:\n\t"
        SHFT2LSB"	t1, w1, #8\n\t"
--- 410,419 ----
        SHFT2LSB"	w2, w2, #24\n\t"
        "b	8f\n"
   "5:\n\t"
!       /* The syndrome value may contain false ones if the string ends
! 	 with the bytes 0x01 0x00 */
!       "tst	w1, #"LSB"\n\t"
!       "beq	7f\n\t"
        "ldr	w2, [wp2], #4\n"
   "6:\n\t"
        SHFT2LSB"	t1, w1, #8\n\t"

Follow-Ups:
- Re: ARM Patch: correctly deal with big-endian string termination
  - From: Jeff Johnston
