This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATH: removing quadratic complexity strstr, strcasestr and strcasestr-nonascii


On 30/06/2012, at 3:31 AM, Dmitrieva Liubov wrote:

> This patch removes SSE42 strstr, strcasestr and strcasestr-nonascii
> because these SSE42 algorithms have quadratic complexity.
> 
> The patch was checked on x86_64 and x86_32.
> 
> ChangeLog:
> 
> 2012-06-29  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>
> 
>       * sysdeps/i386/i686/multiarch/Makefile: Update
>       Remove strstr, strcasesrt, strcasestr-nonascii
>       SSE42 optimizations since algos have quadratic complexity
>       * sysdeps/i386/i686/multiarch/strcasestr-c.c: Delete
>       * sysdeps/i386/i686/multiarch/strcasestr-nonascii.c: Delete
>       * sysdeps/i386/i686/multiarch/strstr-c.c: Delete
>       * sysdeps/i386/i686/multiarch/strstr.c: Delete
> 
>       * sysdeps/x86_64/multiarch/Makefile: Update
>       Remove strstr, strcasesrt, strcasestr-nonascii
>       SSE42 optimizations since algos have quadratic complexity
>       * sysdeps/x86_64/multiarch/strcasestr-c.c: Delete
>       * sysdeps/x86_64/multiarch/strcasestr-nonascii.c: Delete
>       * sysdeps/x86_64/multiarch/strstr-c.c: Delete
>       * sysdeps/x86_64/multiarch/strstr.c: Delete
> 

While I don't have strong opinion about this patch, it doesn't seem right to throw away a perfectly good SSE42 implementation that outperforms the generic one by a factor of 2 on small needles.

As suggested in http://sourceware.org/bugzilla/show_bug.cgi?id=11607#c4, we can use a hybrid approach and use SSE42 implementation for short needles while falling back to generic O(n) implementation for long needles.  String/str[case]str.c has

  if (needle_len < LONG_NEEDLE_THRESHOLD)
    return two_way_short_needle ((const unsigned char *) haystack,
				 haystack_len,
				 (const unsigned char *) needle, needle_len);
  return two_way_long_needle ((const unsigned char *) haystack, haystack_len,
			      (const unsigned char *) needle, needle_len);

So, if you renamed current __strstr_sse42 to two_way_short_needle_sse42, and adjusted ifunc definition to resolve two_way_short_needle to either SSE42 implementation or to the generic one -- you would get the best of both worlds.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]