This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] vectorized string functions
- From: Dmitrieva Liubov <liubov dot dmitrieva at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>, libc-alpha at sourceware dot org
- Date: Wed, 11 Jul 2012 18:34:11 +0400
- Subject: Re: [PATCH] vectorized string functions
Ondrej,
>> +sysdep_routines += strnlen strnlen_sse2 strnlen_ssse3 strnlen_sse4_1
>> + CFLAGS-strnlen_ssse3.c += -mssse3
>> + CFLAGS-strnlen_sse4_1.c += -msse4
It seems to me that you sometimes produce too many versions.
Strnlen example:
Objdump shows that strnlen_sse2 and strnlen_ssse3 are exactly the same
(GCC does not generate any SSSE3 instructions).
strnlen_sse4_1 differs from the others only in using ptest instead of
the pmovmskb + testl pair. It is known that this has almost no effect on
performance, yet we still pay the IFUNC wrapper overhead.
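To illustrate how small that difference is, here is a minimal sketch of the
two "is there a zero byte in this 16-byte block" checks written with
intrinsics. The function names has_zero_sse2 and has_zero_sse4_1 are
hypothetical and are not taken from the patch; the sketch assumes GCC or
Clang on x86 and is not the code the compiler generated for strnlen.

```c
#include <emmintrin.h>  /* SSE2: _mm_cmpeq_epi8, _mm_movemask_epi8 */
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128 */

/* Hypothetical helper: SSE2 sequence, pcmpeqb then pmovmskb + test. */
static int has_zero_sse2(const unsigned char *p)
{
    __m128i v  = _mm_loadu_si128((const __m128i *)p);
    __m128i eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());
    /* pmovmskb collects the top bit of each byte; nonzero means a match. */
    return _mm_movemask_epi8(eq) != 0;
}

/* Hypothetical helper: SSE4.1 sequence, pcmpeqb then ptest.
   The target attribute lets this compile without -msse4.1 on the
   command line (GCC/Clang extension). */
__attribute__((target("sse4.1")))
static int has_zero_sse4_1(const unsigned char *p)
{
    __m128i v  = _mm_loadu_si128((const __m128i *)p);
    __m128i eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());
    /* ptest sets ZF when (eq AND eq) is all zero, i.e. no byte matched. */
    return !_mm_testz_si128(eq, eq);
}
```

Both helpers share the same load and compare; only the final step differs.
Note that ptest only reports whether a match exists, so a real strnlen
still needs pmovmskb to locate the matching byte once one is found.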
>> delete mode 100644 sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S
Also, we should check for regressions on an Atom machine before removing
the Atom-specific no_bsf version.
--
Liubov Dmitrieva
Intel Corporation