This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH v3] faster strlen on x64


Hi, 

After testing by Liuba, I prepared the final version of my patch
(attached and on the neleai/strlen branch).

I used hooking to examine the behaviour of implementations in the wild; the
code can be downloaded from http://kam.mff.cuni.cz/~ondra/strlen_profile.tar.bz2
(Run ./benchmarks for unit tests; read TODO, as it is not complete.)
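
As a rough illustration of the kind of hooking mentioned above, here is a
minimal sketch that wraps strlen and records a histogram of the observed
lengths. The names profiled_strlen, len_bucket, len_histogram and
LEN_BUCKETS are hypothetical and chosen for this example only; the code in
the tarball may work quite differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bucket count, chosen for illustration only. */
#define LEN_BUCKETS 8

/* Bucket 0 counts empty strings; bucket k counts lengths in
   [2^(k-1), 2^k); the last bucket also takes everything longer. */
static unsigned long len_histogram[LEN_BUCKETS];

static unsigned int len_bucket(size_t len)
{
    unsigned int b = 0;
    while (len != 0 && b < LEN_BUCKETS - 1) {
        len >>= 1;
        b++;
    }
    return b;
}

/* Wrapper that forwards to the real strlen and records the result. */
size_t profiled_strlen(const char *s)
{
    size_t len = strlen(s);
    len_histogram[len_bucket(len)]++;
    return len;
}
```

Calling profiled_strlen in place of strlen in a program of interest and
dumping len_histogram at exit gives a coarse picture of which string lengths
dominate in practice.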

No additional failures on x64.

Uses of strlen_* in strcat are inlined for now; optimizations will come
after I deal with strcpy.

It could also be used in the linker; I split this functionality into
an additional patch.

Ondra

2013-01-31  Ondrej Bilka  <neleai@seznam.cz>

	* sysdeps/x86_64/strlen.S: Replace with new SSE2 based 
	implementation which is faster on all x86_64 architectures.
	Tested on AMD, Intel Nehalem, Atom, SNB, IVB, Haswell.
	* sysdeps/x86_64/strnlen.S: Likewise.

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines):
	Remove all multiarch strlen and strnlen versions.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Update.
	Remove strlen and strnlen related parts.

	* sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S: Update.
	Inline strlen part.
	* sysdeps/x86_64/multiarch/strcat-ssse3.S: Likewise.

	* sysdeps/x86_64/multiarch/strlen.S: Remove.
	* sysdeps/x86_64/multiarch/strlen-sse2-no-bsf.S: Remove.
	* sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: Remove.
	* sysdeps/x86_64/multiarch/rtld-strlen.S: Remove.
	* sysdeps/x86_64/multiarch/strlen-sse4.S: Remove.
	* sysdeps/x86_64/multiarch/strnlen.S: Remove.
	* sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S: Remove.

Attachment: faster_strlen_on_x64.patch
Description: Text document
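
For readers unfamiliar with the general technique, the following is a minimal
sketch of an SSE2-based strlen written with C intrinsics. It is not the
assembly from the attached patch: the name sketch_strlen_sse2 is hypothetical,
the loop handles only one 16-byte block per iteration, and __builtin_ctz
assumes GCC or Clang. Like most such routines, it reads a few bytes before
the start of the string within the same aligned 16-byte block, which is
outside what ISO C guarantees but cannot cross a page boundary.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration; not the routine from the patch. */
size_t sketch_strlen_sse2(const char *s)
{
    const __m128i zero = _mm_setzero_si128();

    /* Round the pointer down to a 16-byte boundary so that every
       load is aligned and stays within a single page. */
    uintptr_t addr = (uintptr_t)s;
    const char *p = (const char *)(addr & ~(uintptr_t)15);
    unsigned int offset = (unsigned int)(addr & 15);

    /* Compare 16 bytes against zero; each set mask bit marks a NUL. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned int mask =
        (unsigned int)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));

    /* Discard matches in the bytes that precede the string. */
    mask &= ~0u << offset;

    while (mask == 0) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned int)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    }

    /* The lowest set bit is the index of the first NUL in this block. */
    return (size_t)((p - s) + (ptrdiff_t)__builtin_ctz(mask));
}
```

Because pcmpeqb and pmovmskb are part of baseline SSE2, a routine of this
shape runs on every x86_64 processor without runtime dispatch.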

