-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
This patch is in the same vein as the strchr one. OK to apply?
2008-05-22  Eric Blake  <ebb9@byu.net>

	Optimize the generic and x86 strlen.
	* libc/string/strlen.c (strlen) [!__OPTIMIZE_SIZE__]: Pre-align
	data so unaligned searches aren't penalized.
	* libc/machine/i386/strlen.S (strlen) [!__OPTIMIZE_SIZE__]: Word
	operations are faster than repnz byte searches.
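For readers who have not seen the technique, here is a minimal sketch of a pre-aligned, word-at-a-time strlen. It is not the code from the patch: the name strlen_wordwise and the mask constants are my own assumptions for illustration. Note also that the word load can read a few bytes past the NUL (never past the end of the aligned word), which is harmless on typical hardware but falls outside what strict ISO C guarantees.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration; NOT the function from the patch. */
size_t strlen_wordwise(const char *str)
{
    const char *p = str;

    /* Pre-align: walk byte by byte until p sits on a word boundary. */
    while ((uintptr_t)p % sizeof(unsigned long) != 0) {
        if (*p == '\0')
            return (size_t)(p - str);
        p++;
    }

    /* 0x01 and 0x80 repeated in every byte, whatever the word size. */
    const unsigned long ones = ~0UL / 0xFF;
    const unsigned long highs = ones * 0x80;

    /* Word loop: stop at the first word that contains a zero byte. */
    for (;;) {
        unsigned long w;
        memcpy(&w, p, sizeof w);        /* aligned load of one word */
        if ((w - ones) & ~w & highs)    /* nonzero iff some byte is 0 */
            break;
        p += sizeof w;
    }

    /* Locate the NUL inside that final word. */
    while (*p != '\0')
        p++;
    return (size_t)(p - str);
}
```

The byte-wise prologue is what keeps an unaligned start from penalizing the word loop; after it, every load is aligned and so cannot straddle a page boundary.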
I'm attaching my test program as well. I compiled my changes to strlen.S
as the function strlen1, so I could compare it to the pre-patch strlen on
Cygwin. When run as:
$ for i in `seq 0 16` ; do
| for j in `seq 0 $i` ; do
| ./foo $i 1 $j 0
| done; done
it shows that strlen handles any alignment of both the start of the
string and the NUL at the end of the string (i.e., 100% coverage of the
code).
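The same kind of alignment sweep can also be done in a single process. The sketch below is not the attached test program; count_strlen_mismatches and its parameters are hypothetical names, and it varies start offset and length independently rather than following the size/offset arguments of ./foo.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Hypothetical helper: returns how many (offset, length) cases fn gets
   wrong, or (size_t)-1 if the request does not fit the scratch buffer. */
size_t count_strlen_mismatches(strlen_fn fn, size_t max_offset, size_t max_len)
{
    static char buf[256];
    size_t failures = 0;

    /* Keep slack after the NUL so word-sized reads stay inside buf. */
    if (max_offset + max_len + 1 + 16 > sizeof buf)
        return (size_t)-1;

    for (size_t off = 0; off <= max_offset; off++) {
        for (size_t len = 0; len <= max_len; len++) {
            memset(buf, 'x', sizeof buf);   /* non-NUL filler everywhere */
            buf[sizeof buf - 1] = '\0';     /* backstop for a broken fn */
            buf[off + len] = '\0';          /* the terminator under test */
            if (fn(buf + off) != len)
                failures++;
        }
    }
    return failures;
}
```

Calling count_strlen_mismatches(strlen, 16, 16) should return 0 for a correct implementation. The non-NUL filler after the terminator means a function that overshoots the NUL reports the wrong length instead of passing by accident.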
When run with longer strings and multiple iterations, I got the following
timings:
$ ./foo
usage: ./foo size repeat offset [func]
$ time ./foo 1000000 1000 0
real 0m2.071s
user 0m1.999s
sys 0m0.046s
$ time ./foo 1000000 1000 1
real 0m2.065s
user 0m1.921s
sys 0m0.031s
# Above (no func argument): pre-patch assembly, which uses 'repnz scasb'
# and is dirt slow; at least alignment was not an issue with the
# byte-wise algorithm
$ time ./foo 1000000 1000 0 0
real 0m0.707s
user 0m0.624s
sys 0m0.046s
$ time ./foo 1000000 1000 1 0
real 0m0.704s
user 0m0.655s
sys 0m0.046s
# Above (func 0): patched assembly, about 3x faster; alignment still doesn't matter
$ time ./foo 1000000 1000 0 1
real 0m0.702s
user 0m0.655s
sys 0m0.030s
$ time ./foo 1000000 1000 1 1
real 0m0.724s
user 0m0.686s
sys 0m0.031s
# Above (func 1): patched C code, marginally slower than the patched assembly
- --
Don't work too hard, make some time for fun as well!
Eric Blake ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkg1a/UACgkQ84KuGfSFAYD/WwCfXFRROzpPnnCThvNeuv/AzGJb
nGIAn06OYjM6DvOeQooj658yBF18bKt3
=Zkco
-----END PGP SIGNATURE-----