This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Ping: [PATCH v4] faster strlen on x64


Ping,


On Wed, Feb 13, 2013 at 12:38:40PM +0100, Ondřej Bílka wrote:
> Hello,
> 
> For the previous version I wrote that an unaligned read of the first 16
> bytes is a bad tradeoff. When I wrote a faster strcpy header I realized
> that this was because I was doing a separate check for whether it crosses
> a page.
> 
> When I check only that the next 64 bytes do not cross a page and then first
> do an unaligned 16-byte load, it causes only a small overhead for larger
> strings. This makes my implementation faster for a wider family of
> workloads. It speeds up the gcc benchmark and most other programs.
> 
> On unit tests the revised version is somewhat slower than the previous
> version. This is caused by the first-16-bytes path being chosen only
> rarely, which causes branch mispredictions.
> 
> I made two additional small improvements. The first is squashing in the
> padding patch. The second is that the test for crossing a page can be done
> as x%4096 < 4096-48 instead of x%4096 <= 4096-64, because I align x to
> 16 bytes.
> 
> I updated the benchmarks; the difference between the new and revised versions is at
> http://kam.mff.cuni.cz/~ondra/benchmark_string/strlen_profile.html
> http://kam.mff.cuni.cz/~ondra/strlen_profile.tar.bz2
>  
> 
> Ondra
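
The claim about the two page-cross tests can be checked with a small sketch. This is not code from the patch: the names PAGE_BYTES, safe_le_64, safe_lt_48 and count_disagreements are made up for illustration, and a 4096-byte page is assumed, as in the message. When x is a multiple of 16, x%4096 is also a multiple of 16, so "less than 4096-48" and "at most 4096-64" select the same offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BYTES 4096u  /* assumed page size, as in the message */

/* Original form: the 64 bytes starting at x stay inside x's page. */
static bool safe_le_64(uintptr_t x)
{
    return x % PAGE_BYTES <= PAGE_BYTES - 64;
}

/* Relaxed form from the message; only equivalent when x is 16-byte aligned. */
static bool safe_lt_48(uintptr_t x)
{
    assert(x % 16 == 0);  /* precondition stated in the message */
    return x % PAGE_BYTES < PAGE_BYTES - 48;
}

/* Count the 16-byte-aligned page offsets on which the two tests disagree. */
static unsigned count_disagreements(void)
{
    unsigned n = 0;
    for (uintptr_t off = 0; off < PAGE_BYTES; off += 16)
        if (safe_le_64(off) != safe_lt_48(off))
            n++;
    return n;
}
```

Calling count_disagreements() walks all 256 aligned offsets in a page and counts any mismatch between the two forms. The alignment precondition matters: for an unaligned offset such as 4040 the relaxed comparison would accept an address whose next 64 bytes do cross the page.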

