This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [WIKI] Optimization overview.
- From: David Miller <davem at davemloft dot net>
- To: neleai at seznam dot cz
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 18 Feb 2013 15:10:53 -0500 (EST)
- Subject: Re: [WIKI] Optimization overview.
- References: <20130218172258.GA31978@domone.kolej.mff.cuni.cz>
From: Ondřej Bílka <neleai@seznam.cz>
Date: Mon, 18 Feb 2013 18:22:58 +0100
> I wrote about common string optimizations at
> http://sourceware.org/glibc/wiki/Optimizations/string_functions
> I put there control flow that is fastest in practice among those I
> tried (there are several possible tradeoffs which I will write about).
> I plan to use it for most subsequent patches.
>
>
> What can be covered relatively generally?
> Comments?
Thanks for writing this up, it is of great value.
On Sparc v9, both 32-bit and 64-bit, I handle the unaligned initial word
in parallel with the loop setup. This seemed to be the fastest approach.
You simply align the pointer down, do an aligned load, and use an 'or' to
set all of the bytes in the word that precede the actual starting byte to
one. If the initial pointer is already aligned, the mask has no effect and
we pre-process one full word upon entry to the loop.
PowerPC uses this method as well, as that's where I got the idea
from.
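
For readers following along, the masking trick described above can be
sketched in portable C. This is only an illustration under stated
assumptions, not the Sparc or PowerPC assembly: the name sketch_strlen,
the 32-bit word size, and the choice of 0xff as the fill value are all
hypothetical. The mask is built through memory so that the sketch does
not depend on byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD_SIZE sizeof(uint32_t)

/* Nonzero if at least one byte of w is zero. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Load one word from p; p is always word-aligned in this sketch. */
static uint32_t load_word(const unsigned char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Hypothetical strlen sketch (not glibc code). It reads whole aligned
   words, so it may touch bytes just before s and just after the
   terminator. That stays inside one aligned word, but strict ISO C
   does not guarantee it is valid. */
size_t sketch_strlen(const char *s)
{
    size_t off = (uintptr_t)s % WORD_SIZE;
    const unsigned char *p = (const unsigned char *)s - off;

    /* Mask with 0xff in the 'off' bytes that precede s in memory.
       When s is already aligned, off is 0 and the mask is all zero. */
    unsigned char mask_bytes[sizeof(uint32_t)] = {0};
    memset(mask_bytes, 0xff, off);
    uint32_t mask = load_word(mask_bytes);

    /* The 'or' makes the leading bytes nonzero, so stale data before
       s can never be mistaken for the terminator. */
    uint32_t w = load_word(p) | mask;

    /* Main loop: one aligned word per iteration. */
    while (!has_zero_byte(w)) {
        p += WORD_SIZE;
        w = load_word(p);
    }

    /* Locate the first zero byte in memory order. */
    unsigned char bytes[sizeof(uint32_t)];
    memcpy(bytes, &w, sizeof w);
    size_t i = 0;
    while (bytes[i] != 0)
        i++;

    return (size_t)(p + i - (const unsigned char *)s);
}
```

Note that the first word goes through the same zero-byte test as every
later word, so no separate byte-by-byte prologue is needed. A tuned
version would presumably compute the mask with a shift rather than
through memory.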