This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Fri, May 18, 2012 at 02:25:47PM +1200, Maxim Kuvyrkov wrote: > On 18/05/2012, at 10:20 AM, Joseph S. Myers wrote: > > > Anyone interested in performance of these functions may also be interested > > in bug 12100, where the SSE4 version of strstr reintroduces the unwanted > > quadratic asymptotic performance. (This is not a comment on your patch > > itself, but a mention of something people looking at this area might also > > be interested in.) > > Thanks for the pointer. For avoidance of doubt, I've benchmarked the patch on a Core2 machine without SSE4. > > I will benchmark the SSE4 implementation against the normal + this patch on short needles. The benchmark that motivated this patch is libosip message parsing, which heavily uses string functions with small strings. > I posted strstr implementation at this list and got no response. I use only SSE2 instructions and my algorithm is faster than using SSE4 instructions. With trivial modifications I could also use plain arithmetic/AVX2 A trick is to check first two characters or zero terminator in parallel. This is about 25 times faster on core2 vs glibc one, on i7 its only 3 times faster. I use these trick for my extension of regular expression engine a partialy autogenerated strstr follows. > -- > Maxim Kuvyrkov > CodeSourcery / Mentor Graphics -- transient bus protocol violation
Attachment:
strstr2.c
Description: Text document
Attachment:
sse2.h
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |