This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] faster string operations for bulldozer.
On Wed, Sep 26, 2012 at 10:27:58AM -0700, Roland McGrath wrote:
> > +2012-09-26 Ondrej Bilka <neleai@seznam.cz>
> > +
> > + * sysdeps/x86_64/multiarch/init_arch.c: Select faster string function
> > + implementation for buldozer.
>
> ChangeLog entries are not submitted in diff form. See the wiki.
> This entry has poor formatting and poor content. It should look like:
>
> * sysdeps/x86_64/multiarch/init_arch.c (__init_cpu_features):
> Set bit_Prefer_PMINUB_for_stringop for AMD processors.
>
> > + __cpu_features.feature[index_Fast_Rep_String]
> > + |= ( bit_Prefer_PMINUB_for_stringop);
>
> Excess space and excess parens here.
Ok, here is the changed patch. According to
http://support.amd.com/us/Processor_TechDocs/47414.pdf
unaligned loads should be fast. I don't know
about bit_Fast_Rep_String and bit_Fast_Copy_Backward.
2012-09-26  Ondrej Bilka  <neleai@seznam.cz>

	* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
	Set bit_Prefer_PMINUB_for_stringop for AMD processors.
	Set bit_Fast_Unaligned_Load for AMD processors with AVX.
diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
index fb44dcf..b872e5f 100644
--- a/sysdeps/x86_64/multiarch/init-arch.c
+++ b/sysdeps/x86_64/multiarch/init-arch.c
@@ -131,6 +131,9 @@ __init_cpu_features (void)
       __cpu_features.feature[index_Prefer_SSE_for_memop]
	|= bit_Prefer_SSE_for_memop;

+      /* Assume unaligned loads are fast when AVX is available.  */
+      if ((ecx & bit_AVX) != 0)
+	__cpu_features.feature[index_Fast_Rep_String]
+	  |= bit_Fast_Unaligned_Load;
+
+      __cpu_features.feature[index_Fast_Rep_String]
+	|= bit_Prefer_PMINUB_for_stringop;
+
       unsigned int eax;
       __cpuid (0x80000000, eax, ebx, ecx, edx);
       if (eax >= 0x80000001)