This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH v3] BZ #14059 - Fix AVX and FMA4 detection.


On Wed, May 16, 2012 at 8:21 PM, Carlos O'Donell
<carlos@systemhalted.org> wrote:
> 2012-05-11  Andreas Jaeger  <aj@suse.de>
>              Carlos O'Donell  <carlos_odonell@mentor.com>
>
>         [BZ #14059]
>         * sysdeps/x86_64/multiarch/init-arch.h
>         (bit_YMM_Usable): Rename to...
>         (bit_AVX_Usable): ... this.
>         (bit_FMA4_Usable): New macro.
>         (bit_XMM_state): New macro.
>         (bit_YMM_state): New macro.
>         [__ASSEMBLER__] (index_YMM_Usable): Rename to...
>         [__ASSEMBLER__] (index_AVX_Usable): ... this.
>         [__ASSEMBLER__] (index_FMA4_Usable): New macro.
>         (CPUID_OSXSAVE): New macro.
>         (CPUID_AVX): New macro.
>         (CPUID_FMA4): New macro.
>         (index_YMM_Usable): Rename to...
>         (index_AVX_Usable): ... this.
>         (HAS_AVX): Use HAS_ARCH_FEATURE.
>         (HAS_FMA4): Likewise.
>         (HAS_YMM_USABLE): Remove.
>         * sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
>         Fix check for AVX, enable FMA4 only if it exists and if AVX is
>         usable.
>         * sysdeps/x86_64/multiarch/strcmp.S: Use bit_AVX_Usable.
>         * sysdeps/i386/i686/multiarch/Makefile: Add test-multiarch to tests.
>         * sysdeps/x86_64/multiarch/Makefile: Likewise.
>         * sysdeps/i386/i686/multiarch/test-multiarch.c: New file.
>         * sysdeps/x86_64/multiarch/test-multiarch.c: New file.
> --
>  i386/i686/multiarch/Makefile         |    1
>  i386/i686/multiarch/test-multiarch.c |    1
>  x86_64/multiarch/Makefile            |    1
>  x86_64/multiarch/init-arch.c         |   17 ++++--
>  x86_64/multiarch/init-arch.h         |   51 +++++++++++++-------
>  x86_64/multiarch/strcmp.S            |    9 ++-
>  x86_64/multiarch/test-multiarch.c    |   88 +++++++++++++++++++++++++++++++++++
>  7 files changed, 142 insertions(+), 26 deletions(-)

Does the FMA4 support depend on AVX being present *and* enabled?

The patch enables FMA4 support whenever AVX is present. Is this wrong?

We have FMA4 as bit-16, but unfortunately bit-16 of the CPUID result
is marked reserved in the "Intel 64 and IA-32 Architectures Software
Developer's Manual" (May 2012).

What are we actually detecting with FMA4?

I see that this is all part of an AMD and Intel mixup.

I found FMA4 in "AMD64 Architecture Programmer’s Manual Volume 2:
System Programming" (March 2012), and in "AMD64 Architecture
Programmer’s Manual Volume 6: 128-Bit and 256-Bit, XOP, and FMA4
Instructions" which does not say FMA4 is dependent on AVX.

~~~
Support for the new instructions is indicated by use of the CPUID instruction:
- XOP—ECX bit 11 as returned by CPUID function 8000_0001h.
- FMA4—ECX bit 16 as returned by CPUID function 8000_0001h.
Attempting to execute these instructions causes a #UD exception either
if they are not present in the hardware or if operating system support
for YMM context switching is not indicated by setting CR4.OSXSAVE to 1.
~~~

Thus FMA4 is enabled if present and YMM state is usable, similar to
AVX, but not dependent on AVX.

The delta is this:
diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
index 26d62ef..155033d 100644
--- a/sysdeps/x86_64/multiarch/init-arch.c
+++ b/sysdeps/x86_64/multiarch/init-arch.c
@@ -143,21 +143,23 @@ __init_cpu_features (void)
   else
     kind = arch_kind_other;

-  if (CPUID_AVX)
+  /* Can we call xgetbv?  */
+  if (CPUID_OSXSAVE)
     {
-      /* Determine if AVX is usable.  */
-      if (CPUID_OSXSAVE
-         && ({ unsigned int xcrlow;
-               unsigned int xcrhigh;
-               asm ("xgetbv"
-                    : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
-               (xcrlow & (bit_YMM_state | bit_XMM_state)) ==
-                (bit_YMM_state | bit_XMM_state); }))
-       __cpu_features.feature[index_AVX_Usable] |= bit_AVX_Usable;
-
-      /* FMA4 depends on AVX support.  */
-      if (CPUID_FMA4)
-       __cpu_features.feature[index_FMA4_Usable] |= bit_FMA4_Usable;
+      unsigned int xcrlow;
+      unsigned int xcrhigh;
+      asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
+      /* Is YMM and XMM state usable?  */
+      if ((xcrlow & (bit_YMM_state | bit_XMM_state)) ==
+         (bit_YMM_state | bit_XMM_state))
+       {
+         /* Determine if AVX is usable.  */
+         if (CPUID_AVX)
+           __cpu_features.feature[index_AVX_Usable] |= bit_AVX_Usable;
+         /* Determine if FMA4 is usable.  */
+         if (CPUID_FMA4)
+           __cpu_features.feature[index_FMA4_Usable] |= bit_FMA4_Usable;
+       }
     }

   __cpu_features.family = family;
---
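
For anyone who wants to sanity-check the ordering without FMA4 hardware,
here is a minimal sketch of the same decision as a pure function over raw
register values. It is not glibc code: the macro names, the struct, and
decide_usable are all made up for illustration. The bit positions are
OSXSAVE and AVX in CPUID function 1 ECX (bits 27 and 28), FMA4 in CPUID
function 8000_0001h ECX (bit 16), and the XMM and YMM state bits in the
low word of XCR0 (bits 1 and 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; these are not the glibc macros.  */
#define CPUID1_ECX_OSXSAVE  (1u << 27)  /* CPUID function 1, ECX */
#define CPUID1_ECX_AVX      (1u << 28)  /* CPUID function 1, ECX */
#define CPUID_EXT_ECX_FMA4  (1u << 16)  /* CPUID function 8000_0001h, ECX */
#define XCR0_XMM_STATE      (1u << 1)   /* low word of XCR0 */
#define XCR0_YMM_STATE      (1u << 2)   /* low word of XCR0 */

struct usable_features
{
  bool avx;
  bool fma4;
};

/* Hypothetical helper: decide which features are usable from raw
   register values.  xcr0_low is only meaningful when OSXSAVE is set,
   because xgetbv must not be executed otherwise; it is ignored when
   OSXSAVE is clear.  */
static struct usable_features
decide_usable (uint32_t cpuid1_ecx, uint32_t cpuid_ext_ecx,
               uint32_t xcr0_low)
{
  struct usable_features f = { false, false };
  const uint32_t need = XCR0_XMM_STATE | XCR0_YMM_STATE;

  /* Without OSXSAVE we cannot query XCR0, so nothing is usable.  */
  if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
    return f;

  /* The OS must have enabled both XMM and YMM state.  */
  if ((xcr0_low & need) != need)
    return f;

  /* AVX and FMA4 are now checked independently of each other.  */
  f.avx = (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
  f.fma4 = (cpuid_ext_ecx & CPUID_EXT_ECX_FMA4) != 0;
  return f;
}
```

The case that differs from the previous version of the patch is a CPU
that reports FMA4 but not AVX: with YMM state enabled, this sketch marks
FMA4 usable and AVX not usable.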

I'll send out a new email when testing is done.

Who has a box with FMA4 for testing?

Cheers,
Carlos.

