This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

[PATCH] PowerPC: Extend fpu fenv operations to operate on 64-bit FPSCR

From: "Ryan S. Arnold" <rsa at us dot ibm dot com>
To: GNU libc devel <libc-alpha at sources dot redhat dot com>
Date: Tue, 23 Oct 2007 09:40:44 -0500
Subject: [PATCH] PowerPC: Extend fpu fenv operations to operate on 64-bit FPSCR
Reply-to: rsa at us dot ibm dot com

Greetings,

For some reason when I posted this to libc-alpha@sourceware.org it
didn't make it out to the email inboxes so I'm resending.

The included patch makes the following changes so that the fpu and fenv
operations handle the full 64-bit FPSCR (floating-point status and
control register) on PowerPC POWER6 and POWER6x:

o For PowerPC in general it redefines fpu_control_t from an unsigned int
to an unsigned long long int so that the entire 64-bit FPSCR on POWER6
can be saved/restored.  On non-POWER6 hardware the high-order 32 bits
are simply discarded/ignored for all operations.

  It is necessary to change it for all PowerPC because the loader is
built for the default PowerPC arch (i.e. non-POWER6) yet must be able to
load a libc built for either POWER6 or non-POWER6.  Both the loader and
libc use the rtld_global_ro struct, which contains an fpu_control_t.

  Any difference in size of fpu_control_t between the loader and libc
will cause libc to access rtld_global_ro struct members at the wrong
offsets from where the structure was populated by the loader.

  The impact is minimal: the fpu_control_t is already initially
populated from a double, and the fpu_control_t type is not used in the
interface of any external function, so no symbol versioning is
required.

o Provides a conditional implementation of _FPU_GETCW and _FPU_SETCW
which operate on either the entire 64-bit FPSCR on POWER6[x] or simply
the low-order 32-bits of the fpu_control_t on non-POWER6[x].

o Provides granular control over the non-reserved bits in the high-order
32 bits of the FPSCR (the decimal rounding mode) through new
_FPU_DEC_RC_* masks.

o Provides an overridden sysdeps/powerpc/math/test-fpucw that verifies
that the high-order word is saved to the FPSCR and restored using the
_FPU_[GET|SET]CW macros.  It also verifies that there are no deleterious
effects on non-POWER6 systems.

o Provides conditional definitions of the PowerPC fenv helper macros
(fesetenv_register & fegetenv_register) which are used by the fenv
functions to set/get the contents of the FPSCR.

o Provides sysdeps/powerpc/math/test-powerpc-fenv which tests the
high-order word save/restore using the fe[set|get]env_register
macros.

o Replaces the FPSCR restore with a macro which will provide a 64-bit
restore on POWER6 for swapcontext and setcontext.

o Replaces some of the magic number fenv masks in the fenv functions
with a define that better describes what the magic number means.
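
The layout concern behind the first item can be sketched in portable C.
This is only an illustration under stated assumptions: the two structs
below are hypothetical stand-ins, not the real rtld_global_ro, and every
member name is invented.  It shows how a member that follows the control
word moves when only one side widens the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a shared struct when the control word is 32 bits
   wide.  The member names are invented for this sketch.  */
struct ro_with_32bit_cw
{
  int32_t some_flag;        /* hypothetical member before the control word */
  uint32_t fpu_control;     /* control word as a 32-bit type */
  int32_t next_member;      /* hypothetical member after it */
};

/* The same hypothetical struct when the control word is 64 bits wide
   and double word aligned.  */
struct ro_with_64bit_cw
{
  int32_t some_flag;
  uint64_t fpu_control __attribute__ ((__aligned__ (8)));
  int32_t next_member;
};

/* Number of bytes by which the two views disagree about where
   next_member lives.  */
size_t
next_member_skew (void)
{
  return offsetof (struct ro_with_64bit_cw, next_member)
	 - offsetof (struct ro_with_32bit_cw, next_member);
}

/* A non-zero skew means one side would read the wrong bytes.  */
void
check_layouts_differ (void)
{
  assert (next_member_skew () != 0);
}
```

If the loader were built with one view and libc with the other, libc
would look for next_member at a different offset from the one the loader
wrote, which is why the type is changed for all PowerPC at once.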

I successfully executed build and runtime tests (make check) on POWER5
(32-bit FPSCR) as well as POWER6 (64-bit FPSCR).

Ryan S. Arnold
IBM Linux Technology Center
Linux Toolchain Development

2007-10-22  Ryan S. Arnold  <rsa@us.ibm.com>

	* sysdeps/mach/hurd/powerpc/sigreturn.c: Enable 64-bit FPSCR restore
	on POWER6.
	* sysdeps/powerpc/fpscr.h (_RESTORE_FPSCR): New file that adds a
	macro which will either use the default two operand mtfsf instruction
	or the new four operand mtfsf instruction (on POWER6 and POWER6x).
	* sysdeps/powerpc/fpu/Implies: New file.  Add powerpc/math to the
	search path in order to be able to override math/test-fpucw.c for
	PowerPC.
	* sysdeps/powerpc/fpu/feholdexcpt.c (_FPU_MASK_ALL): Provide macro to
	eliminate magic number (0x000000F8) which represents a mask of all of
	the floating point exception enable bits.
	* sysdeps/powerpc/fpu/fenv_libc.h (fesetenv_register): Provides a
	_ARCH_PWR6 || _ARCH_PWR6X conditional macro to restore the entire
	64-bit FPSCR on POWER6.
	(relax_fenv_state): Provides a _ARCH_PWR6 || _ARCH_PWR6X conditional
	macro to return the decimal rounding mode to the default 'round to
	nearest, ties to even' on POWER6[X] per the POWER ISA.
	(FPSCR_NI FPSCR_29): Conditionally replace FPSCR_NI bit definition
	with FPSCR_29 for _ARCH_PWR6 || _ARCH_PWR6X since non-IEEE is not
	supported on POWER6 and POWER6X (i.e. it is marked Reserved).
	* sysdeps/powerpc/fpu/fesetenv.c (_FPU_MASK_ALL): Provide macro to
	eliminate magic number (0x000000F8) which represents a mask of all of
	the floating point exception enable bits.
	* sysdeps/powerpc/fpu/feupdateenv.c (_FPU_MASK_ALL): Provide macro to
	eliminate magic number (0x000000F8) which represents a mask of all of
	the floating point exception enable bits.
	* sysdeps/powerpc/fpu/fpu_control.h (_FPU_DEC_RC_*): Add bit masks to
	allow the setting of the decimal rounding modes per POWER6 ISA 2.05 in
	the high-order 32 bits of the 64-bit FPSCR.
	(fpu_control_t): Redefine fpu_control_t as unsigned long long int and
	double word align it for all PowerPC.
	(_FPU_GETCW _FPU_SETCW): Provides an _ARCH_PWR6 || _ARCH_PWR6X
	conditional macro for _FPU_GETCW and _FPU_SETCW to get and set the
	entire 64-bit FPSCR on POWER6.
	(_FPU_MASK_NI): Conditionally defines _FPU_MASK_NI as 0x0 on POWER6
	and POWER6X because it isn't supported by the architecture.
	(_FPU_RESERVED): Conditionally defines _FPU_RESERVED on POWER6 and
	POWER6X to reserve bits 0:28 of the high order word as well as the
	normal reserved bits in the low order word of the FPSCR.
	* sysdeps/powerpc/math/Makefile: Add new test 'test-powerpc-fenv' to
	libm-tests.
	* sysdeps/powerpc/math/test-fpucw.c: Provides a PowerPC based override
	test which allows for an unsigned long long int control word mask.
	Also provides test of _FPU_GETCW and _FPU_SETCW working against a
	64-bit or 32-bit FPSCR.
	* sysdeps/powerpc/math/test-powerpc-fenv.c: New test to verify the
	efficacy of fesetenv_register() and fegetenv_register() in getting
	and setting the high-order word of the 64-bit FPSCR, or in ignoring the
	high-order word when acting against the 32-bit FPSCR.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
	(__CONTEXT_FUNC_NAME): Replace mtfsf insn with _RESTORE_FPSCR macro
	which will conditionally operate on 32-bit or 64-bit FPSCR.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S
	(__CONTEXT_FUNC_NAME): Replace mtfsf insn with _RESTORE_FPSCR macro
	which will conditionally operate on 32-bit or 64-bit FPSCR.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S
	(__novec_setcontext __setcontext): Replace mtfsf insns with
	_RESTORE_FPSCR macro which will conditionally operate on 32-bit or
	64-bit FPSCR.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S
	(__novec_swapcontext __swapcontext): Replace mtfsf insns with
	_RESTORE_FPSCR macro which will conditionally operate on 32-bit or
	64-bit FPSCR.
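
As a quick sanity check on the _FPU_MASK_ALL entries above, the sketch
below recomputes the mask from the individual exception enable bits.
The EX_ names are local stand-ins for the fpu_control.h macros; the
value used for the underflow bit (0x20) is not visible in the diff
context and is an assumption, inferred from the 0x000000F8 magic number
that the macro replaces.

```c
#include <assert.h>

/* Stand-ins for the fpu_control.h exception enable masks.  EX_MASK_UM
   is an assumed value; the others appear in the patch context.  */
#define EX_MASK_ZM 0x10u /* zero divide */
#define EX_MASK_OM 0x40u /* overflow */
#define EX_MASK_UM 0x20u /* underflow (assumed) */
#define EX_MASK_XM 0x08u /* inexact */
#define EX_MASK_IM 0x80u /* invalid operation */

#define EX_MASK_ALL \
  (EX_MASK_ZM | EX_MASK_OM | EX_MASK_UM | EX_MASK_XM | EX_MASK_IM)

/* The combined mask of all exception enable bits.  */
unsigned int
ex_fpu_mask_all (void)
{
  return EX_MASK_ALL;
}

/* Mirrors the shape of the fenv checks: returns 1 if any exception
   enable bit is set in the low-order FPSCR word, 0 otherwise.  */
int
ex_any_exception_enabled (unsigned int low_word)
{
  return (low_word & EX_MASK_ALL) != 0;
}

/* The named mask must equal the magic number it replaces.  */
void
ex_check_mask (void)
{
  assert (ex_fpu_mask_all () == 0x000000F8u);
}
```

Rounding mode bits alone (for example 0x03) do not count as enabled
exceptions under this mask, which matches how the fenv functions decide
whether to mask or unmask SIGFPE.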

--- glibc.orig/sysdeps/mach/hurd/powerpc/sigreturn.c	2001-11-09 18:37:47.000000000 -0600
+++ glibc.new/sysdeps/mach/hurd/powerpc/sigreturn.c	2007-10-09 12:14:06.000000000 -0500
@@ -1,5 +1,5 @@
 /* Return from signal handler for Hurd.  PowerPC version.
-   Copyright (C) 1996,97,98,2001 Free Software Foundation, Inc.
+   Copyright (C) 1996,97,98,2001,2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -91,7 +91,11 @@

   /* Restore the floating-point control/status register.  */
   asm volatile ("lfd 0,256(31)");
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+  asm volatile ("mtfsf 0xff,0,1,0");
+#else
   asm volatile ("mtfsf 0xff,0");
+#endif

   /* Restore floating-point registers. */
   restore_fpr (0);
--- glibc.orig/sysdeps/powerpc/fpscr.h	1969-12-31 18:00:00.000000000 -0600
+++ glibc.new/sysdeps/powerpc/fpscr.h	2007-10-02 21:23:59.000000000 -0500
@@ -0,0 +1,30 @@
+/* Macro to restore the entire FPSCR, short and long form.
+   Copyright (C) 2007 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by IBM Corporation,
+   Author(s): Ryan S. Arnold <rsa@us.ibm.com>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+ /* On Power6[x] four operands are required with the 'L' field
+  * equal to '1' in order to copy the entirety of 'reg' into the
+  * full 64-bit wide FPSCR.  The field mask and 'W' field are
+  * ignored when L is '1'.  */
+ #define _RESTORE_FPSCR(reg) mtfsf	0xff, (reg), 1, 0
+#else
+ #define _RESTORE_FPSCR(reg) mtfsf	0xff, (reg)
+#endif
--- glibc.orig/sysdeps/powerpc/fpu/Implies	1969-12-31 18:00:00.000000000 -0600
+++ glibc.new/sysdeps/powerpc/fpu/Implies	2007-10-05 12:22:03.000000000 -0500
@@ -0,0 +1 @@
+powerpc/math
--- glibc.orig/sysdeps/powerpc/fpu/feholdexcpt.c	2007-05-07 01:21:41.000000000 -0500
+++ glibc.new/sysdeps/powerpc/fpu/feholdexcpt.c	2007-10-09 12:06:18.000000000 -0500
@@ -1,5 +1,5 @@
 /* Store current floating-point environment and clear exceptions.
-   Copyright (C) 1997, 2005 Free Software Foundation, Inc.
+   Copyright (C) 1997, 2005, 2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -18,6 +18,8 @@
    02111-1307 USA.  */

 #include <fenv_libc.h>
+#include <fpu_control.h>
+#define _FPU_MASK_ALL (_FPU_MASK_ZM | _FPU_MASK_OM | _FPU_MASK_UM | _FPU_MASK_XM | _FPU_MASK_IM)

 int
 feholdexcept (fenv_t *envp)
@@ -35,7 +37,7 @@
   /* If the old env had any eabled exceptions, then mask SIGFPE in the
      MSR FE0/FE1 bits.  This may allow the FPU to run faster because it
      always takes the default action and can not generate SIGFPE. */
-  if ((old.l[1] & 0x000000F8) != 0)
+  if ((old.l[1] & _FPU_MASK_ALL) != 0)
     (void)__fe_mask_env ();

   /* Put the new state in effect.  */
--- glibc.orig/sysdeps/powerpc/fpu/fenv_libc.h	2006-03-16 05:46:34.000000000 -0600
+++ glibc.new/sysdeps/powerpc/fpu/fenv_libc.h	2007-10-08 13:15:59.000000000 -0500
@@ -1,5 +1,5 @@
 /* Internal libc stuff for floating point environment routines.
-   Copyright (C) 1997, 2006 Free Software Foundation, Inc.
+   Copyright (C) 1997, 2006, 2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -30,9 +30,15 @@
 #define fegetenv_register() \
         ({ fenv_t env; asm volatile ("mffs %0" : "=f" (env)); env; })

+/* Power6[x] provides a 64-bit FPSCR.  */
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+#define fesetenv_register(env) \
+        ({ double d = (env); asm volatile ("mtfsf 0xff,%0,1,0" : : "f" (d)); })
+#else
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer.  */
 #define fesetenv_register(env) \
         ({ double d = (env); asm volatile ("mtfsf 0xff,%0" : : "f" (d)); })
+#endif

 /* This very handy macro:
    - Sets the rounding mode to 'round to nearest';
@@ -40,7 +46,13 @@
    - Prevents exceptions from being raised for inexact results.
    These things happen to be exactly what you need for typical elementary
    functions.  */
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+/* - Sets the decimal rounding mode to 'round to nearest';  */
+#define relax_fenv_state() \
+       ({ asm ("mtfsfi 7,0,1"); asm("mtfsfi 7,0"); })
+#else
 #define relax_fenv_state() asm ("mtfsfi 7,0")
+#endif

 /* Set/clear a particular FPSCR bit (for instance,
    reset_fpscr_bit(FPSCR_VE);
@@ -120,7 +132,11 @@
   FPSCR_UE,        /* underflow exception enable */
   FPSCR_ZE,        /* zero divide exception enable */
   FPSCR_XE,        /* inexact exception enable */
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+  FPSCR_29,        /* Reserved */
+#else
   FPSCR_NI         /* non-IEEE mode (typically, no denormalised numbers) */
+#endif
   /* the remaining two least-significant bits keep the rounding mode */
 };

--- glibc.orig/sysdeps/powerpc/fpu/fesetenv.c	2007-05-07 01:22:06.000000000 -0500
+++ glibc.new/sysdeps/powerpc/fpu/fesetenv.c	2007-10-08 14:09:12.000000000 -0500
@@ -18,8 +18,11 @@
    02111-1307 USA.  */

 #include <fenv_libc.h>
+#include <fpu_control.h>
 #include <bp-sym.h>

+#define _FPU_MASK_ALL (_FPU_MASK_ZM | _FPU_MASK_OM | _FPU_MASK_UM | _FPU_MASK_XM | _FPU_MASK_IM)
+
 int
 __fesetenv (const fenv_t *envp)
 {
@@ -29,18 +32,18 @@
   new.fenv = *envp;
   old.fenv = fegetenv_register ();
   
-  /* If the old env has no eabled exceptions and the new env has any enabled
-     exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put
-     the hardware into "precise mode" and may cause the FPU to run slower on
-     some hardware.  */
-  if ((old.l[1] & 0x000000F8) == 0 && (new.l[1] & 0x000000F8) != 0)
+  /* If the old env has no enabled exceptions and the new env has any enabled
+     exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put the
+     hardware into "precise mode" and may cause the FPU to run slower on some
+     hardware.  */
+  if ((old.l[1] & _FPU_MASK_ALL) == 0 && (new.l[1] & _FPU_MASK_ALL) != 0)
     (void)__fe_nomask_env ();
   
   /* If the old env had any eabled exceptions and the new env has no enabled
      exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
      FPU to run faster because it always takes the default action and can not 
      generate SIGFPE. */
-  if ((old.l[1] & 0x000000F8) != 0 && (new.l[1] & 0x000000F8) == 0)
+  if ((old.l[1] & _FPU_MASK_ALL) != 0 && (new.l[1] & _FPU_MASK_ALL) == 0)
     (void)__fe_mask_env ();
     
   fesetenv_register (*envp);
--- glibc.orig/sysdeps/powerpc/fpu/feupdateenv.c	2007-05-07 01:22:29.000000000 -0500
+++ glibc.new/sysdeps/powerpc/fpu/feupdateenv.c	2007-10-08 14:37:04.000000000 -0500
@@ -19,8 +19,11 @@
    02111-1307 USA.  */

 #include <fenv_libc.h>
+#include <fpu_control.h>
 #include <bp-sym.h>

+#define _FPU_MASK_ALL (_FPU_MASK_ZM | _FPU_MASK_OM | _FPU_MASK_UM | _FPU_MASK_XM | _FPU_MASK_IM)
+
 int
 __feupdateenv (const fenv_t *envp)
 {
@@ -39,14 +42,14 @@
      exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put
      the hardware into "precise mode" and may cause the FPU to run slower on
      some hardware.  */
-  if ((old.l[1] & 0x000000F8) == 0 && (new.l[1] & 0x000000F8) != 0)
+  if ((old.l[1] & _FPU_MASK_ALL) == 0 && (new.l[1] & _FPU_MASK_ALL) != 0)
     (void)__fe_nomask_env ();
   
   /* If the old env had any eabled exceptions and the new env has no enabled
      exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
      FPU to run faster because it always takes the default action and can not 
      generate SIGFPE. */
-  if ((old.l[1] & 0x000000F8) != 0 && (new.l[1] & 0x000000F8) == 0)
+  if ((old.l[1] & _FPU_MASK_ALL) != 0 && (new.l[1] & _FPU_MASK_ALL) == 0)
     (void)__fe_mask_env ();

   /* Atomically enable and raise (if appropriate) exceptions set in `new'. */
--- glibc.orig/sysdeps/powerpc/fpu/fpu_control.h	2003-02-27 14:57:06.000000000 -0600
+++ glibc.new/sysdeps/powerpc/fpu/fpu_control.h	2007-10-08 14:43:02.000000000 -0500
@@ -1,5 +1,5 @@
 /* FPU control word definitions.  PowerPC version.
-   Copyright (C) 1996, 1997, 1998 Free Software Foundation, Inc.
+   Copyright (C) 1996, 1997, 1998, 2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -20,14 +20,12 @@
 #ifndef _FPU_CONTROL_H
 #define _FPU_CONTROL_H

-/* rounding control */
+/* binary float rounding control */
 #define _FPU_RC_NEAREST 0x00   /* RECOMMENDED */
 #define _FPU_RC_DOWN    0x03
 #define _FPU_RC_UP      0x02
 #define _FPU_RC_ZERO    0x01

-#define _FPU_MASK_NI  0x04 /* non-ieee mode */
-
 /* masking of interrupts */
 #define _FPU_MASK_ZM  0x10 /* zero divide */
 #define _FPU_MASK_OM  0x40 /* overflow */
@@ -35,33 +33,76 @@
 #define _FPU_MASK_XM  0x08 /* inexact */
 #define _FPU_MASK_IM  0x80 /* invalid operation */

-#define _FPU_RESERVED 0xffffff00 /* These bits are reserved are not changed. */
+/* decimal float rounding control */
+#define _FPU_DEC_RC_NEAREST             0x0000000000000000ULL
+#define _FPU_DEC_RC_TOWARDZERO          0x0000000100000000ULL
+#define _FPU_DEC_RC_UPWARD              0x0000000200000000ULL
+#define _FPU_DEC_RC_DOWNWARD            0x0000000300000000ULL
+#define _FPU_DEC_RC_NEARESTFROMZERO     0x0000000400000000ULL
+
+/* The following Decimal Rounding Modes are supported by Power6[x] hardware
+ * but don't have corresponding C-Spec rounding modes.  */
+#define _FPU_DEC_RC_NEARESTTOWARDZERO          0x0000000500000000ULL
+#define _FPU_DEC_RC_FROMZERO                   0x0000000600000000ULL
+#define _FPU_DEC_RC_PREPAREFORSHORTERPRECISION 0x0000000700000000ULL

 /* The fdlibm code requires no interrupts for exceptions.  */
-#define _FPU_DEFAULT  0x00000000 /* Default value.  */
+#define _FPU_DEFAULT  0x0000000000000000ULL /* Default value.  */

-/* IEEE:  same as above, but (some) exceptions;
+/* IEEE:  same as above, but (some) exceptions enabled;
    we leave the 'inexact' exception off.
  */
-#define _FPU_IEEE     0x000000f0
+#define _FPU_IEEE     0x00000000000000f0ULL
+
+/* Type of the control word.  The __DI__ mode is here to force alignment.  */
+typedef unsigned long long fpu_control_t __attribute__ ((__mode__ (__DI__)));
+
+/* Power6[x] provides a 64-bit FPSCR with decimal rounding modes.  */
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+
+/* Not available on Power6[x].  We no-op it for forward porting
+ * compatibility since it is valid on some PowerPC processors.  */
+#define _FPU_MASK_NI    0x0000000000000000ULL /* non-ieee mode  */
+
+/* Note: The Non-IEEE Mode is NOT available on Power6.  */
+#define _FPU_RESERVED 0xfffffff8ffffff04ULL
+
+/* Macros for accessing the hardware control word on Power6[x].  */
+#define _FPU_GETCW(__cw) ({						\
+  union { double d; fpu_control_t cw; }					\
+    tmp __attribute__ ((__aligned__(8)));				\
+  __asm__ ("mffs 0; stfd%U0 0,%0" : "=m" (tmp.d) : : "fr0");		\
+  (__cw)=tmp.cw;							\
+})
+#define _FPU_SETCW(__cw) ({						\
+  union { double d; fpu_control_t cw; }					\
+    tmp __attribute__ ((__aligned__(8)));				\
+  tmp.cw = __cw;							\
+  /* Set the entire 64-bit FPSCR.  */					\
+  __asm__ ("lfd%U0 0,%0; mtfsf 255,0,1,0" : : "m" (tmp.d) : "fr0");	\
+})
+#else
+
+#define _FPU_MASK_NI  0x04 /* non-ieee mode */

-/* Type of the control word.  */
-typedef unsigned int fpu_control_t __attribute__ ((__mode__ (__SI__)));
+/* Bits 29:31 left un-reserved for soft-decimal float rounding direction.  */
+#define _FPU_RESERVED 0xfffffff8ffffff00ULL

 /* Macros for accessing the hardware control word.  */
-#define _FPU_GETCW(__cw) ( { \
-  union { double d; fpu_control_t cw[2]; } \
-    tmp __attribute__ ((__aligned__(8))); \
-  __asm__ ("mffs 0; stfd%U0 0,%0" : "=m" (tmp.d) : : "fr0"); \
-  (__cw)=tmp.cw[1]; \
-  tmp.cw[1]; } )
-#define _FPU_SETCW(__cw) { \
-  union { double d; fpu_control_t cw[2]; } \
-    tmp __attribute__ ((__aligned__(8))); \
-  tmp.cw[0] = 0xFFF80000; /* More-or-less arbitrary; this is a QNaN. */ \
-  tmp.cw[1] = __cw; \
-  __asm__ ("lfd%U0 0,%0; mtfsf 255,0" : : "m" (tmp.d) : "fr0"); \
-}
+#define _FPU_GETCW(__cw) ({						\
+  union { double d; fpu_control_t cw; }					\
+    tmp __attribute__ ((__aligned__(8)));				\
+  __asm__ ("mffs 0; stfd%U0 0,%0" : "=m" (tmp.d) : : "fr0");		\
+  (__cw)=tmp.cw;							\
+})
+#define _FPU_SETCW(__cw) ({						\
+  union { double d; fpu_control_t cw; }					\
+    tmp __attribute__ ((__aligned__(8)));				\
+  tmp.cw = __cw;							\
+  /* Effectively ignores the high 32 bits.  */				\
+  __asm__ ("lfd%U0 0,%0; mtfsf 255,0" : : "m" (tmp.d) : "fr0");		\
+})
+#endif

 /* Default control word set at startup.  */
 extern fpu_control_t __fpu_control;
--- glibc.orig/sysdeps/powerpc/math/Makefile	1969-12-31 18:00:00.000000000 -0600
+++ glibc.new/sysdeps/powerpc/math/Makefile	2007-10-08 14:50:54.000000000 -0500
@@ -0,0 +1,3 @@
+ifeq ($(subdir),math)
+libm-tests += test-powerpc-fenv
+endif
--- glibc.orig/sysdeps/powerpc/math/test-fpucw.c	1969-12-31 18:00:00.000000000 -0600
+++ glibc.new/sysdeps/powerpc/math/test-fpucw.c	2007-10-08 15:08:39.000000000 -0500
@@ -0,0 +1,76 @@
+/* Test to verify proper save and restore of 64-bit fpu control word.
+
+   Copyright (C) 2007 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by IBM Corporation, 2007.
+   Author(s): Ryan S. Arnold <rsa@us.ibm.com>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <fpu_control.h>
+#include <stdio.h>
+
+int
+main (void)
+{
+#ifdef _FPU_GETCW
+  fpu_control_t cw;
+  _FPU_GETCW (cw);
+
+  cw &= ~_FPU_RESERVED;
+
+  if (cw != (_FPU_DEFAULT & ~_FPU_RESERVED))
+    {
+      printf ("control word is 0x%.16llx but should be 0x%.16llx.\n",
+	      (unsigned long long int) cw,
+	      (unsigned long long int) (_FPU_DEFAULT & ~_FPU_RESERVED));
+      return 1;
+    }
+
+  cw |= (_FPU_DEC_RC_TOWARDZERO | _FPU_RC_DOWN);
+  _FPU_SETCW(cw);
+  _FPU_GETCW(cw);
+
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+  /* Make sure the high order bits are saved and restored.  */
+  if (cw != ((_FPU_DEFAULT & ~_FPU_RESERVED)
+			  | _FPU_DEC_RC_TOWARDZERO
+			  | _FPU_RC_DOWN))
+    {
+      printf ("control word is 0x%.16llx but should be 0x%.16llx.\n",
+	      (unsigned long long int) cw,
+	      (unsigned long long int) ((_FPU_DEFAULT & ~_FPU_RESERVED)
+					| _FPU_DEC_RC_TOWARDZERO
+					| _FPU_RC_DOWN));
+      return 1;
+    }
+#else
+  /* Make sure the high order bits are discarded.  */
+  if (cw != ((_FPU_DEFAULT & ~_FPU_RESERVED & ~_FPU_DEC_RC_TOWARDZERO)
+			  | _FPU_RC_DOWN))
+    {
+      printf ("control word is 0x%.16llx but should be 0x%.16llx.\n",
+	      (unsigned long long int) cw,
+	      (unsigned long long int) ((_FPU_DEFAULT
+					 & ~_FPU_RESERVED
+					 & ~_FPU_DEC_RC_TOWARDZERO)
+					| _FPU_RC_DOWN));
+      return 1;
+    }
+#endif
+#endif /* _FPU_GETCW */
+  return 0;
+}
--- glibc.orig/sysdeps/powerpc/math/test-powerpc-fenv.c	1969-12-31 18:00:00.000000000 -0600
+++ glibc.new/sysdeps/powerpc/math/test-powerpc-fenv.c	2007-10-09 11:57:54.000000000 -0500
@@ -0,0 +1,85 @@
+/* Tests to verify proper behavior of internal fenv functions which operate
+   directly upon the 32 or 64 bit FPSCR.
+
+   Copyright (C) 2007 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by IBM Corporation, 2007.
+   Author(s): Ryan S. Arnold <rsa@us.ibm.com>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <fenv_libc.h>
+#include <stdio.h>
+
+int
+main (void)
+{
+  fenv_union_t fe;
+  fenv_union_t orig;
+  orig.fenv = fegetenv_register();
+  fe.fenv = fegetenv_register();
+
+  /* Set a decimal float rounding mode in the high order word.  */
+  fe.l[0] |= 0x01;
+
+  fesetenv_register(fe.fenv);
+  fe.fenv = fegetenv_register();
+
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+  /* Make sure the high order bits are saved to the FPSCR and restored.  */
+  if (fe.l[0] != (orig.l[0] | 0x01))
+    {
+      printf ("fenv_t highword is 0x%.8x but should be 0x%.8x.\n",
+	      (unsigned int) fe.l[0],
+	      (unsigned int) (orig.l[0] | 0x01));
+      return 1;
+    }
+#else
+  /* Make sure the high order bits are discarded.  */
+  if (fe.l[0] != (orig.l[0] & ~0x01))
+    {
+      printf ("fenv_t highword is 0x%.8x but should be 0x%.8x.\n",
+	      (unsigned int) fe.l[0],
+	      (unsigned int) (orig.l[0] & ~0x01));
+      return 1;
+    }
+#endif
+
+  relax_fenv_state();
+  fe.fenv = fegetenv_register();
+
+#if defined _ARCH_PWR6 || defined _ARCH_PWR6X
+  /* Make sure the decimal rounding mode triple is 0x0 since '000' is decimal
+   * round to nearest.  */
+  if (fe.l[0] != 0x00)
+    {
+      printf ("fenv_t highword is 0x%.8x but should be 0x%.8x.\n",
+	      (unsigned int) fe.l[0],
+	      (unsigned int) 0x0);
+      return 1;
+    }
+#else
+  /* Make sure the high order word is empty.  */
+  if (fe.l[0] != 0x0)
+    {
+      printf ("fenv_t highword is 0x%.8x but should be 0x%.8x.\n",
+	      (unsigned int) fe.l[0],
+	      (unsigned int) 0x0);
+      return 1;
+    }
+#endif
+  return 0;
+}
--- glibc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S	2006-03-16 05:48:55.000000000 -0600
+++ glibc.new/sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S	2007-10-09 12:08:40.000000000 -0500
@@ -1,5 +1,5 @@
 /* Jump to a new context powerpc32 common.
-   Copyright (C) 2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -27,6 +27,8 @@
    Any archecture that implements the Vector unit is assumed to also 
    implement the floating unit.  */

+#include <fpscr.h>
+
 /* Stack frame offsets.  */
 #define _FRAME_BACKCHAIN	0
 #define _FRAME_LR_SAVE		4
@@ -199,7 +201,7 @@
 	/* Restore the floating-point registers */
 	lfd	fp31,_UC_FREGS+(32*8)(r31)
 	lfd	fp0,_UC_FREGS+(0*8)(r31)
-	mtfsf	0xff,fp31
+	_RESTORE_FPSCR (fp31)
 	lfd	fp1,_UC_FREGS+(1*8)(r31)
 	lfd	fp2,_UC_FREGS+(2*8)(r31)
 	lfd	fp3,_UC_FREGS+(3*8)(r31)
--- glibc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S	2006-01-06 21:56:26.000000000 -0600
+++ glibc.new/sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S	2007-10-09 12:08:50.000000000 -0500
@@ -1,5 +1,5 @@
 /* Save current context and jump to a new context.
-   Copyright (C) 2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -27,6 +27,8 @@
    Any archecture that implements the Vector unit is assumed to also 
    implement the floating unit.  */

+#include <fpscr.h>
+
 /* Stack frame offsets.  */
 #define _FRAME_BACKCHAIN	0
 #define _FRAME_LR_SAVE		4
@@ -425,7 +427,7 @@
 	/* Restore the floating-point registers */
 	lfd	fp31,_UC_FREGS+(32*8)(r31)
 	lfd	fp0,_UC_FREGS+(0*8)(r31)
-	mtfsf	0xff,fp31
+	_RESTORE_FPSCR (fp31)
 	lfd	fp1,_UC_FREGS+(1*8)(r31)
 	lfd	fp2,_UC_FREGS+(2*8)(r31)
 	lfd	fp3,_UC_FREGS+(3*8)(r31)
--- glibc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S	2006-03-16 05:48:54.000000000 -0600
+++ glibc.new/sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S	2007-10-09 12:09:25.000000000 -0500
@@ -1,5 +1,5 @@
 /* Switch to context.
-   Copyright (C) 2002, 2004, 2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 2002, 04, 05, 06, 07 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -27,6 +27,8 @@
 #include "ucontext_i.h"
 #include <asm/errno.h>

+#include <fpscr.h>
+
 #if SHLIB_COMPAT (libc, GLIBC_2_3, GLIBC_2_3_4)
 ENTRY(__novec_setcontext)
 	CALL_MCOUNT 1
@@ -65,7 +67,7 @@
   lfd  fp0,(SIGCONTEXT_FP_REGS+(32*8))(r31)
   lfd  fp31,(SIGCONTEXT_FP_REGS+(PT_R31*8))(r31)
   lfd  fp30,(SIGCONTEXT_FP_REGS+(PT_R30*8))(r31)
-  mtfsf  0xff,fp0
+  _RESTORE_FPSCR (fp0)
   lfd  fp29,(SIGCONTEXT_FP_REGS+(PT_R29*8))(r31)
   lfd  fp28,(SIGCONTEXT_FP_REGS+(PT_R28*8))(r31)
   lfd  fp27,(SIGCONTEXT_FP_REGS+(PT_R27*8))(r31)
@@ -346,7 +348,7 @@
   lfd  fp0,(SIGCONTEXT_FP_REGS+(32*8))(r31)
   lfd  fp31,(SIGCONTEXT_FP_REGS+(PT_R31*8))(r31)
   lfd  fp30,(SIGCONTEXT_FP_REGS+(PT_R30*8))(r31)
-  mtfsf  0xff,fp0
+  _RESTORE_FPSCR (fp0)
   lfd  fp29,(SIGCONTEXT_FP_REGS+(PT_R29*8))(r31)
   lfd  fp28,(SIGCONTEXT_FP_REGS+(PT_R28*8))(r31)
   lfd  fp27,(SIGCONTEXT_FP_REGS+(PT_R27*8))(r31)
--- glibc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S	2006-03-16 05:48:54.000000000 -0600
+++ glibc.new/sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S	2007-10-09 12:09:39.000000000 -0500
@@ -1,5 +1,5 @@
 /* Save current context and install the given one.
-   Copyright (C) 2002, 2004, 2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 2002, 04, 05, 06, 07 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -27,6 +27,8 @@
 #include "ucontext_i.h"
 #include <asm/errno.h>

+#include <fpscr.h>
+
 #if SHLIB_COMPAT (libc, GLIBC_2_3, GLIBC_2_3_4)
 ENTRY(__novec_swapcontext)
 	CALL_MCOUNT 2
@@ -160,7 +162,7 @@
   lfd  fp0,(SIGCONTEXT_FP_REGS+(32*8))(r31)
   lfd  fp31,(SIGCONTEXT_FP_REGS+(PT_R31*8))(r31)
   lfd  fp30,(SIGCONTEXT_FP_REGS+(PT_R30*8))(r31)
-  mtfsf  0xff,fp0
+  _RESTORE_FPSCR (fp0)
   lfd  fp29,(SIGCONTEXT_FP_REGS+(PT_R29*8))(r31)
   lfd  fp28,(SIGCONTEXT_FP_REGS+(PT_R28*8))(r31)
   lfd  fp27,(SIGCONTEXT_FP_REGS+(PT_R27*8))(r31)
@@ -646,7 +648,7 @@
   lfd  fp0,(SIGCONTEXT_FP_REGS+(32*8))(r31)
   lfd  fp31,(SIGCONTEXT_FP_REGS+(PT_R31*8))(r31)
   lfd  fp30,(SIGCONTEXT_FP_REGS+(PT_R30*8))(r31)
-  mtfsf  0xff,fp0
+  _RESTORE_FPSCR (fp0)
   lfd  fp29,(SIGCONTEXT_FP_REGS+(PT_R29*8))(r31)
   lfd  fp28,(SIGCONTEXT_FP_REGS+(PT_R28*8))(r31)
   lfd  fp27,(SIGCONTEXT_FP_REGS+(PT_R27*8))(r31)
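
The behaviour that test-fpucw.c expects can be modelled without PowerPC
hardware.  The sketch below is a plain C model, not the real macros: it
does not touch the FPSCR, the ex_ function names are hypothetical, and
the only values taken from the patch are the two rounding-mode
constants.  It assumes that a 32-bit FPSCR simply drops the high-order
word on a set followed by a get, as described in the cover text.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patched fpu_control.h.  */
#define EX_DEC_RC_TOWARDZERO 0x0000000100000000ULL
#define EX_RC_DOWN           0x03ULL

/* Model of _FPU_SETCW followed by _FPU_GETCW.  With a 64-bit FPSCR the
   whole control word survives; with a 32-bit FPSCR only the low-order
   word does.  */
uint64_t
ex_set_then_get (uint64_t cw, int fpscr_is_64bit)
{
  return fpscr_is_64bit ? cw : (cw & 0xffffffffULL);
}

/* Extract the 3-bit decimal rounding mode from the low bits of the
   high-order word.  */
unsigned int
ex_decimal_rounding_mode (uint64_t cw)
{
  return (unsigned int) ((cw >> 32) & 0x7);
}

/* The decimal rounding mode survives only in the 64-bit model.  */
void
ex_check_model (void)
{
  uint64_t cw = EX_DEC_RC_TOWARDZERO | EX_RC_DOWN;

  assert (ex_decimal_rounding_mode (ex_set_then_get (cw, 1)) == 1);
  assert (ex_decimal_rounding_mode (ex_set_then_get (cw, 0)) == 0);
}
```

In both models the binary rounding mode in the low-order word is
preserved, which is what the two branches of the test check for.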

Follow-Ups:
- Re: [PATCH] PowerPC: Extend fpu fenv operations to operate on 64-bit FPSCR
  - From: Ryan S. Arnold
