This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Add x86-64 and SSE math support to i386 bits/mathinline.h
- From: Roland McGrath <roland at hack dot frob dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Wed, 6 Jun 2012 14:55:10 -0700 (PDT)
- Subject: Re: [PATCH] Add x86-64 and SSE math support to i386 bits/mathinline.h
- References: <20120606172605.GA16944@intel.com>
> @@ -120,26 +119,42 @@
>
> /* The gcc, version 2.7 or below, has problems with all this inlining
> code. So disable it for this version of the compiler. */
> -# if __GNUC_PREREQ (2, 8)
> +# if __GNUC_PREREQ (2, 8) && defined __USE_ISOC99
> +__BEGIN_NAMESPACE_C99
This is inside:
#if defined __USE_ISOC99 && defined __GNUC__ && __GNUC__ >= 2
so the change to the condition is not required. If __BEGIN_NAMESPACE_C99
is necessary that is not strictly part of the x86-{32,64} unification
(though I realize it is a difference between the two existing files) and
so should be in a separate change (and have a bug report).
> /* Test for negative number. Used in the signbit() macro. */
> __MATH_INLINE int
> __NTH (__signbitf (float __x))
> {
> +# ifndef __x86_64__
> __extension__ union { float __f; int __i; } __u = { __f: __x };
> return __u.__i < 0;
> +# else
It seems more natural to order things as ifdef rather than ifndef, so put
the x86_64 code first. But shouldn't this just test __SSE2__ instead?
> + int __m;
> + __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
> + return __m & 0x8;
> +# endif
The existing i386 function always returns 1 or 0, while this returns 8 or 0.
I realize the existing x86_64 code differs this way too, but that seems
questionable. So perhaps this should be "return (__m & 0x8) != 0;"?
> __MATH_INLINE int
> __NTH (__signbit (double __x))
> {
> +# ifndef __x86_64__
> __extension__ union { double __d; int __i[2]; } __u = { __d: __x };
> return __u.__i[1] < 0;
> +# else
> + int __m;
> + __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
> + return __m & 0x80;
> +# endif
The same two issues apply here.
> + /* Mark as volatile since the result is dependend on the state of
> + the SSE control register (the rounding mode). Otherwise GCC might
> + remove these assembler instructions since it does not know about
> + the rounding mode change and cannot currently be told. */
s/dependend/dependent/. There should be two spaces between the sentences.
I realize this is just code you copied from the existing x86_64 file, so
these nits are not your fault per se, but we might as well fix them now.
There are several more instances below which I won't bother to cite
individually.
> +# ifdef __SSE2_MATH__
> +__MATH_INLINE long int
> +__NTH (lrint (double __x))
> +{
> + long int __res;
> + /* Mark as volatile since the result is dependend on the state of
> + the SSE control register (the rounding mode). Otherwise GCC might
> + remove these assembler instructions since it does not know about
> + the rounding mode change and cannot currently be told. */
> + __asm __volatile__ ("cvtsd2si %1, %0" : "=r" (__res) : "xm" (__x));
> + return __res;
> +}
> +# endif
> +# ifdef __x86_64__
> +__MATH_INLINE long long int
> +__NTH (llrintf (float __x))
> +{
> + long long int __res;
> + /* Mark as volatile since the result is dependend on the state of
> + the SSE control register (the rounding mode). Otherwise GCC might
> + remove these assembler instructions since it does not know about
> + the rounding mode change and cannot currently be told. */
> + __asm __volatile__ ("cvtss2si %1, %0" : "=r" (__res) : "xm" (__x));
> + return __res;
> +}
> +__MATH_INLINE long long int
> +__NTH (llrint (double __x))
> +{
> + long long int __res;
> + /* Mark as volatile since the result is dependend on the state of
> + the SSE control register (the rounding mode). Otherwise GCC might
> + remove these assembler instructions since it does not know about
> + the rounding mode change and cannot currently be told. */
> + __asm __volatile__ ("cvtsd2si %1, %0" : "=r" (__res) : "xm" (__x));
> + return __res;
> +}
> +# endif
All this duplication can be replaced with an __lrint_code macro like the
existing i386 file uses.
I don't understand why llrint* are defined this way only for x86_64.
It seems appropriate for __SSE2_MATH__ on i386 as well.
Thanks,
Roland