This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Add a new macro to mask a float
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 29 Jun 2016 15:56:25 -0300
- Subject: Re: [PATCH] Add a new macro to mask a float
- References: <1467142073-13886-1-git-send-email-tuliom at linux dot vnet dot ibm dot com> <alpine dot DEB dot 2 dot 20 dot 1606282056460 dot 12650 at digraph dot polyomino dot org dot uk> <577402AC dot 5080208 at linaro dot org> <alpine dot DEB dot 2 dot 20 dot 1606291725470 dot 22371 at digraph dot polyomino dot org dot uk>
On 29/06/2016 14:34, Joseph Myers wrote:
> On Wed, 29 Jun 2016, Adhemerval Zanella wrote:
>
>> My understanding of this optimization is that it just replaces the FP to GPR
>> move, bitwise operation and GPR to FP move with a simpler bitwise operation
>> on the FP register itself. It is indeed equivalent to integer masking and I
>> believe 'normalized' here means making the float mask be represented
>> as the internal double required in VSX operations.
>
> What do you mean by "internal double"? Is this purely some fixed
> rearrangement of bits, so that e.g. subnormal float values still get
> represented as subnormals rather than like normal doubles?
In fact, float numbers are converted to double values, so 0x1p-149f would
be represented internally in the VSX register as
v4_int32 = {0x0, 0x0, 0x0, 0x36a00000}. And in fact this is an issue
(see below).
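To illustrate the representation above, here is a minimal portable C sketch
(not the patch's code; the function names are hypothetical) that widens a
float to double and extracts the high 32 bits of the result. It assumes float
and double are IEEE 754 binary32 and binary64, and it says nothing about the
order in which a debugger prints the words of a VSX register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the most significant 32 bits of the binary64 encoding of X.
   Hypothetical helper, for illustration only.  */
uint32_t
double_high_word (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (uint32_t) (bits >> 32);
}

/* Return the least significant 32 bits of the binary64 encoding of X.  */
uint32_t
double_low_word (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (uint32_t) (bits & 0xffffffffu);
}

/* Widening float to double is exact, so a subnormal float becomes a
   normal double: 0x1p-149 has biased exponent -149 + 1023 = 0x36a and
   a zero fraction.  */
void
check_widened_subnormal (void)
{
  double d = (double) 0x1p-149f;
  assert (double_high_word (d) == 0x36a00000u);
  assert (double_low_word (d) == 0u);
}
```

The point of the sketch is that the least subnormal float is a normal number
once widened, so a mask applied to the double encoding does not touch the
same bits as a mask applied to the float encoding.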
>
> Say the number is the least subnormal float - 0x1p-149f, integer
> representation 1 - and that it's masked with 0xfffff000, as in the various
> MASK_FLOAT calls. Can you confirm that the instruction sequence in the
> patch produces 0.0f, as the integer masking does, when executed on a
> POWER8 processor? And that if instead the value is 0x1p-137f, it's
> returned unchanged?
>
> If equivalent to integer masking for all inputs including subnormals and
> infinities and NaNs, then my previous point applies that this should be a
> compiler optimization instead of a glibc patch.
>
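For reference, the integer masking discussed above can be sketched in
portable C as follows. This is only an illustration under the assumption that
float is IEEE 754 binary32; the helper names are hypothetical and this is not
the glibc macro itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of F as a 32-bit integer (hypothetical helper).  */
uint32_t
float_to_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Reinterpret the 32-bit integer U as a float (hypothetical helper).  */
float
bits_to_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Mask the binary32 encoding of F with MASK using integer operations.  */
float
mask_float_int (float f, uint32_t mask)
{
  return bits_to_float (float_to_bits (f) & mask);
}

/* The two cases raised in the review.  */
void
check_mask_examples (void)
{
  /* 0x1p-149f has integer representation 1, so the mask clears it.  */
  assert (float_to_bits (mask_float_int (0x1p-149f, 0xfffff000u)) == 0u);
  /* 0x1p-137f has integer representation 0x1000, which the mask keeps.  */
  assert (float_to_bits (0x1p-137f) == 0x1000u);
  assert (mask_float_int (0x1p-137f, 0xfffff000u) == 0x1p-137f);
}
```

Any register-only sequence would need to give the same results as
mask_float_int for all inputs, including subnormals, infinities and NaNs.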
Now that you have raised these questions, I do not think this change is safe
for float values on POWER. The current patch does:
__asm__ ("xvmovdp %x2, %x2\n\t" \
"xxland %x0, %x2, %1\n\t" \
And I think 'xvmovdp' here is not what was really meant (it is
Copy Sign Double-Precision). I think what the algorithm meant was in fact:
__asm__ ("xvcvdpsp %x2, %x2\n\t" \
"xxland %x0, %x2, %1\n\t" \
"xvcvspdp %x0, %x0" \
This will first convert the internal double representation to float,
apply the mask and then convert back to double as intended. With this
change both 0x1p-149f and 0x1p-137f show the same behaviour with the default
implementation and with MASK_FLOAT.
However, both 'xvcvdpsp' and 'xvcvspdp' may change the floating-point status
depending on the argument, which is what this transformation aims to avoid.
Analysing it also raised some questions about how safe the
sysdeps/powerpc/fpu/s_float_bitwise.h operations are.