This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: x86 *rint* functions: fastmath or not fastmath.
- From: Jeff Johnston <jjohnstn at redhat dot com>
- To: Dave Korn <dave dot korn at artimi dot com>
- Cc: newlib at sourceware dot org
- Date: Fri, 04 Jan 2008 13:27:03 -0500
- Subject: Re: x86 *rint* functions: fastmath or not fastmath.
- References: <011101c84e43$c64587c0$2e08a8c0@CAM.ARTIMI.COM>
Dave Korn wrote:
Hi Jeff, happy new year and hope you had a pleasant break,
When we left off before Christmas, I wrote:
I added the prototypes to math.h unconditionally - it's just
occurred to me as an afterthought that I should probably have put them in
fastmath.h
and you wrote:
After looking at it, I tend to agree with you that you have really
provided fastmath versions of the functions.
I'd now like to revisit that discussion. First off, I was seriously sleep
deprived at the time I wrote my email, and not thinking straight; the separate
concepts of fast math and hard float were somewhat tangled up in my head.
What I /meant/ to suggest was that the prototypes should be moved to an
x86-specific header file (since the corresponding functions don't exist on
other platforms), and due to confusion suggested fastmath.h when it's not
actually what I really wanted or meant at all, but my addled brain seized on
it just because it's a header file and it's x86-specific. Sorry for the
confusion.
So, when you replied agreeing with me, I'm not sure if I've just confused
the issue with my own lack of clarity, or if there's some other reason to say
these are fastmath functions.
The discussion and resolution was kind of rushed due to my self-imposed
1.16.0 deadline.
Which brings me to the nub of the issue: are these fast math functions, or
are they good enough to be first-class implementations?
I haven't seen any formal definition of the difference between fast and
non-fast math, but my understanding is that fast math might not be entirely
accurate or rounded correctly, might or might not handle exceptions correctly,
and might or might not optimise using assumptions such as associativity and
distributivity that aren't entirely valid for FP; that is, the issues are
accuracy and IEEE compliance.
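As a small illustration of the associativity point, here is a minimal C sketch. The helper names sum_left and sum_right are made up for this example and are not part of newlib. It assumes IEEE double arithmetic; the volatile temporaries are only there to force each intermediate sum to be rounded to double.

```c
/* Hypothetical helpers for illustration only (not newlib code). */

/* Computes (a + b) + c, rounding the intermediate sum to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Computes a + (b + c), rounding the intermediate sum to double. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left yields 1.0 while sum_right yields 0.0, because 1.0 is lost when it is added to -1e20 first. A compiler that reassociates under a fast-math option is free to pick either result.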
Pretty much. A fast math routine usually uses a hardware instruction or
a coding trick. It often isn't prepared for all possible inputs (e.g.
NaNs, Infs, or extreme values). It sometimes has accuracy implications
when compared to the IEEE soft-float versions (e.g. a sin
instruction might not handle extremely large or extremely small values
properly, or there is a hardware limitation on the accuracy). The odd
situations usually require special-casing, as exemplified in the
soft-float routines, that might not be present in the instruction
logic. The fact that fastmath is a purely optional optimization lets
it off the hook for full compliance.
To the best of my knowledge, I can't see why these functions would meet any
of those conditions. The x87 produces IEEE-conformant results for single
operations like these, the excess precision between stages of a prolonged
series of fpu insns shouldn't come into play here (obviously this
consideration might be different if we were discussing anything except the
round-to-integer instructions), and the x87 handles all the exceptions and
status codes properly - and we don't even have support for fenv.h and the
fe{set,get}* exception and status handling parts of the library yet.
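For reference, a hardware round-to-integer routine of the kind under discussion could look roughly like the sketch below. This is an illustration and not the patch itself: the name x87_rint_sketch is made up, and the code assumes GCC-style inline assembly on an x86 target, falling back to the C library's rint elsewhere.

```c
#include <math.h>

/* Hypothetical helper, not the newlib source: round x to an integral
   value in the current rounding mode using the x87 frndint instruction. */
double x87_rint_sketch(double x)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    /* The "+t" constraint places x in st(0), the top of the x87 stack,
       as both input and output. */
    __asm__ ("frndint" : "+t" (x));
    return x;
#else
    /* Assumption: on other targets, defer to the C library. */
    return rint(x);
#endif
}
```

In the default round-to-nearest mode this rounds 2.5 to 2.0 and 3.5 to 4.0 (ties to even), keeps the sign of -0.0, and passes NaN and infinity through, with no special-casing in the C code.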
So, as far as I can see, these functions ought to be good enough for
first-class library functions. I wonder whether you agree with my reasoning
or not here, and whether you'd consider a patch that completely overrode the
common/ soft implementations of rint/rintf/lrint/lrintf altogether? If not,
we might still decide that they're "good enough" for cygwin and I'll supply a
patch that only affects cygwin, but I can't decide which yet until I
understand the reasoning for calling them fast-math or not.
If they match or surpass the accuracy of the soft-float versions for the
full range of inputs and set errno appropriately, then they are
certainly first-class and can override.
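One way to check the "match for the full range of inputs" criterion is to compare a candidate against a reference bit for bit over a table of inputs. The sketch below is only an illustration: count_mismatches, same_result, and the two stand-in functions are hypothetical names, and the stand-ins are trivial negation routines rather than the rint variants, chosen because they differ on exactly one edge case.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef double (*unary_fn)(double);

/* Bitwise comparison, so that +0.0 and -0.0 count as different.
   Any NaN is treated as matching any other NaN. */
static int same_result(double a, double b)
{
    uint64_t ua, ub;
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

/* Counts the inputs for which candidate and reference disagree. */
size_t count_mismatches(unary_fn candidate, unary_fn reference,
                        const double *inputs, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (!same_result(candidate(inputs[i]), reference(inputs[i])))
            bad++;
    return bad;
}

/* Stand-in functions (assumptions for the demo, not library code). */
double ref_negate(double x)  { return -x; }
double cand_negate(double x) { return 0.0 - x; }

/* Runs the comparison over a few edge-case inputs. */
size_t demo_mismatches(void)
{
    static const double inputs[] = { 0.0, -0.0, 1.5, -2.5, INFINITY, NAN };
    return count_mismatches(cand_negate, ref_negate, inputs,
                            sizeof inputs / sizeof inputs[0]);
}
```

Here demo_mismatches reports one mismatch: for an input of +0.0 the reference returns -0.0 while the candidate returns +0.0, a difference that an ordinary == comparison would hide. The same harness could be pointed at a hardware rint and the soft-float rint with a much larger input table.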
BTW, my plan is to provide fenv.h and implement the related functions. I
care most about cygwin, which does use the fast math implementations (and my
major motivator is to speed up FP in cygwin and provide control over the x87
hardware fpu features), so I'm not really keen to provide soft
implementations. For that reason, would you prefer if I generate my patches
to target cygwin only, or would you be happy with patches that add new
functionality to all i386 builds, but only when using hard float?
I'm ok with adding such functionality to all i386 builds.
cheers,
DaveK