This is the mail archive of the libc-alpha@sourceware.cygnus.com mailing list for the glibc project.


Re: Enable

To: aj at suse dot de
Subject: Re: Enable
From: "David H. Munro" <munro at icf dot llnl dot gov>
Date: Mon, 4 Oct 1999 12:07:31 -0700
CC: libc-alpha at sourceware dot cygnus dot com


>David's suggestion leads to a minimal patch - but opens a can of worms
>(everybody likes to have his favorite environment set).  I rather
>would like to have a general solution and prefer Geoff's proposal.

I agree that a third FE_..._ENV macro for fesetenv isn't likely to
solve the problem, except for the person whose preference it
represents.

>The C9x <fenv.h> interface isn't nearly as flexible as some programs
>would like.  You can't request signals for a specific subset of the
>IEEE exceptions, for instance.  I'd like to see the SVID <ieeefp.h>
>interface in 2.2.

Both Solaris and HPUX provide the fpgetmask/fpsetmask functions.
However, Solaris declares them in <ieeefp.h>, while HPUX declares
them in <math.h>.

This interface is fine, but it doesn't address the fast underflow
issue.  HPUX provides the additional fpsetfastmode function to set
that FPU bit, while Solaris does not (even though some Sun hardware
does slow underflows by default).  The extra-cost libsunmath library
on Solaris provides the old SunOS4 function nonstandard_arithmetic to
set the fast underflow bit (and ieee_handler to set exception mask
bits).

The ieeefp.h interface also duplicates some of the fenv.h
functionality; I think it is confusing to have fenv.h and ieeefp.h
both addressing FPU status and control.  On the other hand, I don't
know the policy on adding
things to ANSI standard header files.  It's clear that the C9X fenv.h
standard is deficient (appalling, since we are only talking about a
few dozen bits in at most two registers on any machine) -- is the
preference in such cases to augment fenv.h or to go with a second
interface like ieeefp.h?

As a practical matter for supporting existing code, I can tell you
that it doesn't matter at all -- anybody who is trying to support a
code that needs this already has so many ifdefs for different
platforms that another one won't make a particle of difference.

>I'd much rather just [add] extra calls, `feenableexcept',
>`fedisableexcept' which do this:
>
>int feenableexcept(int excepts);
>
>Returns a bitmap containing those exceptions in 'excepts' which were
>enabled.  Will not enable _more_ exceptions than 'excepts' specifies;
>returns 0 if 'excepts' is not FE_ALL_EXCEPT and cannot enable or
>disable exceptions individually.
>
>Note that on PPC, at least, you can test whether exceptions have
>occurred at a finer level than you can enable or disable them.  So if
>you ask to have just FE_INVALID_SQRT enabled, it'll fail because you can
>only enable all the FE_INVALID exceptions at once.

This straightforward extension to fenv.h sounds better to me also.
The fact that the macros FE_DIVBYZERO, FE_INVALID, etc. exist fools
many people into thinking that they can be used as arguments to
fesetenv to affect the exception mask bits (as opposed to the status
bits).

Again, you need to watch out for the fast underflow bit on several
types of FPU hardware (at least HP PA-RISC and SPARC; I'm not sure
about alphas).  Since that bit cannot be expressed with the exception
mask macros, you probably need to add another pair of functions --
perhaps fesetfastmode/fegetfastmode.  These would be no-ops where the
hardware doesn't support the concept, but would set/get the
appropriate FPU control bit where it does.
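
A minimal sketch of what such a pair might look like follows.  The
names my_fesetfastmode and my_fegetfastmode are hypothetical and
modelled on the suggestion above; they are not an existing libc
interface.  As an assumption for the sake of a compilable example, the
only hardware branch is x86 with SSE, where the flush-to-zero bit in
MXCSR plays the role of the fast underflow bit; PA-RISC and SPARC
would each need their own branch, which is left out here.  Everywhere
else the functions are no-ops.

```c
#include <float.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* _MM_GET_FLUSH_ZERO_MODE, _MM_SET_FLUSH_ZERO_MODE */
#endif

/* Returns 1 if fast (flush-to-zero) underflow is active, else 0. */
int my_fegetfastmode(void)
{
#if defined(__SSE__)
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
#else
    return 0;            /* concept not supported: report standard mode */
#endif
}

/* Turns fast underflow on (nonzero) or off (zero).
   Returns the previous setting so the caller can restore it. */
int my_fesetfastmode(int on)
{
    int old = my_fegetfastmode();
#if defined(__SSE__)
    _MM_SET_FLUSH_ZERO_MODE(on ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
#else
    (void)on;            /* no-op where the hardware lacks the bit */
#endif
    return old;
}

/* Divides the smallest normal double by 3.  The exact result is
   subnormal and inexact, so it underflows: IEEE mode returns a tiny
   nonzero value, fast mode returns zero. */
double third_of_dbl_min(void)
{
    volatile double x = DBL_MIN;    /* volatile: force a run-time divide */
    volatile double y = x / 3.0;
    return y;
}
```

Returning the old setting lets a library switch modes around a hot
loop and then put things back as it found them.  On 32-bit x86 builds
that do double arithmetic on the x87 unit, the MXCSR bit does not
affect the result.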

-----------------
Here's some documentation on the "fast underflow" settings.  HP lost a
multimillion dollar contract at Livermore when an early PA-RISC
machine came out because they didn't get that bit set correctly when
running our benchmark FFT code; it's not wise to ignore the issue.
Nobody figured out why the machine we were sure was slower ran the
benchmark faster until it was too late.

HPUX fpsetfastmode man page excerpt:

  fpgetfastmode() and fpsetfastmode() allow the programmer to change the
  way the system handles underflow.  Fast underflow mode, also known as
  fastmode, is an alternative to IEEE-754-compliant underflow mode.  On
  Series 700/800 systems, most underflow cases are supported by trapping
  into the kernel, where the IEEE-mandated conversion of the result into
  a denormalized value or zero is accomplished by software emulation.
  On later PA1.1 systems and on all PA2.0 systems, fastmode causes the
  hardware to simply substitute a zero for the result of an operation,
  with no fault occurring.  This may be a significant performance
  optimization for applications that underflow frequently.  Fastmode
  also causes denormalized floating-point operands to be treated as if
  they were true zero operands.

Solaris nonstandard_arithmetic man page excerpt:

  nonstandard_arithmetic() and standard_arithmetic() are meaningful on
  systems that provide an alternative faster mode of floating-point
  arithmetic that does not conform to the default IEEE Standard.
  Nonstandard modes vary among implementations; nonstandard mode may,
  for instance, result in setting subnormal results to zero or in
  treating subnormal operands as zero, or both, or something else.
  standard_arithmetic() reverts to the default standard mode.  On
  systems that provide only one mode, these functions have no effect.

Follow-Ups:
- Re: Enable
  - From: Andreas Jaeger

References:
- Enable
  - From: Andreas Jaeger
