This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Subject: single-precision "expf" super slow on x86-64??
- From: Miles Bader <miles at gnu dot org>
- To: libc-help at sourceware dot org
- Date: Sun, 29 Apr 2012 14:00:10 +0900
- Subject: single-precision "expf" super slow on x86-64??
Hi, I've run into the following very odd behavior:
On my debian x86-64 system, the single-precision "expf" function seems
to be about six times _slower_ than the double-precision "exp"
function!
Moreover, "expf" seems to be far slower than other "slow" math
functions (sin, cos, etc) while "exp" seems to be roughly the same
speed as them.
I first noticed this in a real program, which seemed oddly slow, and
got it to speed up significantly by using exp instead of expf (even
though the values being manipulated are all single-precision, so I
don't need exp).
I've duplicated this in the attached test program; here's the output:
$ gcc-4.7 -o exp-test -O2 -march=native exp-test.c -lm
$ ./exp-test
fisum = 4.99891e+06, disum = 4.99891e+06
fosum = 1.71808e+07, dosum = 1.71808e+07
float (expf) user time: 2.94819 sec
double (exp) user time: 0.520032 sec
The first two lines of output just verify that the inputs and results
match for the float and double calculations. The last two lines show
the user CPU times (via getrusage): "expf" takes about six times as
long as "exp"...
The glibc version is 2.13, the compiler is gcc 4.7 (compiling for
x86-64), and the CPU is an AMD Phenom.
Anybody have any idea what's going on? This behavior seems very
weird...
[As a workaround, I could modify my program to just use "exp", even on
single-precision values, but this seems a fragile hack, and detecting
this odd situation with autoconf seems ... annoying...]
Thanks,
-Miles
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/resource.h>

#define NITERS 10000000

float finps[NITERS], foutps[NITERS];
double dinps[NITERS], doutps[NITERS];

int main ()
{
  struct rusage start_float_ru, end_float_ru;
  struct rusage start_double_ru, end_double_ru;
  int i;

  /* Generate the same random inputs for both precisions.  */
  for (i = 0; i < NITERS; i++)
    dinps[i] = drand48() + 1e-6;
  for (i = 0; i < NITERS; i++)
    finps[i] = (float) (dinps[i]);

  /* Time the single-precision loop.  */
  getrusage (RUSAGE_SELF, &start_float_ru);
  for (i = 0; i < NITERS; i++)
    foutps[i] = expf (finps[i]);
  getrusage (RUSAGE_SELF, &end_float_ru);

  /* Time the double-precision loop.  */
  getrusage (RUSAGE_SELF, &start_double_ru);
  for (i = 0; i < NITERS; i++)
    doutps[i] = exp (dinps[i]);
  getrusage (RUSAGE_SELF, &end_double_ru);

  /* Sum inputs and outputs so the float and double runs can be
     compared, and so the results are actually used.  */
  double fisum = 0;
  for (i = 0; i < NITERS; i++)
    fisum += finps[i];
  double disum = 0;
  for (i = 0; i < NITERS; i++)
    disum += dinps[i];

  double fosum = 0;
  for (i = 0; i < NITERS; i++)
    fosum += foutps[i];
  double dosum = 0;
  for (i = 0; i < NITERS; i++)
    dosum += doutps[i];

  printf ("fisum = %g, disum = %g\n", fisum, disum);
  printf ("fosum = %g, dosum = %g\n", fosum, dosum);

  /* Elapsed user CPU time = seconds difference + microseconds
     difference / 1e6.  */
  printf ("float (expf) user time: %g sec\n",
          ((double)(end_float_ru.ru_utime.tv_sec
                    - start_float_ru.ru_utime.tv_sec)
           +(double)(end_float_ru.ru_utime.tv_usec
                     - start_float_ru.ru_utime.tv_usec) / 1.e6));
  printf ("double (exp) user time: %g sec\n",
          ((double)(end_double_ru.ru_utime.tv_sec
                    - start_double_ru.ru_utime.tv_sec)
           +(double)(end_double_ru.ru_utime.tv_usec
                     - start_double_ru.ru_utime.tv_usec) / 1.e6));

  return 0;
}
--
Corporation, n. An ingenious device for obtaining individual profit without
individual responsibility.