This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] Split mantissa calculation loop and add branch prediction tomp multiplication
- From: Siddhesh Poyarekar <siddhesh at redhat dot com>
- To: libc-alpha at sourceware dot org
- Date: Mon, 31 Dec 2012 14:58:50 +0530
- Subject: [PATCH] Split mantissa calculation loop and add branch prediction tomp multiplication
Hi,
The loop inside __mul in mpa.c is slow because of the branch
evaluation within the loop. Attached patch splits the loop to make
the loop simpler to evaluate and also adds some branch prediction to
improve performance. I have verified that this does not break the
testsuite.
I checked for performance improvement with this patch using the test
program here:
http://entropymine.com/imageworsener/slowpow/
with the following commandline:
time ./powtest 1000000 1.0000000000000020 1.5000000000500000
Taking into account the best times with and without the patch (with
more than 10 runs each), the improvement is about a quarter of a
second.
OK to commit?
Siddhesh
ChangeLog:
* sysdeps/ieee754/dbl-64/mpa.c (__mul): Split mantissa
calculation loop and add branch prediction.
diff --git a/sysdeps/ieee754/dbl-64/mpa.c b/sysdeps/ieee754/dbl-64/mpa.c
index 78afd99..278fdb6 100644
--- a/sysdeps/ieee754/dbl-64/mpa.c
+++ b/sysdeps/ieee754/dbl-64/mpa.c
@@ -447,33 +447,51 @@ void
SECTION
__mul(const mp_no *x, const mp_no *y, mp_no *z, int p) {
- int i, i1, i2, j, k, k2;
+ int i, j, k, k2;
double u;
- /* Is z=0? */
- if (X[0]*Y[0]==ZERO)
- { Z[0]=ZERO; return; }
-
- /* Multiply, add and carry */
- k2 = (p<3) ? p+p : p+3;
- Z[k2]=ZERO;
- for (k=k2; k>1; ) {
- if (k > p) {i1=k-p; i2=p+1; }
- else {i1=1; i2=k; }
- for (i=i1,j=i2-1; i<i2; i++,j--) Z[k] += X[i]*Y[j];
-
- u = (Z[k] + CUTTER)-CUTTER;
- if (u > Z[k]) u -= RADIX;
- Z[k] -= u;
- Z[--k] = u*RADIXI;
- }
+ /* Is z=0? */
+ if (__glibc_unlikely (X[0] * Y[0] == ZERO))
+ {
+ Z[0]=ZERO;
+ return;
+ }
- /* Is there a carry beyond the most significant digit? */
- if (Z[1] == ZERO) {
- for (i=1; i<=p; i++) Z[i]=Z[i+1];
- EZ = EX + EY - 1; }
- else
- EZ = EX + EY;
+ /* Multiply, add and carry */
+ k2 = (__glibc_unlikely (p < 3)) ? p + p : p + 3;
+ Z[k2] = ZERO;
+
+ for (k = k2; k > p; )
+ {
+ for (i = k - p, j = p; i < p + 1; i++, j--)
+ Z[k] += X[i] * Y[j];
+
+ u = (Z[k] + CUTTER)-CUTTER;
+ if (u > Z[k]) u -= RADIX;
+ Z[k] -= u;
+ Z[--k] = u * RADIXI;
+ }
+
+ for (; k > 1; )
+ {
+ for (i = 1,j = k - 1; i < k; i++, j--)
+ Z[k] += X[i] * Y[j];
+
+ u = (Z[k] + CUTTER)-CUTTER;
+ if (u > Z[k])
+ u -= RADIX;
+ Z[k] -= u;
+ Z[--k] = u*RADIXI;
+ }
+
+ EZ = EX + EY;
+ /* Is there a carry beyond the most significant digit? */
+ if (__glibc_unlikely (Z[1] == ZERO))
+ {
+ for (i = 1; i <= p; i++)
+ Z[i] = Z[i+1];
+ EZ--;
+ }
Z[0] = X[0] * Y[0];
}