This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Consolidate and inline calls to __acr


On Mon, Dec 31, 2012 at 05:22:40PM +0100, Andreas Schwab wrote:
>      else        {i1=1;   i2=k;   }
> -    for (i=i1,j=i2-1; i<i2; i++,j--)  Z[k] += X[i]*Y[j];
> +#if 1
> +    /* Rearrange this inner loop to allow the fmadd instructions to be
> +       independent and execute in parallel on processors that have
> +       dual symmetrical FP pipelines.  */
> +    if (i1 < (i2-1))
> +    {
> +	/* make sure we have at least 2 iterations */
> +	if (((i2 - i1) & 1L) == 1L)
> +	{
> +                /* Handle the odd iterations case.  */
> +		zk2 = x->d[i2-1]*y->d[i1];
> +	}
> +	else
> +		zk2 = zero.d;
> +	/* Do two multiply/adds per loop iteration, using independent
> +           accumulators; zk and zk2.  */
> +	for (i=i1,j=i2-1; i<i2-1; i+=2,j-=2) 
> +	{
> +		zk += x->d[i]*y->d[j];
> +		zk2 += x->d[i+1]*y->d[j-1];
> +	}
> +	zk += zk2; /* final sum.  */
> +    }
> +    else
> +    {
> +        /* Special case when iterations is 1.  */
> +	zk += x->d[i1]*y->d[i1];
> +    }
> +#else
> +    /* The original code.  */
> +    for (i=i1,j=i2-1; i<i2; i++,j--)  zk += X[i]*Y[j];
> +#endif
>  

... and this seems to be the only relevant change.  If the conversion
of mp_no.d[] to int[] is approved, this should not even be necessary
since FP is not needed at all.
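
For reference, the rearrangement in the quoted hunk can be sketched in
isolation as below.  This is only an illustration under assumptions, not
the code in the patch: the function names conv_simple and conv_dual are
hypothetical, plain double arrays stand in for x->d[] and y->d[], and the
accumulators start at zero rather than at a running zk.  An empty range
returns 0.0, matching the original loop.

```c
/* Single accumulator, as in the original loop: each add depends on
   the result of the previous one.  */
double conv_simple(const double *x, const double *y, long i1, long i2)
{
    double zk = 0.0;
    for (long i = i1, j = i2 - 1; i < i2; i++, j--)
        zk += x[i] * y[j];
    return zk;
}

/* Two independent accumulators, so that consecutive multiply/adds do
   not depend on each other.  */
double conv_dual(const double *x, const double *y, long i1, long i2)
{
    double zk = 0.0, zk2 = 0.0;
    long n = i2 - i1;               /* number of products to sum */

    if (n <= 0)
        return 0.0;                 /* empty range, nothing to add */

    /* With an odd count, the last product (i = i2-1, j = i1) is not
       reached by the pairwise loop, so take it up front.  */
    if (n & 1L)
        zk2 = x[i2 - 1] * y[i1];

    /* Two products per iteration, one into each accumulator.  */
    for (long i = i1, j = i2 - 1; i < i2 - 1; i += 2, j -= 2) {
        zk  += x[i] * y[j];
        zk2 += x[i + 1] * y[j - 1];
    }
    return zk + zk2;                /* final sum */
}
```

Note that the two versions add the products in a different order, so with
floating point they are only guaranteed to agree when every intermediate
sum is exactly representable.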


Siddhesh

