This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Compile AVX libm functions with -mavx
On Tue, Oct 2, 2012 at 4:07 PM, Matt Turner <mattst88@gmail.com> wrote:
> On Tue, Oct 2, 2012 at 1:19 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Oct 2, 2012 at 12:47 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
>>> On Tue, Oct 02, 2012 at 03:31:50PM -0400, Mike Frysinger wrote:
>>>> On Tuesday 02 October 2012 15:20:54 H.J. Lu wrote:
>>>> > On Tue, Oct 2, 2012 at 12:02 PM, Mike Frysinger <vapier@gentoo.org> wrote:
>>>> > > On Tuesday 02 October 2012 09:53:25 H.J. Lu wrote:
>>>> > >> This patch compiles AVX libm functions with -mavx. It reduces text size
>>>> > >> of libm.so by about 1%:
>>>> > >
>>>> > > looks like you're reverting 56f6f6a2403cfa7267cad722597113be35ecf70d.
>>>> > > shouldn't you revert all of it and not just change the CFLAGS back ?
>>>> >
>>>> > Doesn't this patch:
>>>> >
>>>> > http://sourceware.org/ml/libc-alpha/2012-10/msg00055.html
>>>> >
>>>> > do that?
>>>>
>>>> yes, i missed the follow up
>>>>
>>>> > > it'd be useful to know *why* Ulrich moved away from -mavx, but
>>>> > > unfortunately his commit message is useless.
>>>> >
>>>> > I can only guess:
>>>>
>>>> might be useful to put some notes (like referring to the older commit) into
>>>> the commit message when you do commit things
>>>> -mike
>>>
>>> could it be a 60 cycle penalty when switching between legacy sse and avx
>>> state?
>>
>> This is true. We can use -mprefer-avx128 to make sure that only 128-bit AVX
>> instructions are used.
>>
>> --
>> H.J.
>
> The latency for switching between old SSE and new (AVX-style
> 3-operand) form is what causes the penalty.

Latency comes from switching between the 128-bit SSE context and
the 256-bit AVX context. If we only use the lower 128-bit AVX context,
there is no latency.

> What is the purpose of
> -mprefer-avx128? I can't find a description of it online.

I just fixed it:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54785
-mprefer-avx128 will avoid 256-bit AVX instructions. Only 128-bit
AVX instructions are generated. It has the same effect on context
switch as -msse2avx.
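
As a rough illustration (a hypothetical example, not taken from glibc or
from the patch), a loop like the one below is the kind of code the
vectorizer may turn into AVX instructions. The function name axpy and the
file name axpy.c are made up for this sketch. Building it once with -mavx
and once with -mavx -mprefer-avx128, then checking the disassembly for
ymm registers, is one way to see whether 256-bit instructions were
generated.

```c
/* Hypothetical example, not part of glibc or the patch in this thread.
 *
 * Builds to compare (flags are the ones named in this thread):
 *   gcc -O3 -mavx -c axpy.c                  # may use 256-bit ymm registers
 *   gcc -O3 -mavx -mprefer-avx128 -c axpy.c  # should stay with 128-bit xmm
 * Inspect the result with: objdump -d axpy.o
 */
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n). */
void axpy(double a, const double *x, double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The result of the computation is the same either way; only the width of
the generated vector instructions is expected to differ.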
--
H.J.