This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- From: Will Newton <will dot newton at linaro dot org>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: libc-ports at sourceware dot org, Patch Tracking <patches at linaro dot org>
- Date: Mon, 15 Apr 2013 19:31:15 +0100
- Subject: Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- References: <516BCEE5 dot 9070809 at linaro dot org> <516C357F dot 40406 at twiddle dot net> <CANu=Dmj5b2oFqvALjR5x_+Nwg8CRQSSazP9omFroin2SDqb_Tg at mail dot gmail dot com> <516C4554 dot 3090202 at twiddle dot net>
On 15 April 2013 19:22, Richard Henderson <rth@twiddle.net> wrote:
> On 2013-04-15 19:44, Will Newton wrote:
>>
>> On 15 April 2013 18:14, Richard Henderson <rth@twiddle.net> wrote:
>>>
>>> On 2013-04-15 11:56, Will Newton wrote:
>>>>
>>>>
>>>> +# ifdef PIC
>>>> +1:      .long   _GLOBAL_OFFSET_TABLE_ - 0b - PC_OFS
>>>> +.Lmemcpy_neon:
>>>> +        .long   C_SYMBOL_NAME(__memcpy_neon)(GOT)
>>>> +.Lmemcpy_vfp:
>>>> +        .long   C_SYMBOL_NAME(__memcpy_vfp)(GOT)
>>>> +.Lmemcpy_arm:
>>>> +        .long   C_SYMBOL_NAME(__memcpy_arm)(GOT)
>>>
>>>
>>>
>>> There's no need for GOT entries. Just use pc-relative references.
>>
>>
>> Are you suggesting I use GOTOFF here or something else?
>
>
> Declining to look up the real names of the constants, something like
>
>         ldr     r1, .Lmemcpy_arm
>         tst     r0, #VFP
>         ldrne   r1, .Lmemcpy_vfp
>         tst     r0, #NEON
>         ldrne   r1, .Lmemcpy_neon
> 1:      add     r0, r1, pc
>         bx      lr
>
> .Lmemcpy_arm:
>         .long   C_SYMBOL_NAME(__memcpy_arm) - 1b - PC_OFS
> .Lmemcpy_vfp:
>         .long   C_SYMBOL_NAME(__memcpy_vfp) - 1b - PC_OFS
> .Lmemcpy_neon:
>         .long   C_SYMBOL_NAME(__memcpy_neon) - 1b - PC_OFS
>
>
> And I forgot -- this is a bug fix. Using GOT references from IFUNC
> resolvers is not guaranteed to work, due to the one-pass nature of ld.so.
> See comments by Dave Miller in the glibc archive wrt SPARC IFUNC.
>
> And does anyone see a benefit to obfuscating this code with PIC tests just
> to avoid a single ADD insn in the static library?
Thanks Richard, that is a great improvement. I'll post up a v2
rewritten this way.
--
Will Newton
Toolchain Working Group, Linaro
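
The pc-relative scheme Richard sketches above can be checked with a small
model of the address arithmetic. This is only an illustrative sketch, not
code from the patch: the feature bits, the link-time addresses, and the load
bases are all made-up values, and PC_OFS is assumed to be 8 (ARM state, where
reading pc yields the address of the current instruction plus 8). The point
it shows is that each stored literal is a link-time constant, so the resolver
needs no GOT entry and no relocation to recover the run-time address.

```python
# Placeholder feature bits: the thread deliberately leaves out the real
# HWCAP constant names, so these values are purely illustrative.
VFP = 1 << 0
NEON = 1 << 1

# Assumption: ARM state, where reading pc gives the instruction address + 8.
PC_OFS = 8

# Hypothetical link-time addresses, relative to the start of the library.
LABEL_1 = 0x1018  # address of the "1: add r0, r1, pc" instruction
SYMBOLS = {
    "__memcpy_arm": 0x2000,
    "__memcpy_vfp": 0x2400,
    "__memcpy_neon": 0x2800,
}

# What the assembler would store in each .long: symbol - 1b - PC_OFS.
# These are differences of two addresses in the same object, so they do
# not change when the library is loaded at a different base.
LITERALS = {name: addr - LABEL_1 - PC_OFS for name, addr in SYMBOLS.items()}


def resolve(hwcap, load_base):
    """Mimic the resolver: select an offset, then add the run-time pc."""
    r1 = LITERALS["__memcpy_arm"]        # ldr   r1, .Lmemcpy_arm
    if hwcap & VFP:                      # tst   r0, #VFP
        r1 = LITERALS["__memcpy_vfp"]    # ldrne r1, .Lmemcpy_vfp
    if hwcap & NEON:                     # tst   r0, #NEON
        r1 = LITERALS["__memcpy_neon"]   # ldrne r1, .Lmemcpy_neon
    pc = load_base + LABEL_1 + PC_OFS    # value of pc as read at label 1
    return r1 + pc                       # add   r0, r1, pc


if __name__ == "__main__":
    base = 0x40000000  # arbitrary example load address
    for caps in (0, VFP, VFP | NEON):
        print(hex(resolve(caps, base)))
```

Because NEON is tested last, it takes priority whenever both bits are set,
which matches the ordering of the two tst/ldrne pairs in the quoted sketch.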