This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.


Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.


On 18 April 2013 10:39, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Mon, Apr 15, 2013 at 11:38:49AM +0100, Will Newton wrote:
>> On 15 April 2013 11:06, Måns Rullgård <mans@mansr.com> wrote:
>>
>> Hi Måns,
>>
>> >> Add a high performance memcpy routine optimized for Cortex-A15 with
>> >> variants for use in the presence of NEON and VFP hardware, selected
>> >> at runtime using indirect function support.
>> >
>> > How does this perform on Cortex-A9?
>>
>> The code is also faster on A9 although the gains are not quite as
>> pronounced. A set of numbers is attached (they linewrap pretty
>> horribly inline).
>>
>>
> I forgot to ask where to get the benchmark source. Without it there is no
> way to tell if it was done correctly.
> You must randomly vary sizes in the range n..2n and also vary alignments.

The benchmark is taken from the cortex-strings package:

https://launchpad.net/cortex-strings

I wrote a wrapper around the benchmark to vary alignment in {1, 2, 4,
8} and a variety of block lengths between 8 and 200.
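
For readers who want to see the shape of such a sweep, here is a minimal
sketch in C. It is not the cortex-strings benchmark and not the wrapper
itself. It times whatever memcpy the C library provides, and several
details are assumptions made for illustration: the function names
bench_memcpy and run_sweep, the specific length list (the message only
says "between 8 and 200"), applying the same offset to source and
destination, and the use of clock() as the timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

enum { BASE_ALIGN = 64, MAX_LEN = 200 };

/* Call memcpy through a volatile pointer so the compiler cannot
   inline or discard the copies being timed. */
static void *(*volatile copy_fn)(void *, const void *, size_t) = memcpy;

/* Round p up to the next BASE_ALIGN boundary. */
static unsigned char *align_up(unsigned char *p)
{
    uintptr_t u = (uintptr_t)p;
    return p + ((BASE_ALIGN - u % BASE_ALIGN) % BASE_ALIGN);
}

/* Time `iters` copies of `len` bytes with source and destination both
   placed `align` bytes past a 64-byte boundary (an assumption; the
   original wrapper may offset them differently). Returns CPU seconds,
   or -1.0 on bad arguments, allocation failure, or a bad copy. */
double bench_memcpy(size_t align, size_t len, long iters)
{
    if (align >= BASE_ALIGN || len > MAX_LEN || iters <= 0)
        return -1.0;

    /* Room for up to 63 bytes of rounding, the offset, and the data. */
    size_t span = BASE_ALIGN + align + len;
    unsigned char *raw_src = malloc(span);
    unsigned char *raw_dst = malloc(span);
    if (!raw_src || !raw_dst) {
        free(raw_src);
        free(raw_dst);
        return -1.0;
    }

    unsigned char *src = align_up(raw_src) + align;
    unsigned char *dst = align_up(raw_dst) + align;
    assert((uintptr_t)src % BASE_ALIGN == align);
    assert((uintptr_t)dst % BASE_ALIGN == align);

    memset(src, 0xA5, len);
    memset(dst, 0, len);

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        copy_fn(dst, src, len);
    clock_t end = clock();

    int ok = memcmp(dst, src, len) == 0;
    free(raw_src);
    free(raw_dst);
    return ok ? (double)(end - start) / CLOCKS_PER_SEC : -1.0;
}

/* Sweep alignments {1, 2, 4, 8} against a hypothetical set of lengths
   in 8..200. Returns the number of configurations measured. */
int run_sweep(long iters)
{
    static const size_t aligns[] = { 1, 2, 4, 8 };
    static const size_t lens[] = { 8, 16, 32, 64, 128, 200 };
    int measured = 0;

    for (size_t a = 0; a < sizeof aligns / sizeof aligns[0]; a++) {
        for (size_t l = 0; l < sizeof lens / sizeof lens[0]; l++) {
            double secs = bench_memcpy(aligns[a], lens[l], iters);
            if (secs < 0.0)
                continue;
            printf("align=%zu len=%3zu %8.2f ns/copy\n",
                   aligns[a], lens[l], secs * 1e9 / (double)iters);
            measured++;
        }
    }
    return measured;
}
```

Calling run_sweep(1000000) from main prints one line per
alignment/length pair. clock() has coarse resolution, so small
iteration counts will mostly report noise.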

--
Will Newton
Toolchain Working Group, Linaro

