This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.


On 30 August 2013 20:26, Carlos O'Donell <carlos@redhat.com> wrote:
> On 08/30/2013 02:48 PM, Will Newton wrote:
>> On 30 August 2013 18:14, Carlos O'Donell <carlos@redhat.com> wrote:
>>
>> Hi Carlos,
>>
>>>>> A small change to the entry to the aligned copy loop improves
>>>>> performance slightly on A9 and A15 cores for certain copies.
>>>>>
>>>>> ports/ChangeLog.arm:
>>>>>
>>>>> 2013-08-07  Will Newton  <will.newton@linaro.org>
>>>>>
>>>>>         * sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
>>>>>         on entry to aligned copy loop for improved performance.
>>>>> ---
>>>>>  ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S | 4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> Ping?
>>>
>>> How did you test the performance?
>>>
>>> glibc has a performance microbenchmark, did you use that?
>>
>> No, I used the cortex-strings package developed by Linaro for
>> benchmarking various string functions against one another[1].
>>
>> I haven't checked the glibc benchmarks, but I'll look into that. The
>> problem only shows up in quite a specific case, however, so it may not
>> be obvious which one is better.
>
> If it's not obvious, how is someone supposed to review this patch? :-)
>
>> [1] https://launchpad.net/cortex-strings
>
> There are two benchmarks. One appears to be Dhrystone 2.1, which is not a
> string test in and of itself and should not be used for benchmarking or
> changing string functions. The other is called "multi" and appears to run
> some functions in a loop and time them.
>
> e.g.
> http://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/view/head:/benchmarks/multi/harness.c
>
> I would not call `multi' exhaustive. The glibc performance benchmark tests
> are not exhaustive either, but they have received review from the glibc
> community and are our preferred way of demonstrating performance gains when
> posting performance patches.
>
> I would really like to see you post the results of running your new
> implementation with this benchmark, with numbers that show it is
> faster. Is that possible?

The mailing list server does not seem to accept image attachments, so I
have uploaded the performance graph here:

http://people.linaro.org/~will.newton/glibc_memcpy/sizes-memcpy-08-04-2.5.png
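
For anyone unfamiliar with this style of measurement, here is a minimal
sketch of a "run the function in a loop and take the time" benchmark over a
range of copy sizes. It is not the cortex-strings harness or the glibc
benchtests code. The helper names (time_memcpy_ns, print_memcpy_timings),
the power-of-two size sweep and the iteration counts are all assumptions
made for illustration, and the compiler barrier assumes GCC or Clang.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: average nanoseconds per memcpy call of `size` bytes
   over `iters` iterations.  Returns a negative value on bad input or
   allocation failure. */
static double time_memcpy_ns(size_t size, long iters)
{
    struct timespec t0, t1;
    char *src, *dst;
    double ns;
    long i;

    if (iters <= 0)
        return -1.0;
    src = malloc(size ? size : 1);
    dst = malloc(size ? size : 1);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0x5a, size);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iters; i++) {
        memcpy(dst, src, size);
        /* Compiler barrier so the copy is not optimised away
           (GCC/Clang extension, an assumption of this sketch). */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
         + (double)(t1.tv_nsec - t0.tv_nsec);
    free(src);
    free(dst);
    return ns / (double)iters;
}

/* Hypothetical helper: time memcpy for sizes min, 2*min, 4*min, ... up to
   max and print one line per size.  Returns the number of sizes timed. */
static int print_memcpy_timings(size_t min, size_t max, long iters)
{
    int count = 0;
    size_t size;

    if (min == 0)
        return 0;
    for (size = min; size <= max; size *= 2) {
        printf("%zu bytes: %.1f ns/call\n", size, time_memcpy_ns(size, iters));
        count++;
        if (size > max / 2)   /* stop before size *= 2 could overflow */
            break;
    }
    return count;
}
```

Calling print_memcpy_timings(8, 4096, 100000) from main would print one
line per size. A sweep like this only covers one alignment and one buffer
pair per size, so it would need extending before it could expose an
alignment-specific case.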

-- 
Will Newton
Toolchain Working Group, Linaro

