This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Improve memcpy for Atom
That change was made to improve performance on the NAT test from the
EEMBC 2.0 suite (with this patch it runs 20% faster). If it is better
to use half of the shared cache size, I can change it back. Apart from
that, the patch also changes the implementation for small sizes (movq
and movdqa are used instead of movl). What do you think about those
changes?
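
To make the trade-off concrete, here is a rough C sketch of the size
check under discussion. It is only an illustration, not the code in
memcpy-ssse3.S: the names copy_path, nt_threshold and choose_copy_path
are hypothetical, and it assumes that copies larger than the threshold
take a cache-bypassing (non-temporal) path while smaller ones go
through the cache.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; these are not the
   identifiers used in memcpy-ssse3.S. */
enum copy_path { COPY_CACHED, COPY_NON_TEMPORAL };

/* Size above which a copy is assumed to switch to cache-bypassing
   stores: either the whole shared cache size or half of it. */
static size_t nt_threshold(size_t shared_cache_size, int use_half)
{
    return use_half ? shared_cache_size / 2 : shared_cache_size;
}

/* Pick the copy path for a copy of len bytes (assumed behaviour). */
static enum copy_path choose_copy_path(size_t len,
                                       size_t shared_cache_size,
                                       int use_half)
{
    return len > nt_threshold(shared_cache_size, use_half)
               ? COPY_NON_TEMPORAL
               : COPY_CACHED;
}
```

With an example shared cache size of 512 KB (an assumed value), a
384 KB copy stays on the cached path when the full size is the
threshold, and takes the non-temporal path when half is used. That is
the range where the two settings differ.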
On 19 October 2011 20:08, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Oct 19, 2011 at 6:11 AM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>> Hi,
>>
>> This patch contains one function:
>> __memcpy_ssse3
>>
>> It improves memcpy on small sizes and on sizes between half of shared
>> cache size and shared cache size for Atom (up to 40% performance
>> gain).
>>
>> The patch was tested on Atom.
>>
>> Change Log:
>> 2011-10-11  Michael Zolotukhin  <michael.v.zolotukhin@gmail.com>
>>
>>         * sysdeps/i386/i686/multiarch/memcpy-ssse3.S: Update.
>>         XMM-moves are used for copying on small sizes. Use
>>         SHARED_CACHE_SIZE instead of SHARED_CACHE_SIZE_HALF.
>>
>
> We use SHARED_CACHE_SIZE_HALF on purpose.  memcpy in one
> process may be slower in some cases, but it won't starve other
> processes for cache.
>
> --
> H.J.
>
--
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.