This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.
Re: [PATCHv2] ARM: NEON optimized implementation of memcpy.
On Wed, Jul 15, 2009 at 05:08:21PM +0300, Siarhei Siamashka wrote:
> The memcpy implementation from that package is done in C, probably with the
> hope that the compiler can generate some good code for it. I highly doubt that
> this is going to happen any time soon, so normal assembly code will always be
> better.
I'd rather have numbers than generalizations; the code generated is
not too bad, and having the compiler able to schedule for each
specific processor is a lot more maintainable.
> It's good to know, just because the way they are now, performance would
> only be lost. Is there anything else that may be using these __aeabi_memcpy*
> functions at the moment?
Third-party compilers (like RealView).
> There must be some reason why these __aeabi_memcpy* functions exist in the
> first place. Probably somebody thought that handling very small copies is
> performance critical. Don't know if this is actually justified in practice.
I think this and your later comments misunderstood what I was talking
about. __aeabi_memcpy* are supposed to be optimized for large copies;
that's in the ABI documentation. The expectation is that small copies
will be inlined at the call site. Thus, having it handle small copies
efficiently is not worth even a few cycles (as long as it's correct).
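
To make the distinction concrete, here is a minimal sketch in C. It assumes an
optimizing compiler that expands small constant-size copies inline, which is
typical but not guaranteed. The names `struct point`, `copy_point` and
`copy_buffer` are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical small record, for illustration only. */
struct point {
    int x;
    int y;
};

/* Small copy with a compile-time-constant size. An optimizing compiler
   will typically expand this into a few loads and stores at the call
   site, so no out-of-line copy routine is entered (assumption: this
   depends on the compiler and the optimization level). */
static void copy_point(struct point *dst, const struct point *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Large copy with a size known only at run time. This is the case an
   out-of-line routine (memcpy, or the __aeabi_memcpy* helpers on ARM
   EABI targets) is expected to handle, and where tuning pays off. */
static void copy_buffer(unsigned char *dst, const unsigned char *src,
                        size_t n)
{
    memcpy(dst, src, n);
}
```

Whether the first call is really inlined can be checked by reading the
generated assembly (for example with the compiler's -S option) rather than by
assuming it.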
--
Daniel Jacobowitz
CodeSourcery