This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- From: Richard Henderson <rth at twiddle dot net>
- To: Will Newton <will dot newton at linaro dot org>
- Cc: libc-ports at sourceware dot org, patches at linaro dot org
- Date: Wed, 17 Apr 2013 17:40:46 +0200
- Subject: Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- References: <516D18F0 dot 4060009 at linaro dot org>
On 2013-04-16 11:25, Will Newton wrote:
ports/sysdeps/arm/armv7/multiarch/Makefile | 3 +
Does this really require v7? From a brief read I didn't see anything in the
_arm version that needs more than v5te (ldrd and pld). Any reason not to put
this into armv6 instead?
+ENTRY(memcpy)
+ .type memcpy, %gnu_indirect_function
+ ldr r1, .Lmemcpy_arm
+ tst r0, #HWCAP_ARM_NEON
+ it ne
+ ldrne r1, .Lmemcpy_neon
+ bne 1f
Swap vfp and neon tests and you don't need the branch.
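To illustrate the reordering, here is a rough sketch rather than the actual patch. It assumes the resolver gets hwcap in r0, as the quoted code does. The names HWCAP_ARM_VFP, .Lmemcpy_vfp and the __memcpy_* symbols are assumptions, and the literal words hold absolute addresses to keep the example short. Testing VFP first lets the later NEON test simply overwrite r1, so no branch is needed:

```asm
	.syntax unified
	.text
	/* Bit values as in Linux's HWCAP_VFP / HWCAP_NEON; macro names assumed. */
	.equ	HWCAP_ARM_VFP, 64
	.equ	HWCAP_ARM_NEON, 4096

	.globl	memcpy
	.type	memcpy, %gnu_indirect_function
memcpy:
	ldr	r1, .Lmemcpy_arm	/* Default: plain ARM version. */
	tst	r0, #HWCAP_ARM_VFP
	it	ne
	ldrne	r1, .Lmemcpy_vfp	/* VFP present: pick the VFP version. */
	tst	r0, #HWCAP_ARM_NEON
	it	ne
	ldrne	r1, .Lmemcpy_neon	/* NEON present: overrides the VFP choice. */
	mov	r0, r1			/* Return the selected implementation. */
	bx	lr

	.align	2
	/* Hypothetical symbol names; absolute addresses for simplicity. */
.Lmemcpy_arm:
	.word	__memcpy_arm
.Lmemcpy_vfp:
	.word	__memcpy_vfp
.Lmemcpy_neon:
	.word	__memcpy_neon
```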
+.Lreturn:
Unused label?
+ ldr tmp1, [src, #-60] /* 15 words to go. */
+ str tmp1, [dst, #-60]
These negative offsets mean thumb2 doesn't work. That's fine, but it means
that you need to take care of this in the _arm case.
You have two choices: either do the swapping to arm mode by hand in the impl
file, or force the entire memcpy.o to arm mode by using #define NO_THUMB at the
top, before the #include <sysdep.h>.
If you choose the latter, then you don't have to worry about thumb2's
restriction on rd=rn when rm=pc, and can avoid the extra move, along with the
then-unnecessary it markup.
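For the second option, the top of the implementation file would look roughly like this (a fragment only; it builds only inside the glibc tree, where sysdep.h is available):

```asm
/* Force the whole object to be assembled in ARM mode.
   Must come before sysdep.h is included.  */
#define NO_THUMB
#include <sysdep.h>
```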
r~