This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.


On 2013-04-16 11:25, Will Newton wrote:
  ports/sysdeps/arm/armv7/multiarch/Makefile         |   3 +

Does this really require v7? From a brief read, I didn't see anything in the _arm version that hasn't been available since v5te (ldrd and pld). Any reason not to put this into armv6 instead?

+ENTRY(memcpy)
+	.type	memcpy, %gnu_indirect_function
+	ldr	r1, .Lmemcpy_arm
+	tst	r0, #HWCAP_ARM_NEON
+	it	ne
+	ldrne	r1, .Lmemcpy_neon
+	bne	1f

Swap vfp and neon tests and you don't need the branch.
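
To illustrate the ordering, here is a rough C sketch of the selection logic, not the actual resolver. The enum, the function name and the SKETCH_HWCAP_* macros are made up for the example; the bit positions are assumed to match the Linux ARM hwcap bits for VFP and NEON. Testing the lower-priority feature first lets the later NEON test simply overwrite the result, so no early exit is needed:

```c
/* Hypothetical names for illustration; not taken from the patch.  */
enum memcpy_impl { IMPL_ARM, IMPL_VFP, IMPL_NEON };

/* Assumed bit positions for the VFP and NEON hwcap flags.  */
#define SKETCH_HWCAP_VFP  (1UL << 6)
#define SKETCH_HWCAP_NEON (1UL << 12)

static enum memcpy_impl
select_memcpy (unsigned long hwcap)
{
  enum memcpy_impl impl = IMPL_ARM;	/* Default choice.  */

  /* Lower-priority feature first...  */
  if (hwcap & SKETCH_HWCAP_VFP)
    impl = IMPL_VFP;

  /* ...then the preferred one, which overrides it.  No branch
     around the VFP test is required.  */
  if (hwcap & SKETCH_HWCAP_NEON)
    impl = IMPL_NEON;

  return impl;
}
```

In the assembly this would correspond to two back-to-back tst plus conditional ldr pairs, VFP first and NEON second.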

+.Lreturn:

Unused label?

+	ldr	tmp1, [src, #-60]	/* 15 words to go.  */
+	str	tmp1, [dst, #-60]

These negative offsets mean thumb2 doesn't work. That's fine, but it means that you need to take care of this in the _arm case.

You have two choices: either do the swapping to arm mode by hand in the impl file, or force the entire memcpy.o to arm mode by using #define NO_THUMB at the top, before the #include <sysdep.h>.

If you choose the latter, then you don't have to worry about thumb2's restriction on rd=rn when rm=pc, and can avoid the extra move, along with the then-unnecessary it markup.


r~

