This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [ARM] Optimised strchr and strlen


On 12/21/2011 02:55 AM, David Gilbert wrote:
> That 'simple' one is showing the benefit at the short lengths,
> the 'smarter' one I have is doing 8 bytes/loop and is nice on the long
> strings - but as you can see worse at the short ones.

Having not seen your "smarter" strchr, it's hard to suggest anything
concrete.  I'd have thought that there's enough slack in load delay
that one or two arithmetic operations could be done without penalty...

Something like performing a simple byte-compare loop that runs up to "alignment plus", i.e. the 8-byte boundary 32 bytes past the rounded-down start address:

...
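	bic	r3, r0, #7		@ r3 = r0 rounded down to 8 bytes
	and	r1, r1, #255		@ r1 = search character as a byte
	adds	r3, r3, #32		@ r3 = aligned address at which to stop
1:
	ldrb	r2, [r0],#1		@ load one byte, post-increment r0
	cmp	r2, r1			@ match?
	cbz	r2, .Lfound_zero	@ NUL terminator; cbz leaves flags alone
	it	ne
	cmpne	r0, r3			@ no match: reached the stop address?
	bne	1b			@ no match and not there yet: loop
	cmp	r2, r1			@ loop exit: was it a match or alignment?
	beq	.Lfound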
	@ Here, r0 is aligned.  Do something word-based.
...

or even just

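	and	r3, r0, #7		@ r3 = misalignment within 8 bytes
	and	r1, r1, #255		@ r1 = search character as a byte
	rsb	r3, r3, #32		@ r3 = bytes to check until r0 is aligned
1:
	ldrb	r2, [r0],#1		@ load one byte, post-increment r0
	cmp	r2, r1
	beq	.Lfound			@ match (also catches a search for NUL)
	subs	r3, r3, #1		@ count down; sets Z when done
	cbz	r2, .Lfound_zero	@ NUL terminator; cbz leaves flags alone
	bne	1b			@ flags are still those from subs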
	@ Here, r0 is aligned.  Do something word-based.


r~
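
For readers who find C easier to follow than Thumb-2, the two prologues above can be modelled roughly as below. This is only a sketch under stated assumptions, not code from the thread: the function names are hypothetical, .Lfound_zero is assumed to return the terminator's address when the search character is itself NUL, pointers are assumed to convert to flat integer addresses, and the unspecified word-based loop is replaced by a plain byte loop so that the example stays well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the word-based main loop, which the message
   leaves unspecified.  A plain byte loop keeps the sketch well-defined. */
static char *aligned_tail_sketch(const char *p, unsigned char c)
{
    for (;; p++) {
        if ((unsigned char)*p == c)
            return (char *)p;
        if (*p == '\0')
            return NULL;
    }
}

/* Variant 1: compare the pointer against a precomputed stop address. */
char *strchr_stop_address_sketch(const char *s, int ch)
{
    unsigned char c = (unsigned char)ch;                   /* and r1, r1, #255 */
    const char *p = s;
    uintptr_t stop = ((uintptr_t)s & ~(uintptr_t)7) + 32;  /* bic, adds */

    do {
        unsigned char b = (unsigned char)*p++;             /* ldrb r2, [r0],#1 */
        if (b == '\0')                                     /* cbz .Lfound_zero */
            return c == '\0' ? (char *)(p - 1) : NULL;     /* assumed behaviour */
        if (b == c)                                        /* beq .Lfound */
            return (char *)(p - 1);
    } while ((uintptr_t)p != stop);                        /* cmpne r0, r3 */

    assert(((uintptr_t)p & 7) == 0);   /* "Here, r0 is aligned." */
    return aligned_tail_sketch(p, c);
}

/* Variant 2: count down the number of bytes left before alignment. */
char *strchr_countdown_sketch(const char *s, int ch)
{
    unsigned char c = (unsigned char)ch;                   /* and r1, r1, #255 */
    const char *p = s;
    unsigned n = 32u - (unsigned)((uintptr_t)s & 7);       /* and, rsb */

    do {
        unsigned char b = (unsigned char)*p++;             /* ldrb r2, [r0],#1 */
        if (b == c)                                        /* beq .Lfound */
            return (char *)(p - 1);
        if (b == '\0')                                     /* cbz .Lfound_zero */
            return NULL;
    } while (--n != 0);                                    /* subs, bne */

    assert(((uintptr_t)p & 7) == 0);   /* "Here, r0 is aligned." */
    return aligned_tail_sketch(p, c);
}
```

Both functions examine between 25 and 32 bytes one at a time before handing over, depending on the starting misalignment. The second variant needs no final re-compare after the loop, because a match leaves the loop directly.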

