This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [patch] libc/machine/m68k: Incorporate memcpy and memset.


On Thu, Apr 26, 2007 at 09:48:09PM -0700, Kazu Hirata wrote:
> Index: newlib/libc/machine/m68k/memcpy.S
> ===================================================================
> RCS file: newlib/libc/machine/m68k/memcpy.S
> diff -N newlib/libc/machine/m68k/memcpy.S
> [...]
> +1:
> +	move.l	(%a1)+,(%a0)+
> +	move.l	(%a1)+,(%a0)+
> +.Lcopy8:
> +	move.l	(%a1)+,(%a0)+
> +	move.l	(%a1)+,(%a0)+
> +.Lcopy:
> +#if !defined (__mcoldfire__)
> +	dbra	%d0,1b
> +#else
> +	subq.l	#1,%d0
> +	bpl	1b
> +#endif

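as already pointed out by Eric, dbcc is limited in range.  it is my
understanding that dbcc decrements only the lower 16 bits of the count
register and falls through when that word reaches -1, which would
effectively limit a single dbra loop to a 16-bit count (65536 iterations
at most), no matter what the upper word of the register holds.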
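
to make the range limit concrete, here is a small C model of the dbra
behaviour described above.  it is only a sketch: the function names
(dbra_step, dbra_iterations, dbra_iterations_32) are made up for
illustration, and the "clr.w / subq.l #1 / bcc" outer loop it models is
a commonly seen m68k idiom for 32-bit counts, not something taken from
the patch.

```c
#include <stdint.h>

/* Model one DBRA step: only the low 16 bits of the register are
   decremented.  Returns 1 if the branch is taken, i.e. the low word
   is not 0xFFFF (-1) after the decrement. */
static int dbra_step(uint32_t *d0)
{
    uint16_t low = (uint16_t)(*d0 & 0xFFFFu);
    low = (uint16_t)(low - 1u);
    *d0 = (*d0 & 0xFFFF0000u) | low;   /* upper word is untouched */
    return low != 0xFFFFu;
}

/* Number of times the body of "1: <body>; dbra %d0,1b" runs for a
   given initial value of d0. */
uint32_t dbra_iterations(uint32_t d0)
{
    uint32_t n = 0;
    do {
        n++;                           /* loop body executes once */
    } while (dbra_step(&d0));
    return n;
}

/* Same inner loop, wrapped in an outer loop that models
   "clr.w %d0; subq.l #1,%d0; bcc 1b" (assumed idiom) so that the
   upper word of d0 is honoured as well. */
uint64_t dbra_iterations_32(uint32_t d0)
{
    uint64_t n = 0;
    for (;;) {
        do {
            n++;
        } while (dbra_step(&d0));
        d0 &= 0xFFFF0000u;             /* clr.w %d0 */
        if (d0 == 0)
            break;                     /* subq.l would borrow: carry set, bcc not taken */
        d0 -= 1;                       /* subq.l #1,%d0 */
    }
    return n;
}
```

with d0 = 0x00010003, dbra_iterations() returns 4, because the upper
word is simply ignored, while dbra_iterations_32() returns 65540, i.e.
the full 32-bit count plus one.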

also, unrolling the loop as shown above isn't the fastest block-copy
method for CPU32.  the following loop:

1:
	move.l	(%a1)+,(%a0)+
	dbra	%d0, 1b

qualifies for the CPU32 loop mode, so once the loop is running it does
not need to do any instruction fetches, and as such runs faster than a
manually unrolled loop.  I assume Fido, being a CPU32-derivative, would
also benefit from less loop unrolling.

-- 
  Aaron J. Grier  |   Frye Electronics, Tigard, OR   |  aaron@frye.com

