This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.

Re: [patch] libc/machine/m68k: Incorporate memcpy and memset.

From: "Aaron J. Grier" <aaron at frye dot com>
To: newlib at sources dot redhat dot com
Date: Tue, 1 May 2007 11:02:59 -0700
Subject: Re: [patch] libc/machine/m68k: Incorporate memcpy and memset.
References: <200704270448.l3R4m9k0004721@sparrowhawk.codesourcery.com>

On Thu, Apr 26, 2007 at 09:48:09PM -0700, Kazu Hirata wrote:
> Index: newlib/libc/machine/m68k/memcpy.S
> ===================================================================
> RCS file: newlib/libc/machine/m68k/memcpy.S
> diff -N newlib/libc/machine/m68k/memcpy.S
> [...]
> +1:
> +	move.l	(%a1)+,(%a0)+
> +	move.l	(%a1)+,(%a0)+
> +.Lcopy8:
> +	move.l	(%a1)+,(%a0)+
> +	move.l	(%a1)+,(%a0)+
> +.Lcopy:
> +#if !defined (__mcoldfire__)
> +	dbra	%d0,1b
> +#else
> +	subq.l	#1,%d0
> +	bpl	1b
> +#endif

as already pointed out by Eric, dbxx is limited in range.  it is my
understanding that dbxx decrements only the lower 16 bits of the count
register and falls through when that word reaches -1, which would
effectively limit the count to 65535 (65536 passes through the loop).
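
to make the range limit concrete, here is a rough C model of the
dbra behaviour described above.  it is only a sketch: the names
dbra_taken and loop_iterations are made up for illustration and are
not part of the patch or of newlib, and the model only assumes the
documented DBcc rule (decrement the low word of the register, branch
unless that word becomes -1).  it says nothing about how the patch
sets up %d0 before entering the loop.

```c
#include <stdint.h>

/* Hypothetical model of m68k "dbra %d0,label": only the low 16 bits
   of d0 are decremented; the branch is taken unless that low word
   becomes -1 (0xFFFF).  The upper word is left untouched. */
int dbra_taken(uint32_t *d0)
{
    uint16_t low = (uint16_t)(*d0 & 0xFFFFu);
    low = (uint16_t)(low - 1u);
    *d0 = (*d0 & 0xFFFF0000u) | low;
    return low != 0xFFFFu;
}

/* Hypothetical helper: count how many times the body of
   "1: <body> ; dbra %d0,1b" runs for a given initial d0. */
unsigned long loop_iterations(uint32_t d0)
{
    unsigned long n = 0;
    do {
        n++;                    /* one pass through the loop body */
    } while (dbra_taken(&d0));
    return n;
}
```

under this model, loop_iterations(0x0000FFFF) gives 65536, and
loop_iterations(0x00010003) gives 4, the same as for 0x00000003,
because the upper word of %d0 never takes part in the test.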

also, unrolling the loop as shown above isn't the fastest block-copy
method for CPU32.  the following loop:

1:
	move.l	(%a1)+,(%a0)+
	dbra	%d0, 1b

does not need to do any instruction fetches on CPU32 once the processor
is in loop mode, and as such runs faster than a manually unrolled loop.
I assume Fido, being a CPU32 derivative, would also benefit from less
loop unrolling.

-- 
  Aaron J. Grier  |   Frye Electronics, Tigard, OR   |  aaron@frye.com

Follow-Ups:
- Re: [patch] libc/machine/m68k: Incorporate memcpy and memset.
  - From: Aaron J. Grier

References:
- [patch] libc/machine/m68k: Incorporate memcpy and memset.
  - From: Kazu Hirata
