This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] more problems with newlib/libc/machine/m68k/memcpy.S


On 2/9/10 5:15 PM, Josef Wolf wrote:
Hello folks,

I have done a closer look at newlib/libc/machine/m68k/memcpy.S and I think
there are a couple of problems with this code:

1. When the destination pointer is mis-aligned, up to three bytes are copied
    before getting into the optimized main loop. But d1, which contains the
    number of bytes to copy, is never adjusted to account for this fact. As a
    consequence, up to 3 byte too much can be copied by memcpy().

When destination is mis-aligned, D1 is properly adjusted here:


	and.l	#3,d0		| look for the lower two only
	beq	2f		| is aligned?
	sub.l	d0,d1

Or am I missing something?


2. When MISALIGNED_OK==0, no attempt is made to get the pointers to aligned positions. This results in much slower code than the old code for such CPUs. IMHO, the cause of the problem reported in the thread starting with http://sourceware.org/ml/newlib/2009/msg01079.html was mis-interpreted. The problem was _not_ the attempt to align dest. The problem was that even after dest was aligned, src was still not aligned. Thus, disabling alignment completely is the wrong response to that problem.

The problem was not misinterpreted; and it is true that memcpy can be optimized for CPUs without alignment module by using 2-byte copies. It is not straightforward to implement a single memcpy routine that will be optimal for both CPUs that support unaligned accesses and those which don't -- that is why I posted a simple byte-copy fix.


It seems that your patch lengthens the critical path for CPUs which support unaligned accesses.


3. -mcpu32 seems to imply -mc68020. So the check for alignment capabilities gives a wrong result for cpu32. BTW: alignment capabilities depend not only on the CPU. It is also dependant on bus width and how the memory is connected.

Thank you for pointing this out.


Regards,

--
Maxim Kuvyrkov
CodeSourcery
maxim@codesourcery.com
(650) 331-3385 x724


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]