This is the mail archive of the newlib@sourceware.cygnus.com mailing list for the newlib project.
I thought I would pass this on. Does the new version of memcpy do much better than this?

---------- Forwarded message ----------
Date: Tue, 9 Dec 97 12:03:28 -0600
From: Eric Norum <eric@skatter.USask.Ca>
To: rtems-list@oarcorp.com
Subject: Re: memcpy performance

It's even worse than just a byte-by-byte copy!

On the 971024 snapshot (gen68360 BSP) a call to memcpy produces:
  1) A call to bcopy
  2) The bcopy routine links a stack frame and calls memmove
  3) The memmove routine:
       a) links a stack frame
       b) checks for overlap
       c) does a byte-by-byte copy
          5 instructions/byte on a CPU32 processor!

There's a heck of a lot of unnecessary code here:
    Two extra function calls
    Two extra stack frames
    Extra code to check for overlap
    A very inefficient loop

Processor-independent improvements required:
  1) There should be an explicit memcpy routine.
  2) The library should be compiled with aggressive optimization.

Processor-dependent improvements that would be nice:
  M68k - The loop in memmove should be done in such a way that
         processors like the CPU32 can go into loop mode.

Now all we need is a willing volunteer......

---
Eric Norum                                 eric@skatter.usask.ca
Saskatchewan Accelerator Laboratory        Phone: (306) 966-6308
University of Saskatchewan                 FAX:   (306) 966-6058
Saskatoon, Canada.
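
The first processor-independent improvement in the forwarded message, an explicit memcpy routine, might look roughly like the sketch below. It is only an illustration under stated assumptions, not the newlib implementation: the name sketch_memcpy is hypothetical (a library would export the routine as memcpy), and whether a given m68k compiler turns the loop into a form the CPU32 can run in loop mode depends on the compiler and its optimization flags.

```c
#include <stddef.h>

/*
 * Hypothetical stand-alone copy routine (illustrative name).
 * It does the copy directly: no call through bcopy or memmove,
 * and no overlap check, since memcpy leaves overlapping copies undefined.
 */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Tight count-down loop: one load, one store, one counter test per byte. */
    while (n-- != 0)
        *d++ = *s++;

    return dst;
}
```

This removes the two extra function calls, the two extra stack frames, and the overlap test described above. Copying a word at a time when both pointers are suitably aligned would be a further step, left out here to keep the sketch short.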