This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Expand mempcpy into memcpy


On Wed, Feb 06, 2013 at 12:00:15PM -0800, H.J. Lu wrote:
> On Wed, Feb 6, 2013 at 11:43 AM, David Miller <davem@davemloft.net> wrote:
> > From: Ondřej Bílka <neleai@seznam.cz>
> > Date: Wed, 6 Feb 2013 20:25:24 +0100
> >
> >> On Wed, Feb 06, 2013 at 01:40:00PM -0500, David Miller wrote:
> >>> From: Ondřej Bílka <neleai@seznam.cz>
> >>> Date: Wed, 6 Feb 2013 18:40:21 +0100
> >>>
> >>> > This makes every mempcpy call go to memcpy instead.
> >>> >
> >>> > There are two reasons. One is that it will simplify maintenance.
> >>> > Second is that possible performance gains are saving one addition and
> >>> > perhaps spilling one register. The problem is that the implementation is
> >>> > quite big: 1105 bytes on x64. The cost of instruction/branch cache
> >>> > misses probably outweighs the speedup we gained.
> >>> >
> >>> > I will move unused inline functions later.
> >>>
> >>> The overhead of mempcpy vs. memcpy on sparc is exactly 2 instructions
> >>> and 1 extra cycle.
> >>>
> >>> I think you are making an extremely cpu specific decision on how this
> >>> macro behaves.
> >>
> >> I wrote this unclearly. The comment
> >> "that possible performance gains are saving one addition and perhaps spilling one register."
> >> referred to the old macro.
> >>
> >> The downside of the old macro is that memcpy and mempcpy together occupy
> >> 2000 bytes of instruction cache.
> >>
> >> With my change, only memcpy's 1000 bytes will be occupied, which means
> >> fewer cache misses.
> >
> > On sparc there is no such extra icache space.
> >
> > mempcpy is simply a 2-instruction stub which sets up a return value
> > register and branches into memcpy.
> >
> > If x86 implements this by having yet another entire copy of the memcpy
> > core for the sake of mempcpy, that's frankly an implementation mis-feature.
> 
git grep mempcpy also finds a powerpc implementation with the same problem.

Also, most ports have an optimized memcpy, but I did not find any mempcpy
implementation there.

Perhaps the best way is to use my header as the default, so that architectures
with an optimized mempcpy need to enable it with _HAVE_STRING_ARCH_mempcpy.
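
To make the idea concrete, here is a rough sketch of what such a default could
look like. It is not the actual patch: sketch_mempcpy and
SKETCH_HAVE_ARCH_MEMPCPY are hypothetical names that stand in for mempcpy and
_HAVE_STRING_ARCH_mempcpy, chosen so that the example compiles on its own
without clashing with anything the system headers define.

```c
#include <stddef.h>
#include <string.h>

/* SKETCH_HAVE_ARCH_MEMPCPY is a hypothetical stand-in for the
   _HAVE_STRING_ARCH_mempcpy guard named in the thread.  An architecture
   with its own optimized routine would define it and supply
   sketch_mempcpy itself.  */
#ifndef SKETCH_HAVE_ARCH_MEMPCPY

/* Generic default: do the copy with memcpy, then return a pointer to the
   byte just past the last one written.  The only work beyond memcpy is
   the single addition discussed above.  */
static inline void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

#endif /* !SKETCH_HAVE_ARCH_MEMPCPY */
```

With something of this shape, only the memcpy body is pulled into the
instruction cache on ports that do not opt in, and chained appends such as
p = sketch_mempcpy (p, a, alen); p = sketch_mempcpy (p, b, blen); still work
as before.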

