This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: overhead of bfd_{get,put}*()


On Tue, 23 Feb 2010, H.J. Lu wrote:

> On Tue, Feb 23, 2010 at 8:00 AM, Ian Lance Taylor <iant@google.com> wrote:
> > David Miller <davem@davemloft.net> writes:
> >
> >> The top offenders were, surprisingly for me, bfd_getb64(),
> >> bfd_putb64() and bfd_getb_signed_64().  And it's not because they
> >> touch memory, it's the byte loads and shift/or dance they do.
> >
> > This is exactly why gold does its complex template dance: to avoid
> > that overhead.
> >
> > Those functions will hit unaligned data in some cases.
> >
> > Ian
> >
> 
> We can define host specific bfd_{get,put}*() to optimize for
> hosts which don't require strict alignment for integers.

I think the GCC position is that accessing memory directly through an 
unaligned pointer is undefined behavior.  GCC doesn't guarantee to use the 
instruction you expect and believe to be safe with insufficient alignment; 
it might use a vector load or store instead.

What you can do is something along the lines of: if GCC, load using a 
packed structure type; leave it to the GCC target to know what 
instructions are OK for such a load.  (As a further optimization, first 
check for alignment using __builtin_expect to mark alignment as very 
likely, and use a non-packed load if aligned.)  If GCC 4.3 or later, then 
use __builtin_bswap32 or __builtin_bswap64 for byte-swapping if the host 
endianness is wrong; with older GCC, use a C byte-swap instead.  If not GCC, 
use the existing code.  It might also be reasonable 
for these optimized versions to be inline functions.  Note that all this 
only requires BFD to know host endianness, not strict alignment 
properties.
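
As a rough sketch of the above, and not actual BFD code, a big-endian 
64-bit load might look something like the following.  The names 
portable_getb64, fast_getb64 and struct u64_packed are made up for 
illustration.  Two things here are assumptions beyond what is described 
above: the host endianness test uses GCC's __BYTE_ORDER__ predefine, which 
only appeared in releases later than 4.3 (BFD would more likely use a 
configure-time check), and may_alias is added so the load through the 
struct pointer doesn't fall foul of the aliasing rules.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is not existing BFD code.  */

/* Portable fallback: the byte loads and shift/or sequence.  */
static inline uint64_t
portable_getb64 (const void *p)
{
  const unsigned char *b = (const unsigned char *) p;
  uint64_t v = 0;
  int i;

  for (i = 0; i < 8; i++)
    v = (v << 8) | b[i];
  return v;
}

#if defined (__GNUC__) && defined (__BYTE_ORDER__) \
    && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \
        || __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)

/* Packed wrapper: the field has alignment 1, so GCC chooses
   instructions that are valid for an unaligned address.  */
struct u64_packed { uint64_t v; } __attribute__ ((packed, may_alias));

static inline uint64_t
fast_getb64 (const void *p)
{
  uint64_t v = ((const struct u64_packed *) p)->v;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  /* Host order differs from the big-endian data: swap.  */
  v = __builtin_bswap64 (v);
#endif
  return v;
}

#else
/* Not GCC, or host endianness unknown: keep the existing code.  */
#define fast_getb64 portable_getb64
#endif
```

The put side would mirror this: byte-swap first if needed, then store 
through the packed type.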

GCC compiles

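#include <stdint.h>

struct s { int a; } __attribute__((packed));
int
load (void *p)
{
  /* Aligned case marked as very likely.  */
  if (__builtin_expect (((uintptr_t) p & 3) == 0, 1))
    return *(int *)p;
  else
    /* Packed access; GCC picks instructions valid for alignment 1.  */
    return (*(struct s *)p).a;
}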

to code without a conditional on x86, so having both aligned and unaligned 
paths shouldn't pessimize things there.

-- 
Joseph S. Myers
joseph@codesourcery.com
