This is the mail archive of the libc-hacker@sourceware.cygnus.com mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.



Symbol binding in shared libraries...


Hi guys,

The ARM port has a problem with PC24 relocs appearing in the .text segment of a
shared library.  These are PC-relative branches, where the offset is a signed
integer which must be representable in 26 bits (a reach of roughly +/-32 MB).
This means that a branch cannot reach everywhere in the address space, so when
fixing up the relocs in ld-linux.so I need to do a range check.

I sent a patch for this problem a few weeks ago, and since then Phil Blundell
discovered a bug in my implementation.  I've been investigating this in much
greater detail and have run into something unusual that I'd like an opinion on.

I've been testing with the following code from elf_machine_rel in dl-machine.h.
It is used to fix up the relocs in the .text segment.  A small program seems to
work fine with this code, but larger programs like Mozilla and ORBit fail with
odd range check errors.

case R_ARM_PC24:
{
  signed int addend;
  signed int offset;

  addend = *reloc_addr & 0x00ffffff;
  if (addend & 0x00800000) addend |= 0xff000000;

  /* Calculate a new 32 bit offset. */
  offset = value - (unsigned int)reloc_addr + (addend << 2);
             
  /* Range check the offset.  The offset is a 26 bit signed
  integer represented in two's complement notation.  Thus the
  offset can be in the range [-2^25 , 2^25 - 1].  The offset is
  stored in 24 bits, the bottom two bits are assumed to be zero.
  This means the offset must be a multiple of 4. 
  Note: 2^25 = 33554432.  */
             
  if ( (offset > 33554431)     /* offset > 2^25 - 1 */
    || (offset < -33554432)    /* offset < -2^25 */
    || (offset & 0x00000003) )  /* offset not a multiple of 4 */
  {
    char buf[] = "R_ARM_PC24 relocation out of range (0x00000000).";
    char *bp;

    char bvalue[] = "value = 0x00000000";
    char baddr[] = "reloc_addr = 0x00000000";
    char breloc[] = "*reloc_addr = 0x00000000";
    char baddend[] = "addend = 0x00000000";
                
    if (NULL != sym) {
       bp = ((char*)map->l_info[DT_STRTAB]->d_un.d_ptr) + sym->st_name;
       _dl_sysdep_message (bp, "\n", NULL);
    }       
    bp = _itoa_word (value, &bvalue[sizeof bvalue - 1], 16, 0);
    _dl_sysdep_message (bvalue, "\n", NULL);
    bp = _itoa_word ((unsigned long) reloc_addr, &baddr[sizeof baddr - 1], 16, 0);
    _dl_sysdep_message (baddr, "\n", NULL);
    bp = _itoa_word (*reloc_addr, &breloc[sizeof breloc - 1], 16, 0);
    _dl_sysdep_message (breloc, "\n", NULL);
    bp = _itoa_word (addend, &baddend[sizeof baddend - 1], 16, 0);
    _dl_sysdep_message (baddend, "\n", NULL);
               
    bp = _itoa_word (offset, &buf[sizeof buf - 3], 16, 0);
    _dl_signal_error (0, map->l_name, buf);
  }
               
  offset = offset >> 2;
  value = (*reloc_addr & 0xff000000) | (offset & 0x00ffffff);
  *reloc_addr = value;
}
break;

If I use ORBit as an example test case I get the following output:

[root@hackwrench ORBit-0.5.0]# orbit-event-server 
er_IOP_ServiceContext
value = 0x020010f8
reloc_addr = 0x400f7e34
*reloc_addr = 0xebfffffe
addend = 0xfffffffe
orbit-event-server: error in loading shared libraries: /usr/lib/libIIOP.so.0:
R_ARM_PC24 relocation out of range (0xc1f092bc).

I checked all the math by hand in this case, and the values assigned in the
above code are correct, given the equations used.  The offset (0xc1f092bc) is
far outside the permitted range.  The problem is the contents of `value` as
determined by RESOLVE.

The symbol name is also bogus, and I'm not sure why.  I'm sure the code in
libIIOP.so is attempting to use fflush in libc.  If I look in
orbit-event-server, the PLT entry for fflush is at 0x20010f8, which is what
`value` is set to after RESOLVE is run.  Also if I change:

   bp = ((char*)map->l_info[DT_STRTAB]->d_un.d_ptr) + sym->st_name;
to
   bp = ((char*)map->l_info[DT_STRTAB]->d_un.d_ptr) + refsym->st_name;

then I get `fflush` printed out instead of `er_IOP_ServiceContext`.

The dynamic loader seems to be binding to the PLT entry in the main executable
for fflush, rather than the address of fflush in libc.so.  Phil delved into the
symbol resolution code and noticed that readelf reports the PLT entries as
STB_GLOBAL, so the code in dl-lookup will consider them valid candidates for
binding to.  The main program's symbol table seems to be searched first, which
is why the symbols are picked up in preference to the ones in libc.so.  Is this
correct behaviour?

I don't fully understand what should be going on here.  Can someone help me out?

Thanks,

Scott

-- 
Scott Bambrough - Software Engineer
REBEL.COM    http://www.rebel.com
NetWinder    http://www.netwinder.org
