This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Re: Incompatibility between GNU-ld and SUN's ld.so.1


"Christian Ehrhardt" <ehrhardt@mathematik.uni-ulm.de> writes:

> Hi,
> 
> First, I'd appreciate being CC'ed on replies, but I'll try to follow
> the thread in the archives.
> 
> [ I still think this is a problem in Sun's ld.so.1, and I have an open
>   call with Sun. However, as this problem is triggered by libstdc++
>   and libgcc_s, and the Sun behaviour dates back to Solaris 7 (or even
>   earlier), it would be helpful if GNU ld could work around it.
> ]
> 
> Here's the relevant part of my report sent to Sun (I guess you'd
> prefer to use Makefile instead of Makefile.sun. However, note that
> using -nostdlib will cause a different crash due to a missing exit()):
> 
> ----------------- cut here --------------------------------------------
> 
> SUMMARY DESCRIPTION: ld.so.1 fails to relocate certain shared libraries
> 
> DETAILED DESCRIPTION:
> 
> The dynamic runtime linker fails to relocate valid shared libraries
> generated by recent versions of GNU-ld. /usr/local/bin/ld is from
> the GNU binutils-2.13 package:
> 
>        turing$ /usr/local/bin/ld -v
>        GNU ld version 2.13
> 
> How to reproduce:
> 
> Script started on Fri Sep 20 19:46:43 2002
> turing$ cat t2.c
> struct object {
>         int i;
>         int j;
>         int k;
>         int l;
> };
> 
> 
> 
> int func ()
> {
>         static struct object x;
>         struct object * p;
>         p = &x;
>         p->i = 3;
>         return 0;
> }
> 
> turing$ cat t3.c
> extern int func();
> 
> int main ()
> {
>         func();
>         return 0;
> }
> turing$ cat Makefile.sun
> .PHONY: clean
> all:    a.out
> t2.o:   t2.c
>         CC  -c -KPIC t2.c
> libt2.so:       t2.o
>         /usr/local/bin/ld -G t2.o -olibt2.so
> t3.o:   t3.c
>         CC  -c t3.c
> a.out: libt2.so t3.o
>         CC  -lt2 t3.o -L. -R.
> clean:
>         rm -f *.so *.o a.out
> 
> turing$ cat Makefile
> .PHONY: clean
> all:    a.out
> t2.o:   t2.c
>         gcc -c -fPIC t2.c
> libt2.so: t2.o
>         /usr/local/bin/ld -nostdlib -shared -olibt2.so t2.o
> a.out: libt2.so t3.c
>         gcc -nostdlib t3.c libt2.so -L. -R. 
> clean:
>         rm -f *.so *.o a.out core
> 
> turing$ make -f Makefile.sun clean
> rm -f *.so *.o a.out
> turing$ make -f Makefile.sun 
> CC  -c -KPIC t2.c
> /usr/local/bin/ld -G t2.o -olibt2.so
> CC  -c t3.c
> CC  -lt2 t3.o -L. -R.
> turing$ a.out
> Segmentation Fault (core dumped)
> turing$ exit
> 
> script done on Fri Sep 20 19:47:32 2002
> 
> Note that I compiled everything with /opt/SUNWspro/bin/CC to
> rule out bugs in gcc. This problem can be reproduced using
> the second Makefile and gcc with an even smaller resulting
> executable.
> 
> 
> Analyzing the core shows the following:
> turing$ pmap core | grep libt2.so
> FF370000      8K read/exec         libt2.so
> FF380000      8K read/write/exec   libt2.so
> 
> Script started on Fri Sep 20 19:53:10 2002
> turing$ gdb a.out core
> GNU gdb 5.0
> [ ... ]
> #0  0xff370318 in __1cEfunc6F_i_ ()
>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
> (gdb) disass
> Dump of assembler code for function __1cEfunc6F_i_:
> 0xff3702e0 <__1cEfunc6F_i_>:    save  %sp, -112, %sp
> 0xff3702e4 <__1cEfunc6F_i_+4>:  call  0xff3702ec <__1cEfunc6F_i_+12>
> 0xff3702e8 <__1cEfunc6F_i_+8>:  sethi  %hi(0), %o1
> 0xff3702ec <__1cEfunc6F_i_+12>: mov  %o1, %o1   ! 0x0
> 0xff3702f0 <__1cEfunc6F_i_+16>: add  %o7, %o1, %o1
> 0xff3702f4 <__1cEfunc6F_i_+20>: st  %o1, [ %fp + -12 ]
> 0xff3702f8 <__1cEfunc6F_i_+24>: sethi  %hi(0x10000), %o0
> 0xff3702fc <__1cEfunc6F_i_+28>: or  %o0, 0xc4, %o0      ! 0x100c4
> 0xff370300 <__1cEfunc6F_i_+32>: add  %o1, %o0, %l7
> 0xff370304 <__1cEfunc6F_i_+36>: sethi  %hi(0), %g1
> 0xff370308 <__1cEfunc6F_i_+40>: or  %g1, 4, %g1 ! 0x4
> 0xff37030c <__1cEfunc6F_i_+44>: ld  [ %l7 + %g1 ], %o0
> 0xff370310 <__1cEfunc6F_i_+48>: st  %o0, [ %fp + -8 ]
> 0xff370314 <__1cEfunc6F_i_+52>: mov  3, %o1
> 0xff370318 <__1cEfunc6F_i_+56>: st  %o1, [ %o0 ]
> 0xff37031c <__1cEfunc6F_i_+60>: clr  [ %fp + -4 ]
> 0xff370320 <__1cEfunc6F_i_+64>: mov  %g0, %i0
> 0xff370324 <__1cEfunc6F_i_+68>: ret 
> 0xff370328 <__1cEfunc6F_i_+72>: restore 
> 0xff37032c <__1cEfunc6F_i_+76>: mov  %g0, %i0
> 0xff370330 <__1cEfunc6F_i_+80>: ret 
> 0xff370334 <__1cEfunc6F_i_+84>: restore 
> ---Type <return> to continue, or q <return> to quit---
> End of assembler dump.
> (gdb) bt
> #0  0xff370318 in __1cEfunc6F_i_ ()
>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
> #1  0x10884 in main ()
> (gdb) info reg o0
> o0             0xff370000       -13172736
> (gdb) info reg o1
> o1             0x3      3
> (gdb) info reg l7
> l7             0xff3803a8       -13106264
> (gdb) info reg g1
> g1             0x4      4
> (gdb) turing$ exit
> 
> script done on Fri Sep 20 19:54:46 2002
> 
> Looking back at function func from t2.c shows:
> int func ()
> {
> 	static struct object x;
> 	struct object * p;
> 	p = &x;
> 	p->i = 3;      <====== crash is here.
> 	return 0;
> }
> 
> The value of the pointer p is obviously in register o0, i.e. it is
> 0xff370000. This is precisely the BASE address where the shared library
> libt2.so has been mapped to. Register l7 contains the base address of
> the .got section (the global offset table of this library). The
> questionable address is loaded from offset 4 in the global offset table.
> 
> Looking at the contents of the global offset table in the shared
> library shows the following:
> 
> turing$ elfdump -G libt2.so 
> 
> Global Offset Table: 2 entries
>  ndx     addr      value    reloc              addend   symbol
> [00000]  000103a8  00010338 R_SPARC_NONE       00000000 
> [00001]  000103ac  000103b0 R_SPARC_RELATIVE   00000000 
> turing$ 
> 
> Note that we have indeed
> %l7(0xff3803a8) = Offset of .got(0x000103a8) + library base address(0xFF370000)
> 
> The Solaris Linker and Libraries Guide (freshly downloaded from
> docs.sun.com) has this explanation for R_SPARC_RELATIVE:
> 
> |Some relocation types have semantics beyond simple calculation:
> |[ ... ]
> |R_SPARC_RELATIVE
> |  Created by the link-editor for dynamic objects. Its offset member
> |  gives the location within a shared object that contains a value
> |  representing a relative address. The runtime linker computes the
> |  corresponding virtual address by adding the virtual address at which
> |  the shared object is loaded to the relative address. Relocation
> |  entries for this type must specify 0 for the symbol table index.
> 
> This means that the value at offset 0x4 in the global offset
> table should be
>       library base address  + Value in .got
>       0xFF370000            + 0x000103B0     = 0xFF3803B0
> after relocation. However, looking at the value of register o0, we
> see that the .got entry evidently contains 0xFF370000 (the bare
> library base address) instead.
> 
> ----------------- cut here --------------------------------------------
> 
> The basic problem is the interpretation of the meaning of
> R_SPARC_RELATIVE. Recall the explanation from above:
> 
> [ The same document also states that the calculation performed by
>   R_SPARC_RELATIVE is B+A (see Terminology below). IMHO this is
>   overruled by the first sentence quoted below.
> ]
> 
> |Some relocation types have semantics beyond simple calculation:
> |[ ... ]
> |R_SPARC_RELATIVE
> |  Created by the link-editor for dynamic objects. Its offset member
> |  gives the location within a shared object that contains a value
> |  representing a relative address. The runtime linker computes the
> |  corresponding virtual address by adding the virtual address at which
> |  the shared object is loaded to the relative address. Relocation
> |  entries for this type must specify 0 for the symbol table index.
> 
> 
> This explanation is obviously derived from the SHT_REL case where
> the ``relative address'' explained above and the implicit addend
> are the same.
> 
> Terminology:
> * B is the base address where the library is loaded
> * A is the EXPLICIT addend
> * V is the value stored in the shared library where an implicit addend
>   would reside (IMHO this is what ``relative address'' above describes).
> 
> Sun's runtime linker used to always calculate V + B + A for
> R_SPARC_RELATIVE relocations. However, starting with Solaris 7 and the
> advent of DT_RELACOUNT, it calculates only B + A (ignoring V completely)
> iff DT_RELACOUNT is actually supplied and explicit addends are used.
> 
> ld could work around this by always storing the relative address in
> the addend and setting V to 0 if explicit addends are used. This is
> what Sun's link-editor has done for quite some time.
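
For anyone who wants to check the arithmetic in the report, here is a
minimal C sketch. The constants are copied from the pmap, elfdump and
gdb output quoted above. The function and macro names are made up for
illustration, and the code models only the two formulas under
discussion (V+B+A and B+A); it is not taken from any ld.so.1 source.

```c
#include <inttypes.h>
#include <stdio.h>

/* Values copied from the quoted report (names are illustrative only). */
#define LIB_BASE    0xFF370000u /* B: load address of libt2.so (pmap)      */
#define GOT_OFFSET  0x000103A8u /* link-time address of .got (elfdump -G)  */
#define GOT1_VALUE  0x000103B0u /* V: value stored in GOT[1] in the file   */
#define GOT1_ADDEND 0x00000000u /* A: explicit addend of the relocation    */

/* Run-time address of the GOT, i.e. what the report finds in %l7. */
uint32_t got_runtime_address(void)
{
    return LIB_BASE + GOT_OFFSET;
}

/* Result if the in-place value is honoured: V + B + A. */
uint32_t reloc_v_b_a(uint32_t v, uint32_t b, uint32_t a)
{
    return v + b + a;
}

/* Result if the in-place value is ignored: B + A. */
uint32_t reloc_b_a(uint32_t v, uint32_t b, uint32_t a)
{
    (void)v; /* V is deliberately not used in this model */
    return b + a;
}

/* Print the three numbers the report compares. */
void print_report_check(void)
{
    printf("%%l7        = 0x%08" PRIx32 "\n", got_runtime_address());
    printf("V + B + A  = 0x%08" PRIx32 "\n",
           reloc_v_b_a(GOT1_VALUE, LIB_BASE, GOT1_ADDEND));
    printf("B + A      = 0x%08" PRIx32 "\n",
           reloc_b_a(GOT1_VALUE, LIB_BASE, GOT1_ADDEND));
}
```

Calling print_report_check() from a main() shows that V + B + A gives
the address the report expects for &x, while B + A gives the bare load
address that gdb shows in %o0 at the time of the crash.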

Beware!  IIRC, ld.so on Solaris (perhaps in 2.5.1?) used to always
compute V+B, ignoring A, for R_SPARC_RELATIVE, exactly as the doc
above describes.  To be backwards compatible, it might instead be
necessary to suppress output of DT_RELACOUNT when building for an
older Solaris system.
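
To make the compatibility concern concrete, here is a rough C sketch
that applies one R_SPARC_RELATIVE entry under each of the three
behaviours mentioned in this thread (V+B, V+B+A, and B+A). The three
"loader models" are assumptions based only on the descriptions in these
messages (the V+B one is from memory), not on any ld.so source, and all
type and function names are hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* One R_SPARC_RELATIVE entry as a runtime linker would see it. */
struct rela_entry {
    uint32_t in_place; /* V: value stored at the relocated location */
    uint32_t addend;   /* A: explicit addend from the Rela record   */
};

/* Hypothetical models of the behaviours described in this thread. */
enum loader_model {
    LOADER_V_PLUS_B,        /* possibly old Solaris: V + B, A ignored  */
    LOADER_V_PLUS_B_PLUS_A, /* later Solaris without DT_RELACOUNT      */
    LOADER_B_PLUS_A         /* Solaris 7+ when DT_RELACOUNT is present */
};

/* Compute the relocated value for a given model and load address B. */
uint32_t apply_relative(enum loader_model model, struct rela_entry r,
                        uint32_t base)
{
    switch (model) {
    case LOADER_V_PLUS_B:
        return r.in_place + base;
    case LOADER_V_PLUS_B_PLUS_A:
        return r.in_place + base + r.addend;
    case LOADER_B_PLUS_A:
        return base + r.addend;
    }
    return 0; /* not reached for valid models */
}

/* The proposed workaround: move V into the addend and zero V. */
struct rela_entry fold_into_addend(struct rela_entry r)
{
    struct rela_entry out = { 0u, r.in_place + r.addend };
    return out;
}

/* Print the result of every model for one encoding. */
void print_matrix(const char *label, struct rela_entry r, uint32_t base)
{
    printf("%s: V+B=0x%08" PRIx32 " V+B+A=0x%08" PRIx32
           " B+A=0x%08" PRIx32 "\n", label,
           apply_relative(LOADER_V_PLUS_B, r, base),
           apply_relative(LOADER_V_PLUS_B_PLUS_A, r, base),
           apply_relative(LOADER_B_PLUS_A, r, base));
}
```

With the values from the report (V = 0x103B0, A = 0, B = 0xFF370000),
only the B+A model yields the bare base address.  After
fold_into_addend(), only the V+B model does.  Under these assumed
models, neither encoding is correct for all three behaviours, which is
why leaving the encoding alone and omitting DT_RELACOUNT may be the
safer choice for older targets.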

-- 
- Geoffrey Keating <geoffk@geoffk.org>

