This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [PATCH] PPC64 tls fixes
- From: Steven Munroe <sjmunroe at vnet dot ibm dot com>
- To: libc-alpha at sources dot redhat dot com
- Cc: Alan Modra <amodra at bigpond dot net dot au>
- Date: Tue, 04 Mar 2003 17:48:24 -0600
- Subject: Re: [PATCH] PPC64 tls fixes
- Reply-to: sjmunroe at us dot ibm dot com
Roland McGrath writes:
> As I said, I wasn't able to test on ppc64. I'm sure there are some
> errors. The changes I made were self-evident cleanups very similar to
> the parallel code in powerpc32. The old value calculations were
wonko > as discussed here previously, and your changes didn't affect
that. My
> changes make them match the working code used by other platforms.
The problem was that none of the R_PPC64_JMP_SLOT relocations where done
for ld64.so itself. As a result the loader thought it had relocated
itself and proceed to dl_main and _dl_init_paths and failed on the first
non-local call (to malloc).
This is one area where ppc64 is very different from ppc32. PPC64 is
always PIC so even the loader needs to relocate its own TOC and PLT
before for it can complete it's own setup. PPC64's
elf_machine_fixup_plt actually needs a sym_map (in addition to the link
map) to relocate PLT entries. This is necessary because of circular
order dependencies between R_PPC64_RELATIVE relocs for the opd entries
and the R_PPC64_JMP_SLOT relocs that copy them to the PLT.
In this case the sym_map pointer was uninitialied for the loaders own
relocation due making the RESOLVE_MAP conditional on:
#if defined USE_TLS && !defined RTLD_BOOTSTRAP
Unfortunately the register used for the sym_map value was still 0 from
kernel initialization. This caused elf_machine_fixup_plt to detect
sym_map as a null pointer and return without attempting the PLT
relocation. If sym_map had been a garbage (non-zero) value ld.so would
have failed sooner and would have been easier to debug.
So the core failure can be fixed by moving the RESOLVE_MAP ahead of the
conditional:
@@ -569,8 +591,8 @@
if (__builtin_expect (r_type == R_PPC64_NONE, 0))
return;
+ sym_map = RESOLVE_MAP (&sym, version, r_type);
#if defined USE_TLS && !defined RTLD_BOOTSTRAP
- struct link_map *sym_map = RESOLVE_MAP (&sym, version, r_type);
value = sym == NULL ? 0 : sym_map->l_addr + sym->st_value;
#else
value = RESOLVE (&sym, version, r_type);
Along the way I had to fix up some compile failures and warnings:
include dl-tls.h, remove the duplicate declare of sym_map, and add
parentheses to BIT_INSERT to insure the correct operation order and
elliminate the related compiler warning.
> You've added a lot of *_GOT_* relocs, which are just never necessary.
> Those relocs never come out of ld -shared.
Yes these are long gone in my version as well. I gave my changes to Alan
Modra to review but Alan did not get a chance to respond until Monday.
> I think all my TLS code is correct, cleaner than what you have, fixes
> bugs you have not fixed, and nicely parallels the ppc32 code that
does > in fact work. The 16-bit relocs are missing, but we can add
those as > I did for ppc32.
I am not convinced that your PPC32 changes are completely correct or
should be applied to PPC64.
1) Your macro DO_TLS_RELOC automatically builds case statements for both
corresponding DTPREL and TPREL relocations. The
R_PPC_DTPREL32/R_PPC64_DTPREL64 relocations are required dynamically.
However (at least for ppc64) the DTPREL16* relocs will never generate
dynamic endties.
2) The full expansion of CHECK_STATIC_TLS and TLS_TPREL_VALUE is large
(128 bytes for ppc64). Expanding this code 5 (ppc32) to 10 (ppc64) times
inline seems excessive. Especially when elf_machine_rela is itself
expanded several times. A (out of line) static function
(elf_machine_tprel) would be appropriate to call from within the case
statements that requires this logic.
3) The whole implementation seems overly complicated and hard to
understand.
I'll hold my patch till tomorrow to give you a chance to respond to the
comments above.
> I don't want to revert to the previous crufty version and start over.
> The changes I made are not that extensive nor hard to comprehend, so I
> don't think it will take long to find my errors.
This comment is unwarranted and unfair since you never bothered to look
at my improved implementation. In this case the bug you introduced was
not at all obvious and took 8 hours of solid work to debug and understand.