This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] BZ #14831: Segfault in _dl_profile_fixup with IRELATIVEand LD_AUDIT
On Wed, Nov 14, 2012 at 12:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Nov 13, 2012 at 6:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Nov 13, 2012 at 4:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Nov 13, 2012 at 4:26 PM, Roland McGrath <roland@hack.frob.com> wrote:
>>>>> + if (l->l_reloc_result == NULL)
>>>>> + {
>>>>> + /* Skip audit if l_reloc_result is NULL which happens with
>>>>> + IRELATIVE relocations in other DSOs, like libm.so. */
>>>>> + *framesizep = -1;
>>>>
>>>> This needs a more extensive comment about how it arises that we get here
>>>> with l_reloc_result NULL and why it is OK to short-circuit this way.
>>>
>>> R_X86_IRELATIVE lazy relocation in libm.so leads to:
>>>
>>> 00042010 00000507 R_386_JUMP_SLOT 00000000 __get_cpu_features
>>>
>>> But __get_cpu_features isn't set up yet and requires a lazy relocation.
>>> That is why l_reloc_result is NULL. We don't want audit in R_X86_IRELATIVE
>>> relocations. What is what my patch does.
>>>
>>>>> + return _dl_fixup (l, reloc_arg);
>>>>
>>>> How can this be right when ELF_MACHINE_RUNTIME_FIXUP_ARGS is nonempty?
>>>>
>>>
>>> My post has
>>>
>>> ---
>>> Targets which define ELF_MACHINE_RUNTIME_FIXUP_ARGS must find a way
>>> to pass ELF_MACHINE_RUNTIME_FIXUP_ARGS from _dl_fixup/_dl_profile_fixup
>>> to _dl_runtime_fixup.
>>> ---
>>>
>>> Only
>>>
>>> ports/sysdeps/m68k/dl-machine.h:#define ELF_MACHINE_RUNTIME_FIXUP_ARGS
>>> long int save_a0, long int save_a1
>>> sysdeps/sh/dl-machine.h:#define ELF_MACHINE_RUNTIME_FIXUP_ARGS int plt_type
>>>
>>> I can work with sh and m68k maintainers on a solution. It shouldn't be
>>> too hard.
>>
>> Here is the updated patch.
>>
>>
>
> Here is what happened:
>
>
> (gdb) bt
> #0 _dl_profile_fixup (l=0x7ffff7ff6908, reloc_arg=4, retaddr=140737342864537,
> regs=0x7fffffffd8b0, framesizep=0x7fffffffdc08) at ../elf/dl-runtime.c:191
> #1 0x00007ffff7df01e6 in _dl_runtime_profile ()
> at ../sysdeps/x86_64/dl-trampoline.h:48
> #2 0x00007ffff753fc99 in ?? ()
> #3 0x00007fffffffdd50 in ?? ()
> #4 0x00007ffff7de6f41 in elf_machine_lazy_rel (skip_ifunc=<optimized out>,
> reloc=0x7ffff7535110, l_addr=140737342799872, map=0x7ffff7ff6908)
> at ../sysdeps/x86_64/dl-machine.h:511
> #5 elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>,
> nrelative=<optimized out>, relsize=<optimized out>,
> reladdr=<optimized out>, map=0x7ffff7ff6908) at do-rel.h:77
> #6 _dl_relocate_object (scope=0x7ffff7ff6c60, reloc_mode=<optimized out>,
> consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:265
> #7 0x00007ffff7ddfaa2 in dl_main (phdr=<optimized out>, phnum=4160718464,
> user_entry=<optimized out>, auxv=0x7ffff7ffe101 <cache_new+1>)
> at rtld.c:2203
> #8 0x00007ffff7df0f7e in _dl_sysdep_start (
> start_argptr=start_argptr@entry=0x7fffffffdf50,
> dl_main=dl_main@entry=0x7ffff7ddda40 <dl_main>) at ../elf/dl-sysdep.c:241
> #9 0x00007ffff7de0ce6 in _dl_start_final (arg=0x7fffffffdf50) at rtld.c:331
> #10 _dl_start (arg=0x7fffffffdf50) at rtld.c:557
> #11 0x00007ffff7ddd508 in _start ()
> ---Type <return> to continue, or q <return> to quit---
> from /export/build/gnu/glibc-test/build-x86_64-linux/elf/ld.so
> #12 0x0000000000000001 in ?? ()
> #13 0x00007fffffffe262 in ?? ()
> #14 0x0000000000000000 in ?? ()
> (gdb) f 6
> #6 _dl_relocate_object (scope=0x7ffff7ff6c60, reloc_mode=<optimized out>,
> consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:265
> 265 ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
> (gdb) f 4
> #4 0x00007ffff7de6f41 in elf_machine_lazy_rel (skip_ifunc=<optimized out>,
> reloc=0x7ffff7535110, l_addr=140737342799872, map=0x7ffff7ff6908)
> at ../sysdeps/x86_64/dl-machine.h:511
> 511 value = ((ElfW(Addr) (*) (void)) value) ();
> (gdb)
>
> We have
>
> ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
>
> #ifndef PROF
> if (__builtin_expect (consider_profiling, 0))
> {
> /* Allocate the array which will contain the already found
> relocations. If the shared object lacks a PLT (for example
> if it only contains lead function) the l_info[DT_PLTRELSZ]
> will be NULL. */
> if (l->l_info[DT_PLTRELSZ] == NULL)
> {
> errstring = N_("%s: no PLTREL found in object %s\n");
> fatal:
> _dl_fatal_printf (errstring,
> rtld_progname ?: "<program name unknown>",
> l->l_name);
> }
>
> l->l_reloc_result = calloc (sizeof (l->l_reloc_result[0]),
> l->l_info[DT_PLTRELSZ]->d_un.d_val);
> if (l->l_reloc_result == NULL)
> {
> errstring = N_("\
> %s: out of memory to store relocation results for %s\n");
> goto fatal;
> }
> }
> #endif
> }
>
> ld.so crashes because IRELATIVE relocation invokes another relocation
> before l_reloc_result is allocated. My patch simply avoids it. Here is
> the patch with updated comments. OK to install?
So the problem is that the objects are loaded in reverse order (for copy reloc
purposes) and therefore we load libm.so *before* libc.so, and thus
l_reloc_result hasn't been allocated for libc.so yet, and thus trying to
resolve a function in libc.so will fail if we have profiling enabled because
that entry in the link_map is not yet allocated (because we haven't called
_dl_relocate_object for libc.so yet).
We are going to eventually allocate l_reloc_result anyway. Could we instead
refactor the l_reloc_result allocation out of _dl_relocate_object and into
the loop in rtld.c which loads all the objects?
Cheers,
Carlos.