This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: dwfl_module_relocate_address() versus base address

From: Roland McGrath <roland at redhat dot com>
To: Mark Wielaard <mjw at redhat dot com>
Cc: systemtap <systemtap at sources dot redhat dot com>
Date: Thu, 18 Dec 2008 15:40:54 -0800 (PST)
Subject: Re: dwfl_module_relocate_address() versus base address
References: <1229103422.3397.98.camel@dijkstra.wildebeest.org> <20081212200639.23B51FC3AB@magilla.sf.frob.com> <1229342761.3457.13.camel@dijkstra.wildebeest.org> <20081216091138.68D1FFC351@magilla.sf.frob.com> <1229434095.3572.87.camel@dijkstra.wildebeest.org> <20081216230526.7B0F4FC3BC@magilla.sf.frob.com> <1229528182.3553.318.camel@dijkstra.wildebeest.org> <20081217230843.78FA1FC3D1@magilla.sf.frob.com> <1229595628.3455.33.camel@dijkstra.wildebeest.org>

> OK. Lets go to the start and see if from that it does or doesn't follow
> that I want to adjust for the p_vaddr of the segment that the address is
> in to get a relative offset.

We can start out by saying that it doesn't, because libdwfl's purpose in
life is to make this easier for you than that.  Of course, that faith might
imply that libdwfl is broken or wrong-headed or needs more calls to help you.
But you happen to have its authors at your beck and call, so your first
answer to such issues can always be, "Hey libdwfl, help me over here!"
That's what it's there for.  (And then once we've resolved that, we can
figure out if you need a workaround/backport solution for an interim while
waiting for libdwfl improvements.)

> [...] Since we don't know where the segment will be mapped into
> the user address space beforehand we need the symbol addresses to be
> relative to the start of segment that gets loaded.

What you need is addresses that you know how you'll adjust at runtime.

Some useful facts:

The ELF format spec says about PT_LOAD phdrs: "Loadable segment entries in
the program header table appear in ascending order, sorted on the p_vaddr
member."  This means the first PT_LOAD segment is the lowest-addressed,
unless the ELF file is invalid (and then all bets are off anyway).  (In
actual fact, Linux ELF files have phdrs sorted by p_paddr.  This is the
desirable thing for directing boot loaders to load the kernel into
physical memory.  It never affects a case you're thinking about, because in
all normal binaries, p_paddr==p_vaddr, so the effect is that Linux ELF
files follow the spec in ordering by p_vaddr except for the special case of
the kernel binary, which is special in plenty of other ways too.)

The lowest-addressed (first) PT_LOAD segment (phdrs[i]) is placed at some
runtime-chosen address (p) regardless of its given p_vaddr.  Each
successive PT_LOAD (phdrs[j]) is then placed exactly at:
	phdrs[j].p_vaddr - phdrs[i].p_vaddr + p
This is the only way that ld.so behaves, even on ia64 where the ABI spec is
looser.  (Any intervening holes in the address space are filled with
PROT_NONE pages.  This comes up often on x86_64, where ABI p_align is 2MB
but actual page size is normally 4KB.  In actual fact, what ld.so does is
reserve the whole range from p to PAGE_ALIGN(phdrs[last_j].p_vaddr +
phdrs[last_j].p_memsz - phdrs[i].p_vaddr + p) in the mmap call that loads
the first segment (made without MAP_FIXED), and then calls mmap with
MAP_FIXED for each successive segment (splitting/overwriting the initial
mapping just made), interleaving mprotect calls for each hole of a whole
page or more, to set those to PROT_NONE.)
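
To make the placement rule above concrete, here is a minimal sketch in C.
The struct and function names are hypothetical (they are not ld.so or
libdwfl code), and the array is assumed to hold only the PT_LOAD entries,
already in ascending p_vaddr order as the spec requires:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down stand-in for the PT_LOAD fields used here. */
struct load_segment {
    uint64_t p_vaddr;   /* link-time virtual address */
    uint64_t p_memsz;   /* size of the segment in memory */
};

/* Runtime address of segment j, given that the first (lowest-addressed)
   segment, segs[0], was placed at the runtime-chosen address p. */
uint64_t segment_runtime_addr(const struct load_segment *segs,
                              size_t j, uint64_t p)
{
    return segs[j].p_vaddr - segs[0].p_vaddr + p;
}

/* Page-aligned end (exclusive) of the range reserved by the first mmap:
   PAGE_ALIGN(last.p_vaddr + last.p_memsz - first.p_vaddr + p). */
uint64_t reserved_end(const struct load_segment *segs, size_t n,
                      uint64_t p, uint64_t page_size)
{
    /* page_size must be a nonzero power of two for the mask below. */
    assert(n > 0 && page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t end = segs[n - 1].p_vaddr + segs[n - 1].p_memsz
                   - segs[0].p_vaddr + p;
    return (end + page_size - 1) & ~(page_size - 1);
}
```

With two segments at p_vaddr 0 and 0x201000 and p = 0x7f0000000000, the
second segment lands at 0x7f0000201000, whatever lies in the hole between.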

(Those asides had lots of detail not directly apropos, but I know you often
find the gritty details helpfully elucidating.)  The upshot is that for
ET_DYN libraries it makes sense to think of a single "load address" for the
library, being the address at which its lowest-addressed PT_LOAD segment
was actually loaded.  The "ELF bias" is the difference between that address
and the p_vaddr of that PT_LOAD.  (In a normal DSO, that p_vaddr is 0, so
the bias is exactly the load address.  In a prelinked DSO, that p_vaddr is
nonzero, and the actual load address may be equal, greater, or lesser.)
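
As a small illustration of that definition, the bias arithmetic might be
sketched like this.  The function names are made up for the example; the
only thing taken from above is bias = load address - lowest p_vaddr:

```c
#include <stdint.h>

/* ELF bias: runtime load address minus the p_vaddr of the lowest PT_LOAD.
   Unsigned arithmetic wraps modulo 2^64, so a load address below p_vaddr
   (possible for a prelinked DSO) still gives a usable bias. */
uint64_t elf_bias(uint64_t load_addr, uint64_t first_p_vaddr)
{
    return load_addr - first_p_vaddr;
}

/* Runtime address of a file (link-time) address under a given bias. */
uint64_t file_to_runtime(uint64_t file_addr, uint64_t bias)
{
    return file_addr + bias;
}
```

For a normal DSO (p_vaddr 0) the bias equals the load address.  For a DSO
prelinked to 0x3000000000 and actually loaded there, the bias is 0.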

In libdwfl, the "module start" address for a DSO module is that load address.

> - dwfl_module_getsymtab() to get the number of symbols in the module.
>   Then for each symbol:
>   - dwfl_module_getsym() to get the symbol values.

This yields an absolute address (st_value) in the Dwfl address space.  In
offline mode (which stap always uses), that means an arbitrary placement.

>   - dwfl_module_relocate_address() to adjust the address to the base.

Modulo the recent bug, this yields an address R such that:

	B + R = st_value

where B is the "relocation base" for which dwfl_module_relocate_address
returns the "relocation base index".

>   - dwfl_module_relocation_info() to get the base information
>     (so we know the section name, etc so we know whether it is a
>      dynamic address.)

For ET_DYN, this yields SHN_ABS, "" to describe the relocation base (with
index 0, the only valid index in this case).  This means that the relocation
base is the module's start address (so sayeth the libdwfl.h comment).

For DSOs that weren't prelinked, and for DSOs that were prelinked after
separating their debuginfo (i.e. prelinking of a package-installed DSO, the
normal case for libc et al), the old code returned this correctly.  The
debuginfo (or unprelinked original) is relative to load address 0, so the
debug.bias is exactly the load address.  The fix I did the other day was in
fact wrong for prelinked DSOs.  The right fix is to subtract low_addr (the
module start address), not main.bias.  (main.bias == low_addr for a DSO
that was not prelinked.)

http://git.fedorahosted.org/git/elfutils.git?p=elfutils.git;a=commitdiff;h=7d9b821db6bc494417a57321b419c6b9481a2128
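
To see why the two subtractions differ only for prelinked DSOs, here is a
toy model in C.  It is not the elfutils code: the struct is hypothetical,
and it assumes main.bias is simply low_addr minus the first PT_LOAD's
p_vaddr, per the "ELF bias" definition earlier in this message:

```c
#include <stdint.h>

/* Hypothetical model of the two module fields named above. */
struct module_model {
    uint64_t low_addr;   /* module start address in the Dwfl address space */
    uint64_t main_bias;  /* assumed: low_addr minus the first p_vaddr */
};

/* Place a module at low_addr; first_p_vaddr is 0 unless prelinked. */
struct module_model place_module(uint64_t low_addr, uint64_t first_p_vaddr)
{
    struct module_model m = { low_addr, low_addr - first_p_vaddr };
    return m;
}

/* The right fix: offset from the module start (load) address. */
uint64_t relocate_by_low_addr(const struct module_model *m, uint64_t st_value)
{
    return st_value - m->low_addr;
}

/* The earlier, wrong fix: this just undoes the bias, giving back the
   file's own address, which includes the prelink base. */
uint64_t relocate_by_main_bias(const struct module_model *m, uint64_t st_value)
{
    return st_value - m->main_bias;
}
```

In this model, with first_p_vaddr == 0 both functions agree.  With a
prelink base of 0x3000000000, a symbol at file address 0x3000001234 comes
out as 0x1234 from the first function and 0x3000001234 from the second.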

>   - Get the p_vaddr of the segment that the symbol address is in
>     so we can make the address relative to the segment load address.
> 
> It is the last step that seems a bit cumbersome. So if we can come up
> with a way to avoid it that would be nice.

You are not supposed to need this, since a fixed
dwfl_module_relocate_address will yield the right relative address in the
first place.  To work around the old one, you only need to do what the
fixed one does.  That is, take the absolute address (the input to
dwfl_module_relocate_address) and subtract the module start address.
This address is an argument to all module callback functions,
and also queryable with dwfl_module_info.

Now you have an address relative to a base intended to be straightforward
for you.  For ET_DYN, this means the load address of the DSO, which is the
start of the lowest-addressed mapping of the file's offset 0.  If you were
later also using libdwfl at runtime, then your runtime Dwfl would be
populated as by dwfl_linux_proc_report, so the module start addresses for
DSO modules would yield what you want.  (This is the whole idea of why the
offline/relocation interfaces are supposed to be intuitive to use, which
evidently hasn't worked out so well, though perhaps that's really only due
to the bugs leading you astray.)  Of course you're not using it, since your
runtime reality is in kernel code.  But the hope was that this view of it
makes natural sense to what you're looking at.  i.e., compare your runtime
layout tracking to eu-unstrip -n -p PID, which shows you the libdwfl view
of things at runtime.  Also compare that to eu-unstrip -n -e foo.so, which
is the same case the stap translator is using internally when resolving a
user module.  (Or -k to compare the kernel case to the offline kernel case,
which is -K, the same case the stap translator is using internally.)
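
Put as arithmetic, the round trip from translation time to run time could
look like the sketch below.  Both function names are invented for the
example, and the addresses are passed as plain integers, so nothing here
calls libdwfl: module_start stands for the start address that
dwfl_module_info reports, and runtime_load_addr for whatever your runtime
tracking records as the DSO's load address:

```c
#include <stdint.h>

/* Translation time: offset of an offline Dwfl address from the module
   start address (what a fixed dwfl_module_relocate_address yields for
   ET_DYN). */
uint64_t offline_to_relative(uint64_t abs_addr, uint64_t module_start)
{
    return abs_addr - module_start;
}

/* Run time: where that offset lands once the DSO's load address is known. */
uint64_t relative_to_runtime(uint64_t relative, uint64_t runtime_load_addr)
{
    return runtime_load_addr + relative;
}
```

So an offline address 0x11234 in a module starting at 0x10000 becomes
0x1234, and with the DSO loaded at 0x7f1200000000 that is 0x7f1200001234.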

This stuff really hasn't been exercised much for the user-mode ET_DYN cases
(i.e. coping with all permutations of prelinking and separate debuginfo).
So there might well be more bugs in there.

I've been working through a cold this week, so I can't entirely vouch for
the clarity of my thinking either yesterday or today (and they don't match).
But I hope now I've given enough information in enough corners that you can
validate for yourself whether what I've said last is really true.

(I haven't gone into detail on the ET_REL cases in this message.  But those
are already dealt with fine AFAIK.  I can elaborate those differences if
you ask.)


Thanks,
Roland
