This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



RFC: ELF prelinker - 0.1.1


Hi!

I've uploaded a new version of ELF prelinker to:

ftp://people.redhat.com/jakub/prelink/prelink-20010704.tar.bz2

It already has an --all mode, so if your libraries/binaries have reserved
.dynamic entries (that is a precondition for prelinking) and if you have
glibc patched with the included patch installed, you can e.g. just
prelink -avmR (prelink all, be verbose, conserve virtual memory usage
	       by only assigning non-overlapping virtual address space slots
	       to libraries which appear together and start at random base,
	       so that exploiting buffer overruns is slightly more difficult)
and it will prelink all binaries and needed libraries from directory trees
specified in /etc/prelink.conf.

I'll try to describe in detail what it actually does, so that you can
comment on it even without reading all of the source or testing it.

The prelinker first relocates shared libraries to desired base virtual
addresses (this changes the library to what you'd get by linking it with
...
  . = 0x41abcdef + SIZEOF_HEADERS;
  .hash          : { *(.hash)           }
...
instead of usual 0x0 + SIZEOF_HEADERS).
Relocating DWARF/DWARF-2 debug info is still unfinished in this area (stabs
are already done). The debugging formats are tricky because they don't have
corresponding .rel* sections in shared libraries, so one has to understand
the format and see which values need to be adjusted and which do not.

Then (both for libraries and binaries), on REL architectures where it is
necessary (e.g. on IA-32 if there are any R_386_32 or R_386_PC32
relocations) all .rel.* sections but .rel.plt are converted to RELA (by
extracting the value from where the reloc points to into r_addend).
This step is IMO necessary, because REL relocations have the addend stored
where the reloc points to, but prelinking will store the final value into
that location, so r_addend would be lost (it is needed for the case where
one of the dependent libraries changes, or the binary which loads it changes,
or things are dlopened). RELA format seems like the natural thing to do when
the memory locations cannot be used for addends, especially when the dynamic
linker needs to support RELA anyway for replaying of conflicts (see below).
Do you agree here? I can provide examples if needed.
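
To illustrate the REL to RELA conversion described above, here is a minimal
sketch. It is not code from the prelink package: rel_to_rela and its image,
image_size and image_base parameters are hypothetical names, and it assumes a
32-bit relocated field and matching host and target byte order.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Convert one REL entry to RELA by pulling the implicit addend out of the
   section contents.  `image` holds the bytes of the mapped segment and
   `image_base` is the virtual address of image[0]; both are hypothetical
   parameters of this sketch.  */
static Elf32_Rela
rel_to_rela (const Elf32_Rel *rel, const unsigned char *image,
             size_t image_size, Elf32_Addr image_base)
{
  Elf32_Rela rela;
  Elf32_Sword addend;
  size_t off = rel->r_offset - image_base;

  /* The relocated word must lie inside the image.  */
  assert (off + sizeof addend <= image_size);
  memcpy (&addend, image + off, sizeof addend);

  rela.r_offset = rel->r_offset;
  rela.r_info = rel->r_info;   /* symbol and type are unchanged */
  rela.r_addend = addend;      /* the addend now survives prelinking */
  return rela;
}
```

Once the addend lives in r_addend, the memory location itself is free to hold
the prelinked final value.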

Next step is merging all .rel{,a}.* but .rel{,a}.plt sections into a single
reloc section, .gnu.reloc (similarly to .SUNW_reloc in Solaris linker).
The reason for this is that the relocations can be sorted:
- relocations against the same symbol can be grouped together; otherwise the
  non-RELATIVE relocations are sorted by ascending r_offset.
- all RELATIVE relocations come last and a DT_REL{,A}COUNT dynamic entry is
  added with the count of RELATIVE relocs
The advantage of putting relocs against the same symbol together is that the
dynamic linker can more easily cache symbol lookups (it can just cache the
last lookup), which can result in speedups even if prelinking cannot be used
(e.g. in dlopen) - glibc patch included in the package does this.
The advantage of grouping RELATIVE relocs together is that if the library is
successfully mmaped to the VMA base it wants (ie. l_addr == 0), then all the
RELATIVE relocs can be skipped (the glibc patch does not do this yet).
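
The ordering described above can be sketched with qsort. This is only an
illustration under assumptions: reloc_cmp and sort_relocs are hypothetical
names, the entries are IA-32 RELA records, and the exact tie-breaking (symbol
index first, then r_offset) is a guess at one reasonable scheme, not
necessarily what the prelinker does.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: RELATIVE relocs go last; the others are grouped by
   symbol index and ordered by ascending r_offset within a group.  */
static int
reloc_cmp (const void *pa, const void *pb)
{
  const Elf32_Rela *a = pa, *b = pb;
  int a_rel = ELF32_R_TYPE (a->r_info) == R_386_RELATIVE;
  int b_rel = ELF32_R_TYPE (b->r_info) == R_386_RELATIVE;

  if (a_rel != b_rel)
    return a_rel - b_rel;
  if (!a_rel && ELF32_R_SYM (a->r_info) != ELF32_R_SYM (b->r_info))
    return ELF32_R_SYM (a->r_info) < ELF32_R_SYM (b->r_info) ? -1 : 1;
  if (a->r_offset != b->r_offset)
    return a->r_offset < b->r_offset ? -1 : 1;
  return 0;
}

/* Sort the merged table and return the number of trailing RELATIVE
   entries, i.e. the value a DT_RELACOUNT entry would carry.  */
static size_t
sort_relocs (Elf32_Rela *relocs, size_t n)
{
  size_t count = 0;

  assert (relocs != NULL || n == 0);
  qsort (relocs, n, sizeof *relocs, reloc_cmp);
  while (count < n
         && ELF32_R_TYPE (relocs[n - 1 - count].r_info) == R_386_RELATIVE)
    count++;
  return count;
}
```

With this layout a dynamic linker can walk the first n - count entries with a
one-entry symbol cache and treat the tail as a block of RELATIVE relocs.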

Next step is running a special dynamic linker mode which tells the prelinker
details about symbol lookups and possible conflicts (symbol lookups which
resolve differently in the binary's global scope than in the scope of just
the library which is being relocated at that point).

The prelinker then uses this information to replay the relocations and
stores the resulting values into the library (or binary).

For binaries, it creates a .gnu.conflict section (sh_type SHT_RELA) out of
the found conflicts. This section contains an array of ElfW(Rela) structures,
where r_offset is the absolute location where one of the libraries has to be
patched (say if libc.so is relocated to 0x41000000 and the conflict is against
libc.so, then r_offset will be 0x410XXXXX), r_info is ELFW(R_INFO)(0, reloc_type)
and r_addend is the value which should be stored there (it is ElfW(Rela), so
the previous memory content does not matter). ELFW(R_SYM) is 0 because all
.gnu.conflict relocations are absolute; they don't require any symbol
lookups at all. This section has sh_flags SHF_ALLOC and is put into some
PF_R|PF_X PT_LOAD segment.
Its address is stored in DT_GNU_CONFLICT dynamic entry, its size in
DT_GNU_CONFLICTSZ.
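
Replaying such a conflict table needs no symbol lookups, which the following
sketch tries to show. It is an illustration only: replay_conflicts is a
hypothetical name, the mem buffer stands in for the process address space
(a real dynamic linker would store through r_offset directly), and every
entry is treated as a plain 32-bit store regardless of relocation type.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Apply each conflict: write r_addend at the absolute address r_offset.
   `mem` models memory starting at virtual address `mem_base`; both are
   hypothetical parameters of this sketch.  */
static void
replay_conflicts (const Elf32_Rela *conflicts, size_t n,
                  unsigned char *mem, size_t mem_size, Elf32_Addr mem_base)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t off = conflicts[i].r_offset - mem_base;
      Elf32_Sword value = conflicts[i].r_addend;

      /* Conflicts are absolute, so the symbol index is always 0.  */
      assert (ELF32_R_SYM (conflicts[i].r_info) == 0);
      assert (off + sizeof value <= mem_size);
      /* The old contents do not matter; they are simply overwritten.  */
      memcpy (mem + off, &value, sizeof value);
    }
}
```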

If the binary contains any R_*_COPY relocations, the prelinker tries to
split the .bss section into a SHF_ALLOC|SHF_WRITE SHT_PROGBITS .dynbss portion
and the remaining rest (.bss, the usual SHF_ALLOC|SHF_WRITE SHT_NOBITS) and does
the copying from libraries at prelink time (handling even tricky things like
struct x { struct x *a; } a = { &a }; in a shared library where the main binary
accesses a). This way, no copying needs to be done at runtime.

A list of all dependent libraries (in dynamic linker's search order) is then
stored into a special .gnu.liblist section (sh_type SHT_GNU_LIBLIST).
This section contains an array of ElfW(Lib) structures where each structure
describes one dependent library. This section is SHF_ALLOC for binaries
(because the dynamic linker needs to check it at runtime) and non-alloced for
libraries (it only serves the prelinker as a check). sh_link contains the index
of the SHT_STRTAB section which contains the library SONAMEs (.dynstr for
binaries, non-SHF_ALLOC .gnu.libstr for libraries) referenced by the l_name
field, the l_time_stamp field contains a copy of the DT_GNU_PRELINKED dynamic
tag of the dependent library, and the l_checksum field contains a copy of its
DT_CHECKSUM tag.
The remaining two fields are 0 for now.
For binaries, the address of .gnu.liblist section is stored in
DT_GNU_LIBLIST dynamic tag, its size in DT_GNU_LIBLISTSZ dynamic tag.
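
As a sketch of what one such entry looks like, the following fills an
Elf32_Lib as declared in glibc's <elf.h> (where the two remaining fields are
called l_version and l_flags). make_lib_entry and lib_entry_name are
hypothetical helpers, and the timestamp and checksum values in any caller are
made up for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Build one .gnu.liblist entry.  name_off is the offset of the SONAME in
   the string table that sh_link points to.  */
static Elf32_Lib
make_lib_entry (Elf32_Word name_off, Elf32_Word prelinked,
                Elf32_Word checksum)
{
  Elf32_Lib lib;

  memset (&lib, 0, sizeof lib);   /* l_version and l_flags stay 0 */
  lib.l_name = name_off;
  lib.l_time_stamp = prelinked;   /* copy of DT_GNU_PRELINKED */
  lib.l_checksum = checksum;      /* copy of DT_CHECKSUM */
  return lib;
}

/* Resolve the SONAME of an entry against its string table.  */
static const char *
lib_entry_name (const Elf32_Lib *lib, const char *strtab)
{
  assert (lib != NULL && strtab != NULL);
  return strtab + lib->l_name;
}
```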

For libraries, crc32 of all SHF_ALLOC sections is computed and stored into
DT_CHECKSUM dynamic tag and time(NULL) is stored into DT_GNU_PRELINKED
dynamic tag.
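
A checksum over the SHF_ALLOC sections could be computed along these lines.
This is a sketch under assumptions: the mail does not say which CRC-32 variant
is used, so the common reflected polynomial 0xEDB88320 is assumed here, and
struct section_view and checksum_alloc_sections are hypothetical names.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320).  Start with crc == 0
   and feed the previous result back in to continue over more data.  */
static uint32_t
crc32_update (uint32_t crc, const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  crc = ~crc;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= buf[i];
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
    }
  return ~crc;
}

/* Hypothetical view of a section: its flags and its file contents.  */
struct section_view
{
  Elf32_Word sh_flags;
  const unsigned char *data;
  size_t size;
};

/* Checksum all SHF_ALLOC sections in order, skipping the others.  */
static uint32_t
checksum_alloc_sections (const struct section_view *secs, size_t n)
{
  uint32_t crc = 0;

  for (size_t i = 0; i < n; i++)
    if (secs[i].sh_flags & SHF_ALLOC)
      crc = crc32_update (crc, secs[i].data, secs[i].size);
  return crc;
}
```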

Glibc dynamic linker, if it finds DT_GNU_PRELINKED dynamic tag in itself,
can skip first relocation of itself, since it is already done.
Then, it mmaps all libraries as it always did and computes global search
scope. Now, if it finds DT_GNU_LIBLIST dynamic tag in the binary, it
compares libraries mentioned in this section (and their
DT_CHECKSUM/DT_GNU_PRELINKED dynamic tags) with the recorded values in
.gnu.liblist. If they match and if all libraries have l_addr == 0, then
.gnu.conflict section is replayed (with dummy RESOLVE macro because it needs
no symbol lookups), otherwise prelinking cannot be used and the dynamic
linker relocates each library and main binary as it always did.
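
The decision the dynamic linker makes here can be sketched as follows. This is
an illustration, not the glibc patch: struct loaded_lib and prelink_usable are
hypothetical, the recorded and loaded arrays are assumed to be in the same
search order, and the comparison of library names is left out for brevity.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one library as actually loaded.  */
struct loaded_lib
{
  Elf32_Addr l_addr;      /* load bias; 0 means mapped at its prelinked base */
  Elf32_Word prelinked;   /* its DT_GNU_PRELINKED value */
  Elf32_Word checksum;    /* its DT_CHECKSUM value */
};

/* Return true if the .gnu.liblist records still describe the loaded
   libraries, so that only .gnu.conflict needs to be replayed.  */
static bool
prelink_usable (const Elf32_Lib *recorded, const struct loaded_lib *loaded,
                size_t n)
{
  assert ((recorded != NULL && loaded != NULL) || n == 0);
  for (size_t i = 0; i < n; i++)
    if (loaded[i].l_addr != 0
        || loaded[i].prelinked != recorded[i].l_time_stamp
        || loaded[i].checksum != recorded[i].l_checksum)
      return false;   /* fall back to normal relocation processing */
  return true;
}
```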

The problematic part (apart from needing up to 5 free .dynamic section
slots, which is IMHO not a big price to pay) is where to create
.gnu.conflict, .gnu.liblist sections and where to possibly expand .dynstr
and .gnu.reloc (the last one only in the rare case binary's relocations on
IA-32 contain R_386_32 or R_386_PC32 relocs). All these are read-only
sections, so should if at all possible go into read-only PT_LOAD segment.
Unfortunately, binaries are not relocatable.
The methods prelinker uses currently are:
- if it finds unused gaps inside of read-only PT_LOAD segment big enough
  to store/grow sections there, it does that
- if (mainly on IA-32) the binary's base address is not a power of two
  (on IA-32 it is 0x8048000), it attempts to grow the first read-only
  PT_LOAD segment downwards (e.g. decrease base to 0x8047000 or even less),
  moving the sections which really must be in the first page (.interp,
  .note.ABI-tag) down (and if needed others which can be moved that way
  too). The drawback of this is that even if just a few bytes are needed,
  whole pages are taken.
- if both of the above methods fail, it tries to create a new read-only
  PT_LOAD segment (which is placed in the binary after the PT_LOAD
  segment with .bss). Currently this only works as long as there is a gap
  between the end of ElfW(Phdr) and the first section, but eventually the
  prelinker could just move the smallest section (but .interp, .note.ABI-tag)
  from the first PT_LOAD segment to this one. The drawback of this is clear:
  it requires one more mmap on program startup and consumes even more memory
  (well, usually it comes back by less dirty pages (and thus more possible
  sharing) because most of the relocations are not needed with prelinking,
  but anyway).
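
The page-granularity cost of the second method can be made concrete with a
small sketch. grow_base_down and SKETCH_PAGE_SIZE are hypothetical names, and
a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u   /* assumed page size */

/* Return the new base address after growing the first PT_LOAD segment
   downwards by enough whole pages to hold bytes_needed.  */
static uint32_t
grow_base_down (uint32_t base, uint32_t bytes_needed)
{
  uint32_t pages = (bytes_needed + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

  assert (base % SKETCH_PAGE_SIZE == 0);
  assert (pages * SKETCH_PAGE_SIZE <= base);
  return base - pages * SKETCH_PAGE_SIZE;
}
```

For example, needing only 0x190 bytes below 0x8048000 already moves the base
to 0x8047000, i.e. a full page is taken.
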
The thing I'm wondering is whether it would not be a good idea to reserve
some minimal gaps, so that for typical small applications the first method
could trigger. From looking at how big the .gnu.liblist and .gnu.conflict
sections of most of /bin/* and /usr/bin/* are, .gnu.liblist is usually <= 0x50
bytes, .gnu.conflict is usually <= 0x120 bytes, plus there is a tiny .dynstr
growth (since most of the SONAMEs are already in .dynstr because they are in
DT_NEEDED tags, this usually means just adding the /lib/ld-linux.so.2 string,
i.e. say <= 0x20 bytes). This would mean a 400-byte gap in the read-only
PT_LOAD segment would help for about 50% of applications. The question is
whether this is a reasonable price to pay or not (it would of course be helpful
for dynamically linked binaries only, not for static binaries nor libraries).
Ideas?
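
For reference, the 400-byte figure is just the sum of the three typical upper
bounds quoted above (0x50 + 0x120 + 0x20 = 0x190). A minimal sketch of the
check, with hypothetical names and ignoring any alignment padding between the
sections:

```c
#include <assert.h>

enum
{
  LIBLIST_MAX = 0x50,         /* typical upper bound for .gnu.liblist */
  CONFLICT_MAX = 0x120,       /* typical upper bound for .gnu.conflict */
  DYNSTR_GROWTH_MAX = 0x20    /* extra .dynstr bytes, e.g. one more string */
};

/* Bytes that must be free in the read-only PT_LOAD segment.  */
static unsigned
gap_needed (unsigned liblist, unsigned conflict, unsigned dynstr_growth)
{
  unsigned total = liblist + conflict + dynstr_growth;

  assert (total >= liblist);   /* guard against wraparound */
  return total;
}

/* Nonzero if a reserved gap of `gap` bytes is enough for this binary.  */
static int
fits_in_gap (unsigned gap, unsigned liblist, unsigned conflict,
             unsigned dynstr_growth)
{
  return gap_needed (liblist, conflict, dynstr_growth) <= gap;
}
```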

	Jakub

