This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: Use stable sort for ld -r relocs
- From: Alan Modra <amodra at gmail dot com>
- To: Jiong Wang <jiong dot wang at arm dot com>
- Cc: "binutils at sourceware dot org" <binutils at sourceware dot org>
- Date: Wed, 16 Sep 2015 18:16:31 +0930
- Subject: Re: Use stable sort for ld -r relocs
- Authentication-results: sourceware.org; auth=none
- References: <20150826140258 dot GB24814 at bubble dot grove dot modra dot org> <n99si6fu8kw dot fsf at arm dot com>
On Tue, Sep 15, 2015 at 01:58:08PM +0100, Jiong Wang wrote:
>
> Alan Modra writes:
>
> > A number of targets emit multiple relocs at a given r_offset, and
> > depend on those relocs staying in their original order. qsort doesn't
> > satisfy this requirement, although it appears from my non-rigorous
> > testing that glibc's msort.c implementation may in fact be stable.
> >
> > I made the mistake of backporting my PR 17666 fix to 2.25.1, thinking
> > the code had enough time to settle on mainline, but for anyone with a
> > system libc that provides an unstable qsort this will mean 2.25.1
> > ld -r may be broken on some targets.
> >
> > PR 18867
> > * elflink.c (cmp_ext32l_r_offset, cmp_ext32b_r_offset): Delete.
> > (cmp_ext64l_r_offset, cmp_ext64b_r_offset): Delete.
> > (ext32l_r_offset, ext32b_r_offset, ext64l_r_offset, ext64b_r_offset):
> > New functions.
> > (elf_link_adjust_relocs): Use an insertion sort to sort relocs.
> >
>
> Alan,
>
> After this patch, linking becomes much slower when cross-linking the
> AArch64 kernel on an x86 host.
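As background to the stability requirement described in the quoted mail: one common way to get deterministic order for equal r_offsets out of an unstable qsort is to tie-break on original position. This is a hedged sketch of that alternative, not what the binutils patch does (the patch keeps a stable insertion sort instead):

```c
#include <assert.h>
#include <stdlib.h>

/* Toy record: sort key plus original position.  */
struct tagged
{
  unsigned r_offset;
  size_t orig;
};

/* Compare by r_offset, tie-breaking on original position so that
   records with equal r_offset keep their input order even when the
   platform qsort is unstable.  */
static int
cmp_tagged (const void *a, const void *b)
{
  const struct tagged *x = a, *y = b;
  if (x->r_offset != y->r_offset)
    return x->r_offset < y->r_offset ? -1 : 1;
  return x->orig < y->orig ? -1 : x->orig > y->orig;
}
```

The cost is carrying the extra index alongside each record, which is why it is not an obvious fit for sorting external reloc records in place.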
I'm tempted to leave the code as is. I've long said that it is wrong
for the kernel to use ld -r as a packaging system, and perhaps a
really slow ld -r would finally force a change. :)
Anyway, reproduced. The big slowdown happens with .debug_info, which
has 4367968 relocs. Quite a sizeable run of those is out of order due
to not processing files in order. (See how elf_link_input_bfd is
called in elflink.c, and the testcase in pr17666.)
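Since the out-of-order relocs arrive in runs that are themselves sorted, an insertion sort can move a whole run per memmove instead of one element at a time. A minimal sketch of that idea on plain structs (an illustration, not the BFD code; it adds an explicit internal-order check on the run, which the real code gets from each input file's relocs arriving sorted):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an external reloc: sort key plus an id so the
   order of equal keys can be checked (stability).  */
struct reloc { unsigned r_offset; int id; };

/* Stable insertion sort that moves whole runs: when base[i] is out
   of order, find its insertion point, extend to the run of following
   elements that belong at the same point, and shift once rather than
   once per element.  */
static void
sort_relocs (struct reloc *base, size_t count)
{
  struct reloc buf[64];		/* scratch; the patch mallocs 96k */
  size_t i = 1;
  while (i < count)
    {
      if (base[i].r_offset >= base[i - 1].r_offset)
	{ i++; continue; }
      /* Find insertion point: first loc with base[loc-1] <= base[i].
	 The strict '>' preserves the order of equal keys.  */
      size_t loc = i;
      while (loc > 0 && base[loc - 1].r_offset > base[i].r_offset)
	loc--;
      /* Extend the run while it stays internally ordered and still
	 belongs strictly before base[loc].  */
      size_t run = 1;
      while (i + run < count
	     && run < sizeof buf / sizeof buf[0]
	     && base[i + run].r_offset >= base[i + run - 1].r_offset
	     && base[i + run].r_offset < base[loc].r_offset)
	run++;
      /* Rotate the run into place via the scratch buffer.  */
      memcpy (buf, base + i, run * sizeof *base);
      memmove (base + loc + run, base + loc, (i - loc) * sizeof *base);
      memcpy (base + loc, buf, run * sizeof *base);
      i += run;
    }
}
```

With mostly-sorted input interleaved from a few files, each run of misplaced relocs costs one memmove instead of one per reloc, which is where the speedup comes from.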
I'll commit the following after a bit more testing.
PR 18867
* elflink.c (elf_link_adjust_relocs): Modify insertion sort to
insert a run. Return status in case of malloc failure.
Adjust callers.
diff --git a/bfd/elflink.c b/bfd/elflink.c
index 6027d35..faa50bf 100644
--- a/bfd/elflink.c
+++ b/bfd/elflink.c
@@ -8201,7 +8201,7 @@ ext64b_r_offset (const void *p)
referenced must be updated. Update all the relocations found in
RELDATA. */
-static void
+static bfd_boolean
elf_link_adjust_relocs (bfd *abfd,
struct bfd_elf_section_reloc_data *reldata,
bfd_boolean sort)
@@ -8267,7 +8267,7 @@ elf_link_adjust_relocs (bfd *abfd,
bfd_vma r_off;
size_t elt_size;
bfd_byte *base, *end, *p, *loc;
- bfd_byte buf[sizeof (Elf64_External_Rela)];
+ bfd_byte *buf = NULL;
if (bed->s->arch_size == 32)
{
@@ -8290,12 +8290,12 @@ elf_link_adjust_relocs (bfd *abfd,
abort ();
}
- /* Must use a stable sort here. Insertion sort, since the
- relocs are mostly sorted already. */
+ /* Must use a stable sort here. A modified insertion sort,
+ since the relocs are mostly sorted already. */
elt_size = reldata->hdr->sh_entsize;
base = reldata->hdr->contents;
end = base + count * elt_size;
- if (elt_size > sizeof (buf))
+ if (elt_size > sizeof (Elf64_External_Rela))
abort ();
/* Ensure the first element is lowest. This acts as a sentinel,
@@ -8315,12 +8315,13 @@ elf_link_adjust_relocs (bfd *abfd,
/* Don't just swap *base and *loc as that changes the order
of the original base[0] and base[1] if they happen to
have the same r_offset. */
- memcpy (buf, loc, elt_size);
+ bfd_byte onebuf[sizeof (Elf64_External_Rela)];
+ memcpy (onebuf, loc, elt_size);
memmove (base + elt_size, base, loc - base);
- memcpy (base, buf, elt_size);
+ memcpy (base, onebuf, elt_size);
}
- for (p = base + elt_size; (p += elt_size) < end; )
+ for (p = base + elt_size; p < end; )
{
/* base to p is sorted, *p is next to insert. */
r_off = (*ext_r_off) (p);
@@ -8331,15 +8332,48 @@ elf_link_adjust_relocs (bfd *abfd,
loc += elt_size;
if (loc != p)
{
- memcpy (buf, p, elt_size);
- memmove (loc + elt_size, loc, p - loc);
- memcpy (loc, buf, elt_size);
+ /* Chances are there is a run of relocs to insert here,
+ from one or more input files. Files are not always
+ linked in order due to the way elf_link_input_bfd is
+ called. See pr17666. */
+ size_t sortlen = p - loc;
+ bfd_vma r_off2 = (*ext_r_off) (loc);
+ size_t runlen = elt_size;
+ size_t buf_size = 96 * 1024;
+ while (p + runlen < end
+ && (sortlen <= buf_size
+ || runlen + elt_size <= buf_size)
+ && r_off2 > (*ext_r_off) (p + runlen))
+ runlen += elt_size;
+ if (buf == NULL)
+ {
+ buf = bfd_malloc (buf_size);
+ if (buf == NULL)
+ return FALSE;
+ }
+ if (runlen < sortlen)
+ {
+ memcpy (buf, p, runlen);
+ memmove (loc + runlen, loc, sortlen);
+ memcpy (loc, buf, runlen);
+ }
+ else
+ {
+ memcpy (buf, loc, sortlen);
+ memmove (loc, p, runlen);
+ memcpy (loc + runlen, buf, sortlen);
+ }
+ p += runlen;
}
+ else
+ p += elt_size;
}
/* Hashes are no longer valid. */
free (reldata->hashes);
reldata->hashes = NULL;
+ free (buf);
}
+ return TRUE;
}
struct elf_link_sort_rela
@@ -11585,10 +11619,12 @@ bfd_elf_final_link (bfd *abfd, struct bfd_link_info *info)
continue;
sort = bed->sort_relocs_p == NULL || (*bed->sort_relocs_p) (o);
- if (esdo->rel.hdr != NULL)
- elf_link_adjust_relocs (abfd, &esdo->rel, sort);
- if (esdo->rela.hdr != NULL)
- elf_link_adjust_relocs (abfd, &esdo->rela, sort);
+ if (esdo->rel.hdr != NULL
+ && !elf_link_adjust_relocs (abfd, &esdo->rel, sort))
+ return FALSE;
+ if (esdo->rela.hdr != NULL
+ && !elf_link_adjust_relocs (abfd, &esdo->rela, sort))
+ return FALSE;
/* Set the reloc_count field to 0 to prevent write_relocs from
trying to swap the relocs out itself. */
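A side note on the sentinel arrangement visible in the patch context above ("Ensure the first element is lowest. This acts as a sentinel"): once the minimum sits at base[0], the inner insertion scan needs no lower-bound check. A minimal sketch of just that trick on ints, assuming nothing beyond the idea itself:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insertion sort with a sentinel: rotate the minimum to a[0] first
   (preserving the order of the elements moved up, as the patch does
   with memmove), then the scan loop can drop the loc > 0 test
   because the comparison always fails before loc walks off the
   front.  */
static void
insertion_sort_sentinel (int *a, size_t n)
{
  if (n < 2)
    return;
  size_t min = 0;
  for (size_t i = 1; i < n; i++)
    if (a[i] < a[min])
      min = i;
  int tmp = a[min];
  memmove (a + 1, a, min * sizeof *a);
  a[0] = tmp;
  for (size_t i = 2; i < n; i++)
    {
      int v = a[i];
      size_t loc = i;
      while (a[loc - 1] > v)	/* sentinel a[0] stops this scan */
	{
	  a[loc] = a[loc - 1];
	  loc--;
	}
      a[loc] = v;
    }
}
```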
--
Alan Modra
Australia Development Lab, IBM