This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
[PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC
- From: "Maciej W. Rozycki" <macro at imgtec dot com>
- To: Alan Modra <amodra at gmail dot com>
- Cc: Nick Clifton <nickc at redhat dot com>, <binutils at sourceware dot org>
- Date: Mon, 27 Mar 2017 12:39:07 +0100
- Subject: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC
- Authentication-results: sourceware.org; auth=none
Complement commit 902e9fc76a0e ("PR ld/20828: Move symbol version
processing ahead of GC symbol sweep"), commit b531344c34b0 ("PR
ld/20828: Reorder the symbol sweep stage of section GC") and commit
81ff47b3a546 ("PR ld/20828: Fix linker script symbols wrongly forced
local with section GC"), and prevent symbols forcibly entered in the
output file with the use of the `--undefined=' or `--require-defined='
linker command line options or the EXTERN linker script command from
being swept in section garbage collection and consequently recorded in
the dynamic symbol table as local entries. This happens in certain
circumstances, where a symbol reference also exists in one of the static
input files, however only in a section which is garbage-collected and
does not make it to the output file, and the symbol is defined in a
dynamic object present in the link.
For example with the `i386-linux' target and the `pr21233.s' and
`pr21233-l.s' sources, and the `pr21233.ld' linker script included with
this change we get:
$ as -o pr21233-l.o pr21233-l.s
$ ld -shared -T pr21233.ld -o libpr21233.so pr21233-l.o
$ as -o pr21233.o pr21233.s
$ ld --gc-sections -e foo --require-defined=bar -T pr21233.ld -o pr21233 pr21233.o libpr21233.so
$ readelf --dyn-syms pr21233
Symbol table '.dynsym' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 OBJECT LOCAL DEFAULT UND bar
$
which makes the run-time `bar' dependency of the `pr21233' executable
different from its corresponding link-time dependency, i.e. the presence
of `libpr21233.so' and its `bar' symbol is required at the link time,
however at the run time a copy of `libpr21233.so' without `bar' will do.
Similarly with `--undefined=' and EXTERN which do not actually require
the reference to the symbol requested to be satisfied with a definition
at the link time, however once the definition has been pulled at the
link time, so it should at the dynamic load time.
Additionally with the `mips-linux' target we get:
$ ld --gc-sections -e foo --require-defined=bar -T pr21233.ld -o pr21233 pr21233.o libpr21233.so
ld: BFD (GNU Binutils) 2.28.51.20170324 assertion fail .../bfd/elfxx-mips.c:3861
$
as the target is not prepared to handle such a local dynamic symbol.
With this change in effect we get:
$ readelf --dyn-syms pr21233
Symbol table '.dynsym' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 OBJECT GLOBAL DEFAULT UND bar
$
instead, for both targets.
ld/
PR ld/21233
* ldlang.c (insert_undefined): Set `mark' for ELF symbols.
* testsuite/ld-elf/pr21233.sd: New test.
* testsuite/ld-elf/pr21233-l.sd: New test.
* testsuite/ld-elf/pr21233.ld: New test linker script.
* testsuite/ld-elf/pr21233-e.ld: New test linker script.
* testsuite/ld-elf/pr21233.s: New test source.
* testsuite/ld-elf/pr21233-l.s: New test source.
* testsuite/ld-elf/shared.exp: Run the new tests.
---
NB I haven't checked if a dynamic executable with an undefined unused
local symbol is going to succeed or fail execution with the dynamic
loader, but in any case I see our current semantics as inconsistent.
And as I previously noted in the course of addressing PR ld/20828 in any
case there's no point in having a local symbol (other than perhaps for
special cases some psABIs may define) in the dynamic symbol table. This
brings us with two potential solutions: either retaining the global scope
for the symbol or removing the symbol altogether. I think the former
approach is more consistent with the definitions of the command-line
options and the linker script command concerned, which is why I chose it
over the latter.
There have been no regressions across my usual targets with this change
applied. The new test cases do however fail across a number of targets.
The cause is their handling of copy relocations. As `bar' is a dynamic
data symbol, space in the `.dynbss' section is allocated and a copy
relocation produced. As `.dynbss' is discarded from output the relocation
is lost, and the local copy of the `bar' symbol converted to an absolute
symbol, e.g. with `m68k-linux':
$ readelf --dyn-syms pr21233
Symbol table '.dynsym' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 1 OBJECT GLOBAL DEFAULT ABS bar
$
(NB this variant of `bar' is similarly entered as a local symbol if the
executable is linked without the change discussed here). The exact list
of targets affected I have identified is as follows:
aarch64-linux
am33_2.0-linux
arc-linux
arm-eabi
arm-linuxeabi
arm-netbsdelf
crisv32-linux
m68k-elf
m68k-linux
mn10300-elf
nios2-linux
vax-linux
vax-netbsdelf
Additionally `bfin-elf' and `bfin-uclinux' fail with:
ld: the bfin target does not currently support the generation of copy relocations
ld: failed to set dynamic section sizes: File format not recognized
I have identified the cause of this phenomenon to be the reverse order
`elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
targets, causing `->non_got_ref' to be set for the symbol even though the
reference is later discarded. The possibility to change the order has
been introduced with commit d968975277ba ("Check ELF relocs after opening
all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
which indicates the intent to gradually swap the order for all targets.
After the initial change for x86 this has only been since done for SH (cf
PR ld/17739), i.e. I gather we are still in the middle of the move.
Which brings me a question to our general maintainers: which of the
following 3 options shall I pick for the purpose of this test case:
1. Leave the new failures as they are and let maintainers handle them as
they find need or time; there may be more to be done for individual
targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.
2. File a PR referring to commit d968975277ba and its discussion and KFAIL
the affected test cases for the problematic targets.
3. Tweak the tests to accept absolute symbols (and exclude `bfin-*-*' from
testing as a hopeless case). I'd rather avoid this option as I see
such an outcome as invalid, defeating the purpose of the command-line
options and the linker script command concerned.
NB I have verified that using `-z nocopyreloc' does not change anything
for most of these problematic targets, so this is not a viable workaround.
Maciej
binutils-ld-insert-undefined-mark.diff
Index: binutils/ld/ldlang.c
===================================================================
--- binutils.orig/ld/ldlang.c 2017-03-22 15:26:43.651680031 +0000
+++ binutils/ld/ldlang.c 2017-03-22 18:44:47.798267237 +0000
@@ -3429,6 +3429,8 @@ insert_undefined (const char *name)
{
h->type = bfd_link_hash_undefined;
h->u.undef.abfd = NULL;
+ if (is_elf_hash_table (link_info.hash))
+ ((struct elf_link_hash_entry *) h)->mark = 1;
bfd_link_add_undef (link_info.hash, h);
}
}
Index: binutils/ld/testsuite/ld-elf/pr21233-e.ld
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-e.ld 2017-03-22 18:44:54.976834149 +0000
@@ -0,0 +1,2 @@
+EXTERN (bar)
+INCLUDE pr21233.ld
Index: binutils/ld/testsuite/ld-elf/pr21233-l.s
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-l.s 2017-03-22 18:44:54.994117791 +0000
@@ -0,0 +1,6 @@
+ .data
+ .globl bar
+ .type bar, %object
+bar:
+ .byte 1
+ .size bar, . - bar
Index: binutils/ld/testsuite/ld-elf/pr21233-l.sd
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-l.sd 2017-03-22 19:18:22.818303465 +0000
@@ -0,0 +1,6 @@
+# Make sure global `bar' is present in the dynamic symbol table, e.g.:
+# Num: Value Size Type Bind Vis Ndx Name
+# 1: 00000000 1 OBJECT GLOBAL DEFAULT 5 bar
+#...
+ *[0-9]+: +[0-9a-f]+ +1 +OBJECT +GLOBAL +DEFAULT +[0-9]+ +bar
+#pass
Index: binutils/ld/testsuite/ld-elf/pr21233.ld
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.ld 2017-03-22 18:44:55.060208515 +0000
@@ -0,0 +1,17 @@
+SECTIONS
+{
+ .hash : { *(.hash) }
+ .dynsym : { *(.dynsym) }
+ .dynstr : { *(.dynstr) }
+ .rel.dyn : { *(.rel.dyn) }
+ .text : { *(.text) }
+ .dynamic : { *(.dynamic) }
+ .data : { *(.data) }
+ .symtab : { *(.symtab) }
+ .strtab : { *(.strtab) }
+ .shstrtab : { *(.shstrtab) }
+ .plt : { *(.plt) }
+ .got.plt : { *(.got.plt) }
+ .got : { *(.got) }
+ /DISCARD/ : { *(*) }
+}
Index: binutils/ld/testsuite/ld-elf/pr21233.s
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.s 2017-03-22 18:44:55.080529486 +0000
@@ -0,0 +1,8 @@
+ .text
+ .globl foo
+ .type foo, %function
+foo:
+ .size foo, . - foo
+
+ .data
+ .dc.a bar
Index: binutils/ld/testsuite/ld-elf/pr21233.sd
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.sd 2017-03-22 18:44:55.092782869 +0000
@@ -0,0 +1,9 @@
+# Make sure the `bar' reference is global rather than local
+# in the dynamic symbol table, e.g.:
+# Num: Value Size Type Bind Vis Ndx Name
+# 1: 00000000 0 OBJECT GLOBAL DEFAULT UND bar
+# vs:
+# 1: 00000000 0 OBJECT LOCAL DEFAULT UND bar
+#...
+ *[0-9]+: +[0-9a-f]+ +0 +OBJECT +GLOBAL +DEFAULT +UND +bar
+#pass
Index: binutils/ld/testsuite/ld-elf/shared.exp
===================================================================
--- binutils.orig/ld/testsuite/ld-elf/shared.exp 2017-03-22 15:26:44.259343252 +0000
+++ binutils/ld/testsuite/ld-elf/shared.exp 2017-03-22 18:54:52.902130284 +0000
@@ -115,6 +115,45 @@ if { [check_gc_sections_available] } {
{{objdump -p pr20828-v.od}} \
"pr20828-v-2"]]
}
+# PR ld/21233 check for correct dynamic symbol table entries where:
+# - a symbol has been defined in a shared library used in the link,
+# - the symbol has been referenced from a section swept in garbage collection,
+# - the symbol has also been forced to be entered in the output file as an
+# undefined symbol, either with a command-line option or a linker script
+# command.
+# Verify that the undefined symbol is global rather than local.
+if { [check_gc_sections_available] } {
+ run_ld_link_tests [list \
+ [list \
+ "PR ld/21233 dynamic symbols with section GC\
+ (auxiliary shared library)" \
+ "$LFLAGS -shared -T pr21233.ld" "" "$AFLAGS_PIC" \
+ {pr21233-l.s} \
+ {{readelf --dyn-syms pr21233-l.sd}} \
+ "libpr21233.so"] \
+ [list \
+ "PR ld/21233 dynamic symbols with section GC (--undefined)" \
+ "$LFLAGS --gc-sections -e foo --undefined=bar -T pr21233.ld" \
+ "tmpdir/libpr21233.so" "" \
+ {pr21233.s} \
+ {{readelf --dyn-syms pr21233.sd}} \
+ "pr21233-1"] \
+ [list \
+ "PR ld/21233 dynamic symbols with section GC (--require-defined)" \
+ "$LFLAGS --gc-sections -e foo --require-defined=bar\
+ -T pr21233.ld" \
+ "tmpdir/libpr21233.so" "" \
+ {pr21233.s} \
+ {{readelf --dyn-syms pr21233.sd}} \
+ "pr21233-2"] \
+ [list \
+ "PR ld/21233 dynamic symbols with section GC (EXTERN)" \
+ "$LFLAGS --gc-sections -e foo -T pr21233-e.ld" \
+ "tmpdir/libpr21233.so" "" \
+ {pr21233.s} \
+ {{readelf --dyn-syms pr21233.sd}} \
+ "pr21233-3"]]
+}
# Check to see if the C compiler works
if { [which $CC] == 0 } {