This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Bfd/binutils inconsistency re: *UND* and *ABS* and relocs against dummy (STT_NOTYPE) symbol entries in elf symbol tables!?


    Hi binutils team,

  This is a bit of a long one, so I'll try to sum it up briefly right here
at the start, in case someone recognizes the situation from somewhere it
has cropped up before (I couldn't find it in the ml archives, though).

    Summary:
    ========

 -  ELF symbol tables all begin with a dummy entry, followed by a symbol for
the filename, followed by section symbols, and then general symbols.  (I
don't know if this format is mandated or merely conventional or even just
coincidental.)

 -  ELF relocs are based against symbols, and contain an index into the ELF
symbol table.

 -  It's ambiguous what to do when you encounter a reloc against symbol #0,
which is (at least on-disk) the dummy entry of type STT_NOTYPE.

 -  Sometimes (objdump) this dummy entry is discarded when the symbol table
is read; then a reloc against symbol #0 actually refers to the first real
entry in the symbol table, which corresponds to the filename symbol.

 -  Sometimes (readelf, ld), the dummy entry is kept in the table; then a
reloc against symbol #0 is treated as against none/invalid symbol.

    Issues:
    =======

 -  The code that is emitting these relocs against symbol #0 *thinks* it's
emitting a reloc against the *ABS* section.  As bfd appears to use the
filename symbol internally to represent the *ABS* section, and the filename
symbol is always zeroth in the table (once the dummy STT_NOTYPE symbol has
been discarded), this works, but I don't know whether that's just a
coincidence.  I.e., is basing a reloc against the file symbol the deliberate
and approved way of issuing a reloc against an *ABS* target?  Or is it just
a private convention of bfd, or a quirk of the particular backend
(dlx-elf32) that I'm working on, such that emitting relocs against it does
*work*, but you aren't actually supposed to according to the ELF standard?
I've found comments in other target backends which say things like
	  /* r_symndx will be zero only for relocs against symbols
	     from removed linkonce sections, or sections discarded by
	     a linker script.  */
which make me wonder if the assembler is issuing these relocs incorrectly in
the first place.

 -  XXXX_relocate_section gets really confused by this whole situation,
because it's generally written like this:

	[...loop across all relocs....]
      r_symndx = ELF32_R_SYM (rel->r_info);
      if (r_symndx < symtab_hdr->sh_info)
	{
	  sym = local_syms + r_symndx;
	  sec = local_sections[r_symndx];
	  sym_name = bfd_elf_local_sym_name (input_bfd, sym);

	  relocation = _bfd_elf_rela_local_sym (output_bfd, sym, sec, rel);
	}
	[...else find out some other way...]

and if ELF32_R_SYM returns zero, sym ends up pointing to the dummy
STT_NOTYPE entry and sec to the *UND* section.  That's not right.
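
One way to picture the missing case is a three-way split on the symbol
index, made before anything dereferences local_syms.  The sketch below is a
toy model, not bfd code: classify_symndx, the enum and the first_global
parameter are names made up for illustration, with first_global standing in
for symtab_hdr->sh_info.  It assumes the generic ELF ABI's reading of index
0 (STN_UNDEF), under which a reloc against that index uses 0 as the symbol
value.

```c
#include <stdint.h>

/* Reserved "undefined symbol" index from the generic ELF ABI (STN_UNDEF).
   Given a local name here so the sketch does not depend on <elf.h>.  */
#define SYMNDX_UNDEF 0u

/* Hypothetical classification, for illustration only.  */
enum symndx_kind
{
  SYMNDX_NULL,    /* index 0: the all-zero dummy entry */
  SYMNDX_LOCAL,   /* 1 .. first_global-1: local symbols */
  SYMNDX_GLOBAL   /* first_global and up: global symbols */
};

/* first_global plays the role of symtab_hdr->sh_info, which is one greater
   than the index of the last local symbol.  */
static inline enum symndx_kind
classify_symndx (uint32_t r_symndx, uint32_t first_global)
{
  if (r_symndx == SYMNDX_UNDEF)
    return SYMNDX_NULL;
  if (r_symndx < first_global)
    return SYMNDX_LOCAL;
  return SYMNDX_GLOBAL;
}
```

With a plain "r_symndx < sh_info" test, index 0 falls into the local branch,
which is how sym ends up pointing at the dummy entry.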

    Details:
    ========

  Right at the start of every ELF symbol table is an entry that contains
all zeros and is generally ignored.  (The file symbol and then the section
symbols generally come next in the table.)  For example, here's the output
of some printfs I added to bfd_elf_get_elf_syms to show the actual values of
the st_XXX fields as ELF external symbols are read from a disk file:

GET 56 SYMS at 0 for /artimi/firmware/build/pci-dlx/mac/monitor/monitor.o
[$0a0453c8]isym#0: val 0 size 0 name 0 info 0 other 0 shndx 0
[$0a0453dc]isym#1: val 0 size 0 name 1 info 4 other 0 shndx 65521
[$0a0453f0]isym#2: val 0 size 0 name 0 info 3 other 0 shndx 1
[$0a045404]isym#3: val 0 size 0 name 0 info 3 other 0 shndx 3
[$0a045418]isym#4: val 0 size 0 name 0 info 3 other 0 shndx 4
[$0a04542c]isym#5: val 0 size 0 name 0 info 3 other 0 shndx 5
[$0a045440]isym#6: val 0 size 0 name 0 info 3 other 0 shndx 6
[$0a045454]isym#7: val 0 size 160 name 11 info 1 other 0 shndx 6
[$0a045468]isym#8: val 1084 size 20 name 21 info 2 other 0 shndx 1
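
For reading the dump above: the "info" byte packs the binding into its high
four bits and the type into its low four bits, and shndx 65521 is 0xfff1,
the reserved index for absolute symbols (SHN_ABS).  A minimal sketch of the
decoding follows; the helper names are made up for illustration and the
numeric type codes are those of the generic ELF ABI.

```c
#include <stdint.h>

/* Reserved section index for absolute symbols (SHN_ABS in the ELF ABI).  */
#define SHNDX_ABS 0xfff1u   /* 65521 decimal */

/* st_info layout: binding in the high nibble, type in the low nibble.  */
static inline unsigned sym_bind (uint8_t st_info) { return st_info >> 4; }
static inline unsigned sym_type (uint8_t st_info) { return st_info & 0xf; }

/* Names for the symbol types that appear in the dump.  */
static inline const char *
sym_type_name (unsigned type)
{
  switch (type)
    {
    case 0: return "NOTYPE";
    case 1: return "OBJECT";
    case 2: return "FUNC";
    case 3: return "SECTION";
    case 4: return "FILE";
    default: return "other";
    }
}

/* True if the symbol's section index marks it as absolute.  */
static inline int
sym_is_abs (uint16_t st_shndx)
{
  return st_shndx == SHNDX_ABS;
}
```

So isym#0 (info 0, shndx 0) is a local NOTYPE symbol in the undefined
section, isym#1 (info 4, shndx 65521) is the local FILE symbol in *ABS*,
isym#2 to isym#6 (info 3) are SECTION symbols, and isym#7 and isym#8 are a
local OBJECT and a local FUNC respectively.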

  Right.  Now relocs are based against a symbol, and I've been having
trouble with one kind in particular.  The relocs in question contain what
are supposed to be absolute addresses, i.e. based in the *ABS* section.
Here's an example:

READ RELOCS For monitor.o section 27
[$0a084e58]Rel#0: offset $000c, info 1031, addend 0
[$0a084e64]Rel#1: offset $0010, info 9734, addend 0
[$0a084e70]Rel#2: offset $0014, info 9735, addend 0
[....snipped lots more perfectly reasonable looking relocs...]
[$0a0853bc]Rel#115: offset $07c4, info 9, addend 0
[....snipped lots more perfectly reasonable looking relocs...]

(This output was generated by printf statements in
elf_link_read_relocs_from_section and again, simply dumps the r_XXXX fields
to stdout.)

The problem with Rel#115 above is that the symbol index it is issued
against (the high 24 bits of the info field) is #0, which corresponds to the
dummy null symbol at the start of the table.
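
To spell out the arithmetic: in a 32-bit ELF reloc the info field carries
the symbol index in its high 24 bits and the reloc type in its low 8 bits.
The sketch below mirrors what the ELF32_R_SYM and ELF32_R_TYPE macros
compute; the lowercase function names are made up here so that the example
stands alone.

```c
#include <stdint.h>

/* Symbol index: high 24 bits of r_info (what ELF32_R_SYM extracts).  */
static inline uint32_t
elf32_r_sym (uint32_t r_info)
{
  return r_info >> 8;
}

/* Reloc type: low 8 bits of r_info (what ELF32_R_TYPE extracts).  */
static inline uint32_t
elf32_r_type (uint32_t r_info)
{
  return r_info & 0xff;
}
```

Applied to the dump: info 1031 (0x407) is type 7 against symbol #4, info
9734 (0x2606) is type 6 against symbol #38, and info 9 is type 9 against
symbol #0.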

Now I'm having conceptual problems with a reloc that is issued against
symbol #0.  And it seems I'm not the only one, since objdump and readelf
disagree about the nature of this particular reloc.

In objdump, the first few entries in the symbol table are output like this:

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 monitor.c
00000000 l    d  .text  00000000
00000000 l    d  .data  00000000
00000000 l    d  .bss   00000000
00000000 l    d  .rodata.str1.1 00000000
00000000 l    d  .rodata        00000000
00000000 l     O .rodata        000000a0 _commands
0000043c l     F .text  00000014 _CmdBoot

Note that the file symbol comes first, followed by the section symbols.  The
dummy symbol is nowhere to be seen, and indeed, that's because in objdump,
bfd_elf_get_elf_syms is called from elf_slurp_symbol_table, which quite
deliberately discards the first symbol table entry when it goes to build
canonical asymbol structs from the Elf_Internal_Sym structs that it
generates from the data it reads from disk, as shown by the comment:

	      /* Skip first symbol, which is a null dummy.  */

The consequence of that is that when objdump comes to display the reloc
details, it shows us this nice output:

000007c4 R_DLX_RELOC_26_PCREL  *ABS*

and again, in the object code disassembly I see:

 7c4:   58 d8 00 0c     jal     e024 <_SkipSpace+0xd800>
                        7c4: R_DLX_RELOC_26_PCREL       *ABS*

which looks good (0xe024 is the actual destination of the jump; we're using
Rel, not Rela relocs in this target).

  readelf, on the other hand, doesn't discard the dummy entry.  It treats a
symbol # of 0 as referring to the *UND* section, rather than
semi-accidentally finding it pointing at the file symbol.  So it dumps the
start of the symbol table like this:

Symbol table '.symtab' contains 56 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS monitor.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000000     0 SECTION LOCAL  DEFAULT    6
     7: 00000000   160 OBJECT  LOCAL  DEFAULT    6 _commands

and describes the same reloc like this:

Relocation section '.rel.text' at offset 0x1390 contains 122 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000007c4  00000009 R_DLX_RELOC_26_PC

    Question:  (at last!)
    ========= 

  So who's right?  objdump/gas or readelf/ld?  IOW, how should you really
issue a reloc based against an absolute address?  For the moment, I have
implemented a crude workaround in my version of the linker that says

        if (!r_symndx)
          r_symndx = 1;

which seems to be making the linker link correctly, but should I be making
the assembler issue the reloc against symbol #1 (the file symbol) in the
first place?  IOW, what it comes down to is the identification between the
(notional) ABS section and the file name symbol: is this a guaranteed
identity?

    cheers, 
      DaveK
-- 
Can't think of a witty .sigline today....

