This is the mail archive of the archer@sourceware.org mailing list for the Archer project.


Re: Cross-CU C++ DIE references vs. mangling

From: Sami Wagiaalla <swagiaal at redhat dot com>
To: Roland McGrath <roland at redhat dot com>
Cc: Jan Kratochvil <jan dot kratochvil at redhat dot com>, archer at sourceware dot org, Keith Seitz <keiths at redhat dot com>
Date: Mon, 12 Apr 2010 14:46:48 -0400
Subject: Re: Cross-CU C++ DIE references vs. mangling
References: <20100310191833.GA2816@host0.dyn.jankratochvil.net> <20100310193207.GA6147@host0.dyn.jankratochvil.net> <20100311060305.B177A7D5E@magilla.sf.frob.com>

So after a few (really, many) reads of this email I think I can
summarize the issues and solutions discussed there. I just want to
make sure I understand the issue properly before filing a gcc
feature request. So, is this a correct summary?

The goal is to help gdb find the proper location for variables whose
declarations and definitions are separated across CUs or DSOs.

Why can't gdb do this by itself? Because:

- It requires a search of all other CUs/DSOs to locate the definition,
  which is inefficient but also inaccurate, since

- The scope of the declaration can be different from that of the
  definition (e.g. class members). If DW_AT_MIPS_linkage_name is
  available it can be used to resolve this, however

- If the definition is in a stripped DSO, there is indeed a definition
  (in ELF), but nowhere is there a DW_AT_location pointing to it. Also,

- It is possible to have two names defined in two separate DSOs with the
  same linkage name, e.g.:

> Consider:
> 
> 	$ g++ -g -c -fPIC -o foo1.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { int i; };')
> 	$ g++ -g -c -fPIC -o foo2.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { extern int i; }; int foo () { return internal::i; }')
> 	$ gcc -g -shared -o foo.so foo1.o foo2.o
> 	$ g++ -g -c -fPIC -o bar1.o -xc++ <(echo 'namespace internal { int i; };')
> 	$ g++ -g -c -fPIC -o bar2.o -xc++ <(echo 'namespace internal { extern int i; }; int bar () { return internal::i; }')
> 	$ gcc -g -shared -o bar.so bar1.o bar2.o
> 	$ eu-readelf -sr -winfo foo.so bar.so
> 
> Now imagine a program linking in both foo.so and bar.so.  There are
> two different things that are both separate but equal and both truly
> internal::i and both truly _ZN8internal1iE.  By any method, there is
> no one answer to, "What is internal::i?"  The only answers are
> context-specific.
> 
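
A minimal sketch of why a name-only lookup cannot settle this, assuming
a debugger keeps some index from linkage names to definitions. The
Definition struct, the index type, the helper functions and the
addresses are all hypothetical and only illustrate the ambiguity; none
of this is gdb code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one definition of a linkage name inside one DSO.
struct Definition {
    std::string dso;        // which shared object defines it
    std::uint64_t address;  // made-up load address, for illustration only
};

// Hypothetical index: a linkage name may map to several definitions.
using DefinitionIndex = std::multimap<std::string, Definition>;

// Name-only lookup: returns every definition carrying this linkage name.
std::vector<Definition> find_by_linkage_name(const DefinitionIndex& index,
                                             const std::string& linkage_name) {
    std::vector<Definition> matches;
    auto range = index.equal_range(linkage_name);
    for (auto it = range.first; it != range.second; ++it)
        matches.push_back(it->second);
    return matches;
}

// Context-specific lookup: the DSO of the referencing CU picks the answer.
// Returns nullptr when that DSO has no definition of the name.
const Definition* find_in_dso(const DefinitionIndex& index,
                              const std::string& linkage_name,
                              const std::string& dso) {
    auto range = index.equal_range(linkage_name);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second.dso == dso) return &it->second;
    return nullptr;
}

// The foo.so / bar.so situation from the quoted example.
DefinitionIndex example_index() {
    DefinitionIndex index;
    index.emplace("_ZN8internal1iE", Definition{"foo.so", 0x1000});
    index.emplace("_ZN8internal1iE", Definition{"bar.so", 0x2000});
    assert(index.count("_ZN8internal1iE") == 2);  // same name, two objects
    return index;
}
```

With this index, find_by_linkage_name returns two candidates for
_ZN8internal1iE, while find_in_dso returns exactly one once the DSO of
the referencing CU is known.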

Proposed solution:

Teach the compiler to generate a DW_AT_location for a non-defining
declaration that is applicable in that DIE's scope. That location
expression would parallel the assembly generated for accesses to the
symbol:

> The key is that you can have the same(ish) relocs using the same
> symbols in the code and DWARF as assembled.  Then whatever happens
> in linking stages later should be the same[...]

So,

> For non-PIC code, the actual code looks like:
> 
> 	movl	_ZN8internal1iE(%rip), %eax
> 
> and the DWARF bit could look like:
> 
> 	.byte DW_OP_addr
> 	.quad _ZN8internal1iE
> 
[...]
> These get resolved at link time to absolute addresses, et voila.

And,

> In a PIC access, what the final code will actually do is not really
> related to anything about ELF symbols.  It's just memory indirection.
> The PIC code is:
> 
> 	movq	_ZN8internal1iE@GOTPCREL(%rip), %rax
> 	movl	(%rax), %eax
> 
[...]
> 	.byte DW_OP_addr
> 	.quad _ZN8internal1iE@GOT
> 	.byte DW_OP_deref
> 
> This generates R_X86_64_GOT64.  At link time, this too goes away and
> becomes the "absolute" address of the .got slot.  
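
A minimal sketch of how a consumer might evaluate the two location
expressions quoted above, once the linker has resolved the address
operands. The opcode values for DW_OP_addr and DW_OP_deref come from
the DWARF standard; the Memory map, eval_location and make_expr are
hypothetical helpers, and the sketch assumes 8-byte host-endian
operands and handles only these two opcodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <stdexcept>
#include <vector>

// Opcode values as assigned by the DWARF standard.
constexpr std::uint8_t DW_OP_addr  = 0x03;
constexpr std::uint8_t DW_OP_deref = 0x06;

// Toy model of inferior memory: address -> 64-bit word stored there.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// Evaluate an expression built from DW_OP_addr and DW_OP_deref only and
// return the top of the stack, i.e. the address of the variable.
std::uint64_t eval_location(const std::vector<std::uint8_t>& expr,
                            const Memory& mem) {
    std::vector<std::uint64_t> stack;
    std::size_t pc = 0;
    while (pc < expr.size()) {
        const std::uint8_t op = expr[pc++];
        if (op == DW_OP_addr) {
            if (pc + 8 > expr.size())
                throw std::runtime_error("truncated DW_OP_addr operand");
            std::uint64_t addr = 0;
            std::memcpy(&addr, &expr[pc], 8);  // already-relocated operand
            pc += 8;
            stack.push_back(addr);
        } else if (op == DW_OP_deref) {
            if (stack.empty())
                throw std::runtime_error("DW_OP_deref on empty stack");
            // Replace the address on top with the word stored at it,
            // e.g. read the .got slot to obtain the variable's address.
            stack.back() = mem.at(stack.back());
        } else {
            throw std::runtime_error("opcode not handled in this sketch");
        }
    }
    assert(!stack.empty());  // an empty expression has no location here
    return stack.back();
}

// Build "DW_OP_addr <addr>", optionally followed by DW_OP_deref.
std::vector<std::uint8_t> make_expr(std::uint64_t addr, bool deref) {
    std::vector<std::uint8_t> expr(9);
    expr[0] = DW_OP_addr;
    std::memcpy(&expr[1], &addr, 8);
    if (deref) expr.push_back(DW_OP_deref);
    return expr;
}
```

In the non-PIC form the result is the operand itself. In the PIC form
the operand is the address of the .got slot and the result is whatever
address the dynamic linker stored in that slot, which is why no symbol
lookup by name is needed.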

The following part I don't quite understand:

> We could certainly teach GCC to do this.
> It would then be telling us more pieces of direct truth about the code.
> Would that not be the best thing ever?
> Well, almost.
> 
> First, what about a defining declaration in a PIC CU?  
> 
> In the abstract, a defining declaration can be considered as talking
> about two different things.  One is its declarationhood, wherein it
> says that the containing scope has this name visible.  For that
> purpose, it could reasonably be expected to be like a non-defining
> declaration: say how code in this scope accesses the variable--the
> truth about what's in the assembly code for any accesses in that CU.
> But the other thing is its definitionhood, wherein it says what data
> address contains the data cell and thus (optionally) implies what
> object file position holds the initializer image--another truth about
> what's in the assembly code for the definition in this CU.
> 
> In non-PIC code, these two truths match.  Both use direct address
> constants (as relocated at link time).  But in PIC code, the truth
> about the definition is an address constant, while the truth about the
> access is an indirection through .got.  (If you have PIC code that
> uses __attribute__((visibility("hidden"))) then it's direct access,
> though PC-relative, and thus "non-PIC" ("absolute") for DWARF
> purposes, so both truths match as in truly non-PIC code.)
> 
> Personally, I would be all for having it both ways.  In a CU where a
> defining declaration is actually used by PIC accesses, then you could
> generate a second non-defining declaration (even for C).  Give it
> DW_AT_artificial, DW_AT_declaration, DW_AT_specification pointing to
> the defining declaration (in lieu of DW_AT_name, DW_AT_type, et al),
> and then DW_AT_location with the PIC style using indirection.
> 
> With that, you could know that if you got a DW_AT_location from any
> DIE with DW_AT_declaration then you're done and have the real truth
> for accesses.  If we presume no CUs from pre-apocalyptic compilers now
> that we are in these here end times, then we are finally free from
> ever having to rely on discerning the right ELF symbol from a name we
> surmised from DWARF (be it via DW_AT_MIPS_linkage_name or mangling).
> 

Why is there a need for a second, artificial DIE describing the
location? As I understand it, declarationhood is specified by the DIE's
nesting in the DIE hierarchy, not by its DW_AT_location. In other words,
what is missing in the current way gcc specifies locations for defining
declarations?

This summary does not include the part starting with "Before dynamic
linker startup" to the end of the email, mainly because I am assuming
that the main use case is after dynamic linker startup.

Follow-Ups:
- Re: Cross-CU C++ DIE references vs. mangling
  - From: Roland McGrath
