This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.

Re: C++ nested classes, namespaces, structs, and compound statements

From: Daniel Berlin <dan at dberlin dot org>
To: Jim Blandy <jimb at redhat dot com>
Cc: gdb at sources dot redhat dot com, Benjamin Kosnik <bkoz at redhat dot com>
Date: Sat, 6 Apr 2002 01:05:53 -0500 (EST)
Subject: Re: C++ nested classes, namespaces, structs, and compound statements

On Fri, 5 Apr 2002, Jim Blandy wrote:

> 
> At the moment, GDB doesn't handle C++ namespaces or nested classes
> very well.  I have a general idea of how we could address these
> limitations, which I'd like to put up for shredding M-DEL discussion.
> 
> Let me admit up front that I don't really know C++, so I may be saying
> stupid things.  Please set me straight if you notice something.
> 
> In C, structs are essentially lists of member names, types, and
> locations (offsets from the structure's base address):
> 
>   struct S { int x; char y; struct T t; };
> 
> (Unions are just the same, except that the offsets are all zero.  That
> relationship carries through the entire discussion here, so I'm not
> going to talk about unions any more.)
> 
> If you think about it just right (or just wrong), this is really very
> similar to the set of local variables associated with a compound
> statement:
> 
>   {
>     int x;
>     char y;
>     struct T t;
> 
>     ...
>   }
> 
> As far as scoping is concerned, this compound statement is also just a
> list of names, types, and locations.  The locations here are a bit
> less restricted: whereas a struct's members' locations are all offsets
> from the start of the struct, a compound statement's variables'
> locations can be registers, regions of the stack frame, fixed
> addresses (i.e., static variables), and so on.  But just as a struct
> type divides up a block of storage into individual members with types,
> a compound statement's local variables divide up a function
> invocation's stack frame and registers into individual variables with
> types.
> 
> The analogy isn't perfect, of course.  Structs don't enclose blocks of
> code.  And a compound statement is less restricted: it can also
> contain typedefs, definitions of struct and enum tags, and so on:
> 
>   {
>     int x;
>     char y;
>     struct T t;
>     struct L { int j, k; };
>     typedef struct L L_t;
> 
>     ...
>   }
> 
> Here the definitions of `struct L' and L_t are local to the compound
> statement.  In structs, however, things behave differently: struct
> tags defined within another struct have the same scope as the
> containing struct; and you can't put typedefs in a struct at all.  So
> structs are really very restricted with regards to what they can
> contain.
> 
> However, C++ loosens a lot of these restrictions, generalizing structs
> and classes until they really begin to look very much like compound
> statements.  (The only difference between structs and classes in C++
> is whether members are public by default.  So I'm not going to talk
> about classes any more.)
> 
> For example, in C++, you can declare typedefs inside structs:
> 
>   $ cat local-typedef.C
>   struct S
>   {
>     typedef int smootz;
> 
>     smootz a, b;
>   };
> 
>   smootz c;
>   $ $GccB/g++ -c local-typedef.C 
>   local-typedef.C:8: 'smootz' is used as a type, but is not defined as a type.
> 
> The compiler accepts the definition of the typedef `smootz' and its
> use within `struct S', but outside of S the typedef isn't visible.
> Struct tags behave similarly.
> 
> You can also declare "static" struct members --- you can access them
> with the `->' and `.' operators, just like ordinary members, but
> they're actually variables at fixed addresses in the .data segment ---
> much like a "static" variable in a C compound statement.  But this
> means that a simple offset from a base address is no longer sufficient
> to describe a struct's member's location --- you actually start
> needing something like GDB's enum address_class.  Multiple inheritance
> and virtual base classes introduce further complexity here.
> 
> Another difference between compound statements and structs also
> goes away.  In C, you can only reference a struct's members using the
> `.' and `->' operators, whereas you refer to a compound statement's
> variables by simply naming them.  But in C++, a struct's member
> functions can refer to the struct's members by simply naming them.
> The struct's bindings become another rib in the search path for
> identifier bindings.
> 
> In summary, the data structure GDB needs to represent C++ structs
> (classes, unions, whatever) has a lot of similarities to the structure
> GDB needs to represent the local variables of a compound statement.
> They both need to carry bindings for several namespaces (ordinary
> identifiers and structure tags).  The names can refer to any manner of
> things: variables, functions, namespaces, base classes, and so on.
> For variables, there are a variety of locations they might occupy.
> 
> 
> So I would like to introduce to GDB a new type, `struct environment'
> (or is `struct env' better?) which does about the same thing that the
> `nsyms' and `sym' members of `struct block', and the `nfields' and
> `fields' members of `struct type', do now: it's just a bunch of
> bindings for names.  We would use `struct environment':
> 
> - in `struct block', to represent the block's local variables, replacing
>   `nsyms' and `sym';
> - in `struct type', to represent a struct's members, instead of
>   `struct fields'; and
> - in our representation for C++ namespaces, which seem pretty much
>   like structs that can only contain static members and member
>   functions (i.e., you can't ever create an instance of one).


Except.

You can alias them:

#include <string>

namespace bob = std;

bob::string a;

would work.
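
To make the consequence for lookup concrete: an alias is only a second
name for an existing namespace, so whatever represents `bob` has to refer
to the same bindings as `std` rather than hold its own copy. A small
check of that, where `greet` and `demo` are made-up helpers for
illustration only:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A namespace alias: a second name for the existing namespace std.
namespace bob = std;

// bob::string is not a new type; it names the very same entity.
static_assert(std::is_same<bob::string, std::string>::value,
              "an alias names the same entities as the original");

// Hypothetical helper, only here to show the alias in use.
bob::string greet() { return bob::string("hello"); }

// Hypothetical usage: values flow freely between the two spellings.
void demo() {
    std::string s = greet();
    assert(s == "hello");
}
```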


You also have issues with anonymous namespaces, unions, and structs.

Yes, you can do

namespace {
int a;
}

It's not as easy to handle as you think: you can't just have a simple
pointer; you need lists, and you have to order them right.

It gets more fun:

namespace { int i; } // unique::i
void f() { i++; } //unique::i++
namespace A {
	namespace {
		int i; //A::unique::i
		int j; //A::unique::j
	}
	void g() { i++; } // A::unique::i++
}
using namespace A;
void h()
{
	i++; // ambiguity error (unique::i or A::unique::i)
	A::i++;  //A::unique::i++;
	j++;  //A::unique::j++;
}
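
The lookups in this example can be modelled roughly as follows. This is
only a sketch under stated assumptions, not GDB code: `Symbol`,
`Environment`, `collect`, `lookup`, and `Model` are all hypothetical
names, an anonymous namespace is treated as an environment its parent
implicitly "uses", and the walk outward through enclosing scopes is left
out. A name that resolves to more than one distinct symbol is ambiguous.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a debugger symbol.
struct Symbol { std::string qualified_name; };

// Hypothetical environment: its own bindings, plus an ordered list of
// environments made visible by (explicit or implicit) using-directives.
struct Environment {
    std::map<std::string, const Symbol*> bindings;
    std::vector<const Environment*> used;
};

// Gather every distinct symbol `name` could mean, visiting each
// environment at most once so cyclic using-directives terminate.
void collect(const Environment& env, const std::string& name,
             std::set<const Environment*>& seen,
             std::set<const Symbol*>& found) {
    if (!seen.insert(&env).second) return;  // already visited
    auto it = env.bindings.find(name);
    if (it != env.bindings.end()) found.insert(it->second);
    for (const Environment* u : env.used) collect(*u, name, seen, found);
}

// More than one result means the reference is ambiguous.
std::set<const Symbol*> lookup(const Environment& env, const std::string& name) {
    std::set<const Environment*> seen;
    std::set<const Symbol*> found;
    collect(env, name, seen, found);
    return found;
}

// The namespaces from the example above, wired up by hand.
struct Model {
    Symbol gi{"unique::i"}, ai{"A::unique::i"}, aj{"A::unique::j"};
    Environment global_anon, a_anon, a, global;
    Model() {
        global_anon.bindings["i"] = &gi;
        a_anon.bindings["i"] = &ai;
        a_anon.bindings["j"] = &aj;
        a.used.push_back(&a_anon);            // implicit using-directive
        global.used.push_back(&global_anon);  // implicit using-directive
        global.used.push_back(&a);            // using namespace A;
    }
};

void demo() {
    Model m;
    assert(lookup(m.global, "i").size() == 2);  // ambiguous in h()
    assert(lookup(m.global, "j").size() == 1);  // A::unique::j
    assert(lookup(m.a, "i").size() == 1);       // A::unique::i in g()
}
```

Note that the environments hold pointers to symbols rather than copies,
which is one way to keep the representation shared.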
You can get some real hairy stuff.
Think of the memory cost for representing this.
It needs to be as shared as possible, because I can do:

void f();
namespace A
{
	void g();
}
namespace X {
	using ::f;
	using A::g;
}
void h()
{
	X::f(); //calls ::f()
	X::g(); //calls A::g()
}	


Obviously, you don't want to do massive namespace injection to support 
this. Ideally, you want to have A::g's symbol (named "g") directly inside 
X.
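
A minimal sketch of that idea, assuming an environment is just a map
from names to non-owning symbol pointers (`Symbol`, `Environment`,
`find`, and `UsingDeclDemo` are hypothetical names, not existing GDB
types). A using-declaration then adds one binding that points at the
existing symbol, so nothing is copied:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a debugger symbol.
struct Symbol { std::string name; };

// Hypothetical environment: names bound to symbols it does not own.
struct Environment {
    std::map<std::string, const Symbol*> bindings;

    // Return the bound symbol, or nullptr if the name is not bound here.
    const Symbol* find(const std::string& n) const {
        auto it = bindings.find(n);
        return it == bindings.end() ? nullptr : it->second;
    }
};

// The three scopes from the example: ::, A, and X.
struct UsingDeclDemo {
    Symbol f{"f"}, g{"g"};
    Environment global, A, X;
    UsingDeclDemo() {
        global.bindings["f"] = &f;  // void f();
        A.bindings["g"] = &g;       // namespace A { void g(); }
        X.bindings["f"] = &f;       // using ::f;   (same symbol, shared)
        X.bindings["g"] = &g;       // using A::g;  (same symbol, shared)
    }
};

void demo() {
    UsingDeclDemo d;
    assert(d.X.find("g") == d.A.find("g"));       // X::g is A::g
    assert(d.X.find("f") == d.global.find("f"));  // X::f is ::f
}
```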


To support resolution of names properly, struct environments probably want 
a set of lookup function pointers.

That way, the lookup works properly regardless of what language the 
current frame is in and what language the symbol you are asking for is 
in.

The actual structure storing the symbols and whatnot inside the 
environment should be opaque, so that we don't have to do major work to 
replace the indexing structures used, etc.
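
One way this could look, written C-style since GDB is C. Everything
here is an assumption made for illustration: `symbol`, `env_ops`,
`environment`, `env_lookup`, `linear_store`, and `linear_lookup` are
invented names, and a real lookup would also need to know which kind of
name (ordinary identifier or tag) is being asked for.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a debugger symbol.
struct symbol { const char* name; };

struct environment;

// Hypothetical table of lookup operations; each language or storage
// scheme supplies its own.
struct env_ops {
    const symbol* (*lookup)(const environment* env, const char* name);
};

struct environment {
    const env_ops* ops;  // how to search this environment
    void* store;         // opaque: only the ops know its layout
};

// Generic entry point; callers never look inside `store`.
const symbol* env_lookup(const environment* env, const char* name) {
    return env->ops->lookup(env, name);
}

// One possible backing store: a flat array searched linearly.
struct linear_store { const symbol* syms; int count; };

const symbol* linear_lookup(const environment* env, const char* name) {
    const linear_store* s = static_cast<const linear_store*>(env->store);
    for (int i = 0; i < s->count; ++i)
        if (std::strcmp(s->syms[i].name, name) == 0) return &s->syms[i];
    return nullptr;
}

const env_ops linear_ops = { linear_lookup };

void demo() {
    static const symbol syms[] = { {"x"}, {"y"} };
    linear_store st = { syms, 2 };
    environment env = { &linear_ops, &st };
    assert(env_lookup(&env, "y") == &syms[1]);
    assert(env_lookup(&env, "z") == nullptr);
}
```

Swapping the linear array for a hash table would then mean providing a
different `env_ops` and store, with no change to callers of
`env_lookup`.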

--Dan

References:
- C++ nested classes, namespaces, structs, and compound statements
  - From: Jim Blandy
