This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: [WIP] New dwarf2 reader - updated 07-02-2001
- To: Jim Blandy <jimb at zwingli dot cygnus dot com>
- Subject: Re: [WIP] New dwarf2 reader - updated 07-02-2001
- From: Daniel Berlin <dan at cgsoftware dot com>
- Date: Tue, 31 Jul 2001 19:30:51 -0400
- cc: gdb-patches at sources dot redhat dot com
- References: <npae1k8zgu.fsf@zwingli.cygnus.com>
--On Tuesday, July 31, 2001 6:12 PM -0500 Jim Blandy
<jimb@zwingli.cygnus.com> wrote:
>
> Daniel Berlin <dan@cgsoftware.com> writes:
>> > Using 64 bits, as you intended, is much better, but why not use the
>> > full 128? Honestly, why play this kind of game at all?
>> Because it's faster than trying to keep all the die data.
>> > It's not that
>> > hard to do the job perfectly --- just use the significant die contents
>> > themselves as the tree key. No collisions, no hash function, simpler.
>> >
>>
>> Um, think about how much memory that would use.
>> We'd have to keep copies of die contents around.
>
> Use the struct die_info objects themselves as the keys, and the
> values. Just define an arbitrary comparison function with the right
> semantics. We need the struct die_info objects around anyway.
Not past the life of a compilation unit anymore.
We just reread the compilation unit if we've thrown it out already.
And since we need to be able to eliminate over multiple compilation units,
we can't rely on die_info pointers staying live.
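
To illustrate the point about key lifetime, here is a minimal sketch of a "seen" set whose keys are fixed-size digests copied by value, so nothing in it points back into a die_info that may already have been freed. The names (die_key, seen_set, seen_set_add) and the sorted-array layout are assumptions made for this example, not code from the patch, and computing the digest itself is left out.

```c
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 16   /* size of an md5 digest in bytes */

/* Hypothetical key type: a digest stored by value, with no pointer
   into any die_info, so it outlives the compilation unit.  */
struct die_key {
    unsigned char digest[DIGEST_LEN];
};

/* Hypothetical set of digests seen so far, kept as a sorted array.  */
struct seen_set {
    struct die_key *keys;
    size_t count;
    size_t cap;
};

static int die_key_cmp(const struct die_key *a, const struct die_key *b)
{
    return memcmp(a->digest, b->digest, DIGEST_LEN);
}

/* Add K to S.  Returns 1 if K was new, 0 if an equal key was already
   present (i.e. the die is a duplicate), -1 if allocation failed.  */
int seen_set_add(struct seen_set *s, const struct die_key *k)
{
    size_t lo = 0, hi = s->count;

    /* Binary search for K or for its insertion point.  */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = die_key_cmp(k, &s->keys[mid]);
        if (c == 0)
            return 0;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }

    /* Grow the array if it is full.  */
    if (s->count == s->cap) {
        size_t ncap = s->cap ? s->cap * 2 : 8;
        struct die_key *n = realloc(s->keys, ncap * sizeof *n);
        if (n == NULL)
            return -1;
        s->keys = n;
        s->cap = ncap;
    }

    /* Shift the tail up by one slot and store a copy of K.  */
    memmove(&s->keys[lo + 1], &s->keys[lo],
            (s->count - lo) * sizeof *s->keys);
    s->keys[lo] = *k;
    s->count++;
    return 1;
}
```

A reader would call seen_set_add once per top-level die and skip the die when it returns 0. The cost per die is DIGEST_LEN bytes regardless of how large the die is.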
>
>> >> > - In read_comp_unit_dies, when you find a duplicate die, you skip to
>> >> > its sibling. What if the parent die is identical, but the
>> >> > children dies differ?
>> >>
>> >> I don't believe this is possible in any language.
>> >
>> > How about this:
>> >
>> > namespace X {
>> > namespace A {
>> > int m, n;
>> > }
>> > }
>> >
>> > namespace Y {
>> > namespace A {
>> > int o, p;
>> > }
>> > }
>> >
>> > The dies for the two 'A' namespaces have the same DW_AT_name
>> > attribute, but they're clearly different.
>> >
>> This won't do it, they'll have different decl line attributes.
>> And we only eliminate at the top level anyway.
>
> Okay, I hadn't noticed the `nesting_level == 1' test. This still
> isn't okay, though. Consider the following C program:
>
> struct {
> int a, b;
> } x;
>
> struct {
> int c, d;
> } y;
>
> We get the following dies for this, at top level:
>
> ...
>
> .byte 0x2 # uleb128 0x2; (DIE (0x5a) DW_TAG_structure_type)
> .long 0x7b # DW_AT_sibling
> .byte 0x8 # DW_AT_byte_size
> .byte 0x1 # DW_AT_decl_file
> .byte 0x3 # DW_AT_decl_line
> .byte 0x3 # uleb128 0x3; (DIE (0x62) DW_TAG_member)
> .ascii "a\0" # DW_AT_name
> .byte 0x1 # DW_AT_decl_file
> .byte 0x2 # DW_AT_decl_line
> .long 0x7b # DW_AT_type
> .byte 0x2 # DW_AT_data_member_location
> .byte 0x23 # DW_OP_plus_uconst
> .byte 0x0 # uleb128 0x0
> .byte 0x3 # uleb128 0x3; (DIE (0x6e) DW_TAG_member)
> .ascii "b\0" # DW_AT_name
> .byte 0x1 # DW_AT_decl_file
> .byte 0x2 # DW_AT_decl_line
> .long 0x7b # DW_AT_type
> .byte 0x2 # DW_AT_data_member_location
> .byte 0x23 # DW_OP_plus_uconst
> .byte 0x4 # uleb128 0x4
> .byte 0x0 # end of children of DIE 0x5a
> .byte 0x4 # uleb128 0x4; (DIE (0x7b) DW_TAG_base_type)
> .ascii "int\0" # DW_AT_name
> .byte 0x4 # DW_AT_byte_size
> .byte 0x5 # DW_AT_encoding
> .byte 0x2 # uleb128 0x2; (DIE (0x82) DW_TAG_structure_type)
> .long 0xa3 # DW_AT_sibling
> .byte 0x8 # DW_AT_byte_size
> .byte 0x1 # DW_AT_decl_file
> .byte 0x7 # DW_AT_decl_line
> .byte 0x3 # uleb128 0x3; (DIE (0x8a) DW_TAG_member)
> .ascii "c\0" # DW_AT_name
> .byte 0x1 # DW_AT_decl_file
> .byte 0x6 # DW_AT_decl_line
> .long 0x7b # DW_AT_type
> .byte 0x2 # DW_AT_data_member_location
> .byte 0x23 # DW_OP_plus_uconst
> .byte 0x0 # uleb128 0x0
> .byte 0x3 # uleb128 0x3; (DIE (0x96) DW_TAG_member)
> .ascii "d\0" # DW_AT_name
> .byte 0x1 # DW_AT_decl_file
> .byte 0x6 # DW_AT_decl_line
> .long 0x7b # DW_AT_type
> .byte 0x2 # DW_AT_data_member_location
> .byte 0x23 # DW_OP_plus_uconst
> .byte 0x4 # uleb128 0x4
> .byte 0x0 # end of children of DIE 0x82
>
> ...
>
> The two DW_TAG_structure_type members are distinguished only by their
> DW_AT_decl_line attributes. But those are optional --- a compiler can
> omit them entirely.
But none do, in practice, since debuggers want to be able to associate the
two.
The solution here, by the way, is something I had written down but not
implemented, for some reason. When dealing with a die with no name,
recursively checksum its children.
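
A rough sketch of that idea, under stated assumptions: toy_die is a simplified stand-in rather than GDB's struct die_info, and 64-bit FNV-1a stands in for md5 only to keep the example self-contained. All names here (toy_die, fnv1a, die_checksum) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified die; not GDB's struct die_info.  */
struct toy_die {
    unsigned tag;                /* DW_TAG_* value */
    const char *name;            /* DW_AT_name, or NULL if absent */
    const unsigned char *attrs;  /* other significant attribute bytes */
    size_t attrs_len;
    struct toy_die *child;       /* first child, or NULL */
    struct toy_die *sibling;     /* next sibling, or NULL */
};

/* 64-bit FNV-1a constants; a stand-in for md5 in this sketch.  */
#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

/* Fold LEN bytes at DATA into the running hash H.  */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Fold DIE into H.  A named die contributes only its own tag, name
   and attributes; an unnamed die also contributes its children,
   recursively, followed by an end-of-children marker.  */
static uint64_t die_checksum_from(uint64_t h, const struct toy_die *die)
{
    assert(die != NULL);

    h = fnv1a(h, &die->tag, sizeof die->tag);
    if (die->name != NULL)
        h = fnv1a(h, die->name, strlen(die->name) + 1);
    h = fnv1a(h, die->attrs, die->attrs_len);

    if (die->name == NULL) {
        const struct toy_die *c;
        unsigned char end = 0;

        for (c = die->child; c != NULL; c = c->sibling)
            h = die_checksum_from(h, c);
        h = fnv1a(h, &end, 1);
    }
    return h;
}

/* Checksum of a top-level die.  */
uint64_t die_checksum(const struct toy_die *die)
{
    return die_checksum_from(FNV_OFFSET, die);
}
```

With this, two anonymous structs like the ones quoted above get different checksums even when DW_AT_decl_line is absent, because the member names "a", "b" and "c", "d" feed into the parent's checksum.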
>
> So this patch has the correctness of the Dwarf 2 reader depending on
> the presence of optional source location information for declarations.
>
Just like the previous dwarf2 reader depends on the fact that gcc doesn't
generate absolute die references, etc.
I haven't removed our dependence on gcc: with compilers that generate
*less* info than gcc, we won't always come up with correct results.
>
>> > Just in principle, this seems sloppy. Dies are just arbitrary tree
>> > nodes. There's no reason to assume that if two parent nodes are
>> > identical, we can skip the second one and all its children.
>> It's not sloppy.
>> Elimination at the top level is normal in other debuggers I know of that
>> do elimination.
>
> I've no objection to the idea, just the implementation.
And I have no problem trying to address your criticisms of the code;
however, the solution to the first problem you've given me thus far won't
work. Which is why I started using md5 in the first place.