This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 26/40] Optimize .gdb_index symbol name searching


On 06/02/2017 05:22 AM, Pedro Alves wrote:

> I got, before the previous patch (-O2, x86-64):
> 
>  real    0m1.773s
>  user    0m1.737s
>  sys     0m0.040s
> 
> and after this patch:
> 
>  real    0m1.361s
>  user    0m1.315s
>  sys     0m0.040s

The results on my computer are slightly more dramatic, running with no
optimization, your test case (using Fedora 21 system gdb w/index debuginfo)
goes from about 15 seconds down to about 2.5 seconds. Very nice!

> That resulted in 1351355 name_components.  Each entry takes 8 bytes,
> so that's 10810840 bytes (ignoring std::vector overhead), or ~10.3 MB.
> That's IMO too small to worry about, given GDB was using over 7400MB
> total at that point.  I.e., we're talking about 0.1% increase.

Indeed. I'd sacrifice that kind of memory for the kind of speed increase
you've achieved -- in a heartbeat!

> with only 8-bit and 32-bit tables, that'd be:
> 
>  1349057 * 1 + 2298 * 4 + 4 * 1351355 = 6763669 bytes, or ~6.5MB.
> 
> I don't think we need to bother though.

I'm all for memory usage optimization and whatnot, but since the benefit is
so small (55% of these new tables saved but only 0.06% of total memory),
I prefer simplicity. So you won't get anything but agreement from me on this.

> I also timed:
> 
>  $ time gdb --batch -q -p `pidof firefox`
>  $ time gdb --batch -q -p `pidof firefox` -ex "b main"
>  $ time gdb --batch -q -p `pidof firefox` -ex "set max-completion unlimited" -ex "complete b "

I'd like to reproduce this, but my computer is incapable of running this test.
I'll take your word for it. ;-)

> gdb/ChangeLog:
> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>
> 
> 	* dwarf2read.c 
> 	(mapped_index::name_components): New field.
> 	(mapped_index::symbol_name_at): New method.

Silly nit: Isn't the form most are using "(tag name) <field>: New field."?
I know I've relied on this several times to find changes in the ChangeLog.

> 	(create_addrmap_from_index): Call mapped_index ctor.

I don't see any changes to this function in the patch -- attributed to wrong
function?

> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index f523326..e955131 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -178,6 +178,51 @@ DEF_VEC_I (offset_type);
[snip]
> +
> +/* An index into a (C++) symbol name component in a symbol name as
> +   recorded in the mapped_index's symbol table.  For each C++ symbol
> +   in the symbol table, we record one entry for the start of each
> +   component in the symbol in a table of name components, and then
> +   sort the table, in order to be able to binary search symbol names,
> +   ignoring leading namespaces, both completion and regular look up.
> +   For example, for symbol "A::B::C", we'll have an entry that points
> +   to "A::B::C", another that points to "B::C", and another for "C".
> +   Note that function symbols in GDB index have no parameter
> +   information, just the function/method names.  You can convert a
> +   name_component to a "const char *" using the
> +   'mapped_index::symbol_name_at(offset_type)' method.  */

missing nl?

> +struct name_component
> +{
> +  /* Offset in the symbol name where the component starts.  Stored as
> +     a (32-bit) offset instead of a pointer to save memory and improve
> +     locality on 64-bit architectures.  */
> +  offset_type name_offset;
> +
> +  /* The symbol's index in the symbol and constant pool tables of a
> +     mapped_index.  */
> +  offset_type idx;
> +};
> +
>  /* A description of the mapped index.  The file format is described in
>     a comment by the code that writes the index.  */
>  struct mapped_index
> @@ -3390,6 +3424,7 @@ dwarf2_read_index (struct objfile *objfile)
>    create_addrmap_from_index (objfile, &local_map);
>  
>    map = XOBNEW (&objfile->objfile_obstack, struct mapped_index);
> +  map = new (map) mapped_index ();
>    *map = local_map;
>  
>    dwarf2_per_objfile->index_table = map;

This function (dwarf2_read_index) is not mentioned in the ChangeLog. Could
this actually be incorrectly listed in the ChangeLog under
create_addrmap_from_index?

> @@ -4095,6 +4134,22 @@ gdb_index_symbol_name_matcher::matches (const char *symbol_name)
>  }
>  
>  static void
> +dw2_expand_marked_cus
> +  (mapped_index &index, offset_type idx,
> +   struct objfile *objfile,
> +   gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> +   gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
> +   search_domain kind);
> +
> +static void
> +dw2_expand_symtabs_matching_symbol
> +  (mapped_index &index,
> +   const lookup_name_info &lookup_name_in,
> +   gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +   enum search_domain kind,
> +   gdb::function_view<void (offset_type)> on_match);

Isn't it rather unusual for us to have forward decls in the middle of a file?

> +
> +static void
>  dw2_expand_symtabs_matching
>    (struct objfile *objfile,
>     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> @@ -4186,30 +4239,214 @@ dw2_expand_symtabs_matching
[snip]
> +static void
> +dw2_expand_symtabs_matching_symbol
> +  (mapped_index &index,
> +   const lookup_name_info &lookup_name,
> +   gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +   enum search_domain kind,
> +   gdb::function_view<void (offset_type)> match_callback)
> +{
[snip]
> +
> +      /* Sort name_comp elements by name.   */

I presume that "name_comp" is really "name_components"?

[snip]

> +  std::vector<offset_type> matches;
> +  matches.reserve (std::distance (lower, upper));
> +
> +  for (;lower != upper; ++lower)
> +    {
> +      const char *qualified = index.symbol_name_at (lower->idx);
> +
> +      if (!lookup_name_matcher.matches (qualified)
> +	  || (symbol_matcher != NULL && !symbol_matcher (qualified)))
>  	continue;
>  
> -      /* The name was matched, now expand corresponding CUs that were
> -	 marked.  */
> -      vec = (offset_type *) (index->constant_pool
> -			     + MAYBE_SWAP (index->symbol_table[idx + 1]));
> +      matches.push_back (lower->idx);
> +    }
> +
> +  std::sort (matches.begin (), matches.end ());
> +
> +  /* Finally call the callback, once per match.  */
> +  ULONGEST prev = -1;
> +  for (offset_type idx : matches)
> +    {
> +      if (prev != idx)
> +	{
> +	  match_callback (idx);
> +	  prev = idx;
> +	}
> +    }

I admit, I'm a little surprised by the number of steps involved here: push
every element in the range into a vector, sort, then de-dup & perform callback.
Had I implemented this, my initial attempt would have been to use a htab_up
and take the sorting time on insertion.

I can imagine that for very large ranges, your approach could outperform
an htab; do you expect these ranges to be that large? Have I overlooked
something? [I'm just curious. Not suggesting any changes need to be made.]

> +
> +  /* Above we use a type wider than idx's for 'prev', since 0 and
> +     (offset_type)-1 are both possible values.  */
> +  static_assert (sizeof (prev) > sizeof (offset_type), "");
> +}
> +
[snip]

Keith


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]