This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Fear and loathing in lookup_block_symbol


On Thu, Sep 06, 2001 at 11:41:01AM -0700, Jason Molenda wrote:
> This is a little long and involved, but there is a big problem in one of 
> the lowest level symbol lookup functions, so (someone) please bear with 
> it all.


I'll offer a little incentive by explaining how much of a performance
loss this is.  As I mentioned in the note, gdb usually does its
per-block symbol lookup with a binary search.  This bug degrades
that log-time algorithm so that it clocks in around O(1/2 * nsyms)
time.

This adds up quick.  I did some measurements of the impact it has
if you're debugging gdb itself.  If the binary search algorithm is
working correctly, doing "print kjdkjdkjd" in (top-gdb) will
ultimately do 865 strcmp()'s.  With the bug, doing "print kjdkjd"
will do 33,000 string comparisons.  And it's linear, so if you
choose an alphabetically lower symbol, say "aaaa", the number of
string comparisons goes up - to around 82,000.

As an aside, in Daniel's August 2000 mondo-patch, one of the things
he addressed is code like this:

  ALL_SYMTABS (objfile, s)
  {
    bv = BLOCKVECTOR (s);
    block = BLOCKVECTOR_BLOCK (bv, GLOBAL_BLOCK);
    sym = lookup_block_symbol (block, name, namespace);
[...]

The problem here is that there are many symtabs in a object file,
but few global blocks.  In gdb's case, there are around 330 symtabs,
and only 23 global blocks.  The result?  We call lookup_block_symbol
many, many times more than necessary, and often (307 times in this
case) it's just re-searching the same old blocks.  

Daniel's approach was to add a sixteen entry hash table in
lookup_symbol which would try to catch re-searching of a block.
I only looked it over briefly, but I don't think the patch is
suitable as-is.  However, he is trying to address an important
problem in gdb and it behooves us to look at this and clean up his
approach, or come up with another way of addressing the problem.

I don't know much about symtabs.  If a symtab (e.g. "foobaz.h")
has no global symbols, wouldn't it make sense for it to have no
global block associated with it?  Or a zero-entry block?  I don't
really know here, I'm just trying to think of another approach to
the problem - if we could eliminate the many-symtabs-to-few-global-blocks,
that would be another route.


This binary search breakage is the important one to address.  As
bad as it was when debugging gdb, I'd hate to see what happens if
you were debugging a C++ program with all those header files those
folks love to use.  Link in a little KDE or Mozilla code and you
might as well get coffee while your debugger is searching for a
nonexistant symbol. :-)  This _does_ represent a regression over
gdb 5.0, so IMHO it'd be worth addressing in 5.1 (it looks lke
y'all have branched for that release already).


The symtab-vs-global-block thing looks like an area ripe for some
easy speedup, but when you've got the binary search algorithm
working right it removes a lot of the incentive to bother with this.
C++ programmers may have a different opinion on this, though. :-)


Jason


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]