This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.

Re: suggestion for dictionary representation

From: David Carlton <carlton at math dot stanford dot edu>
To: Daniel Berlin <dberlin at dberlin dot org>
Cc: Daniel Jacobowitz <drow at mvista dot com>, Jim Blandy <jimb at redhat dot com>, gdb at sources dot redhat dot com
Date: 24 Sep 2002 09:33:26 -0700
Subject: Re: suggestion for dictionary representation
References: <6FDBFE18-CF55-11D6-BA45-000393575BCC@dberlin.org>

On Mon, 23 Sep 2002 20:34:50 -0400, Daniel Berlin <dberlin@dberlin.org> said:

>> I'm also curious about how it would affect the speed of reading in
>> symbols.  Right now, that should be O(n), where n is the number of
>> global symbols, right?

>> If we used expandable hash tables, then I think it would be
>> amortized O(n) and with the constant factor larger.

> Our string hash function is O(N) right now (as are most). Hash
> tables are only O(N) when the hash function is O(1).

[ Here, of course, my 'n' is the number of global symbols, and
Daniel's 'N' is the maximum symbol length. ]

This is true, but I'm not sure that it's relevant to this sort of
theoretical analysis.  After all, skip lists depend on N, as well:
they put symbols in order, and the amount of time to do that depends
on the length of the symbols.
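
As a rough sketch of that point (the helper names below are made up for
illustration and are not GDB functions), both hashing a name and
comparing two names for ordering walk the characters, so each costs
time proportional to the name length N:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GDB code: a simple multiplicative string
   hash.  It visits every character once, so its cost is O(N) in the
   length of the name.  *CHARS_VISITED is incremented per character.  */
unsigned long
sketch_hash (const char *name, size_t *chars_visited)
{
  unsigned long h = 0;

  assert (name != NULL && chars_visited != NULL);
  for (; *name != '\0'; name++)
    {
      h = h * 31 + (unsigned char) *name;
      (*chars_visited)++;
    }
  return h;
}

/* Hypothetical helper, not GDB code: a strcmp-style comparison of the
   kind an ordered structure (such as a skip list) needs at each step.
   It walks the common prefix, so in the worst case it is also O(N).  */
int
sketch_compare (const char *a, const char *b, size_t *chars_visited)
{
  assert (a != NULL && b != NULL && chars_visited != NULL);
  while (*a != '\0' && *a == *b)
    {
      a++;
      b++;
      (*chars_visited)++;
    }
  /* Count the position where the names differ or both end.  */
  (*chars_visited)++;
  return (unsigned char) *a - (unsigned char) *b;
}
```

A hash lookup pays the O(N) hash once, plus a comparison per candidate
in the bucket; an ordered structure pays a comparison at each of its
roughly log n steps, and names sharing a long prefix make each of those
comparisons more expensive.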

And it's entirely reasonable to think of 'N' as a constant.  Or
perhaps two constants: one for C programs with short names, one for
C++ programs with long names.  (And I'm not really sure that the C++
names will ultimately turn out to be that much longer: once the proper
namespace support has been added, then looking up a C++ name will
probably be a multistep process (looking up pieces of the demangled
name in turn), and for each of those steps, we'll be looking at a name
that will be of the appropriate length for a C program.)
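
To illustrate the idea only: the function below is a hypothetical
helper, not part of GDB, and it naively splits on "::", which would
mishandle template arguments and operator names. It measures how long
the individual pieces of a qualified name are compared with the whole
name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GDB code: split QUALIFIED_NAME on "::" and
   return the length of its longest component.  The number of
   components is stored in *N_COMPONENTS.  This is a naive split that
   ignores templates and operators; it only illustrates the sizes
   involved in a component-by-component lookup.  */
size_t
longest_component (const char *qualified_name, size_t *n_components)
{
  size_t longest = 0;
  const char *start = qualified_name;

  assert (qualified_name != NULL && n_components != NULL);
  *n_components = 0;
  for (;;)
    {
      const char *sep = strstr (start, "::");
      size_t len = sep != NULL ? (size_t) (sep - start) : strlen (start);

      if (len > longest)
        longest = len;
      (*n_components)++;
      if (sep == NULL)
        break;
      start = sep + 2;   /* skip past the "::" separator */
    }
  return longest;
}
```

For a name like "std::basic_string::append", each lookup step under
this scheme would deal with at most 12 characters rather than all 25.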

But even if we consider N to be a constant, your broader point stands:
the constant factors that different algorithms differ by are
important, and in practice large constants can have more of an effect
than logarithmic terms.  Fortunately, one of the advantages of the
refactoring that I'm doing right now is that it'll be easy to drop in
different dictionary implementations for testing purposes: it should
boil down to writing the code that we'd have to write anyway to get
skip list support, changing one or two function calls, and recompiling.
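
A minimal sketch of what such a drop-in arrangement could look like
follows. Every name in it (struct dict, dict_ops, linear_ops,
sorted_ops and so on) is an assumption made for illustration and is not
the actual interface of the refactoring; a sorted array stands in for a
skip list to keep the example short:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CAPACITY 64   /* fixed size keeps the sketch short */

struct dict;

/* Hypothetical operations table: one per dictionary implementation.  */
struct dict_ops
{
  void (*add) (struct dict *d, const char *name);
  int (*contains) (const struct dict *d, const char *name);
};

struct dict
{
  const struct dict_ops *ops;
  const char *names[SKETCH_CAPACITY];
  size_t count;
};

/* Implementation 1: unsorted array, linear search.  */
static void
linear_add (struct dict *d, const char *name)
{
  assert (d->count < SKETCH_CAPACITY);
  d->names[d->count++] = name;
}

static int
linear_contains (const struct dict *d, const char *name)
{
  size_t i;

  for (i = 0; i < d->count; i++)
    if (strcmp (d->names[i], name) == 0)
      return 1;
  return 0;
}

/* Implementation 2: array kept sorted on insert, binary search.  */
static void
sorted_add (struct dict *d, const char *name)
{
  size_t i = d->count;

  assert (d->count < SKETCH_CAPACITY);
  while (i > 0 && strcmp (d->names[i - 1], name) > 0)
    {
      d->names[i] = d->names[i - 1];   /* shift larger names right */
      i--;
    }
  d->names[i] = name;
  d->count++;
}

static int
sorted_contains (const struct dict *d, const char *name)
{
  size_t lo = 0, hi = d->count;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp (d->names[mid], name);

      if (cmp == 0)
        return 1;
      if (cmp < 0)
        lo = mid + 1;
      else
        hi = mid;
    }
  return 0;
}

const struct dict_ops linear_ops = { linear_add, linear_contains };
const struct dict_ops sorted_ops = { sorted_add, sorted_contains };

/* Callers only touch these three functions; choosing a different
   implementation means passing a different ops table to dict_init.  */
void
dict_init (struct dict *d, const struct dict_ops *ops)
{
  d->ops = ops;
  d->count = 0;
}

void
dict_add (struct dict *d, const char *name)
{
  d->ops->add (d, name);
}

int
dict_contains (const struct dict *d, const char *name)
{
  return d->ops->contains (d, name);
}
```

With this shape, timing one implementation against another only
requires changing the argument to dict_init and recompiling; the
calling code stays the same.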

David Carlton
carlton@math.stanford.edu

Follow-Ups:
- Re: suggestion for dictionary representation
  - From: Daniel Berlin
- Re: suggestion for dictionary representation
  - From: Jim Blandy

References:
- Re: suggestion for dictionary representation
  - From: Daniel Berlin