This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [patch 2/2] iFort compat.: case insensitive symbols (PR 11313)


On Mon, 22 Nov 2010 20:30:41 +0100, Joel Brobecker wrote:
> I was actually wondering about the change in the hash algorithm more
> than the cost of calling tolower.  For instance, "tmp" and "Tmp" would
> have had different hash values, but not anymore.  So, presumably, when
> you start looking up for "tmp", the associated hash bucket will also
> contain "Tmp" whereas it wouldn't before. I need to look at the actual
> hashing parameters to see if we can figure out whether this should have
> any real effect in practice...  If the number of elements in each bucket
> is reasonable, a few more iterations shouldn't be an issue.

This is a more general issue.

I think (though I did not measure it) that most symbols still differ after
tolower.  Pairs like tmp<->Tmp exist, but rarely.  I agree the hashing function
will get worse, but I did not measure it because I consider the change
negligible.
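
To make the bucket-sharing effect concrete, here is a minimal sketch in C.
It is not GDB's actual hash function: hash_exact, hash_folded, bucket_of and
the multiplier 67 are hypothetical, chosen only for illustration.  Only the
table size 2039 is taken from this thread.

```c
#include <ctype.h>

/* Same value as the MINIMAL_SYMBOL_HASH_SIZE quoted in this thread. */
#define HASH_SIZE 2039

/* Hypothetical case-sensitive string hash (illustration only). */
static unsigned int
hash_exact (const char *s)
{
  unsigned int h = 0;

  for (; *s != '\0'; s++)
    h = h * 67 + (unsigned char) *s;
  return h;
}

/* The same hash, but every character is folded with tolower first,
   so names differing only in case get the same hash value. */
static unsigned int
hash_folded (const char *s)
{
  unsigned int h = 0;

  for (; *s != '\0'; s++)
    h = h * 67 + (unsigned int) tolower ((unsigned char) *s);
  return h;
}

/* Map a hash value to a bucket index in a fixed-size table. */
static unsigned int
bucket_of (unsigned int hash)
{
  return hash % HASH_SIZE;
}
```

With this sketch, "tmp" and "Tmp" land in different buckets under hash_exact
but always share a bucket under hash_folded, so a lookup of "tmp" has to walk
past "Tmp" as well.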

A bigger issue is that MINIMAL_SYMBOL_HASH_SIZE is constant:
	#define MINIMAL_SYMBOL_HASH_SIZE 2039

Some objfiles have many symbols:
	libwebkit.so.debug: 54980 symbols
		/MINIMAL_SYMBOL_HASH_SIZE = 27
		log2(54980)=16
	gdb symtab: 36452 symbols
		/MINIMAL_SYMBOL_HASH_SIZE = 18
		log2(36452)=16
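
The arithmetic above can be reproduced with a small sketch.  The helper names
are hypothetical, and the per-bucket figure assumes the hash spreads symbols
evenly over the buckets, which a real hash will only approximate.

```c
/* Hypothetical helpers; the names are illustrative, not taken from GDB. */

/* Symbols per bucket, rounded to the nearest integer, assuming an
   even spread of symbols over the buckets. */
static unsigned int
symbols_per_bucket (unsigned int n_symbols, unsigned int n_buckets)
{
  return (n_symbols + n_buckets / 2) / n_buckets;
}

/* log2 of n_symbols rounded up, which approximates how many halving
   steps a binary search over n_symbols sorted entries needs. */
static unsigned int
binary_search_steps (unsigned int n_symbols)
{
  unsigned int steps = 0;
  unsigned long span = 1;

  while (span < n_symbols)
    {
      span *= 2;
      steps++;
    }
  return steps;
}
```

With 2039 buckets, symbols_per_bucket gives 27 for 54980 symbols and 18 for
36452 symbols, while binary_search_steps gives 16 for both counts.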

In such a case the hash table in fact makes no sense: walking a bucket costs
more comparisons than a binary search would.  It is even cheaper to just do
a binary search on objfile->msymbols, which is already qsort-ed, and be done
with it.
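
A minimal sketch of such a lookup, using the standard bsearch.  It assumes an
array that is sorted by name with strcmp ordering; whether objfile->msymbols
is ordered by that key is not stated here, so struct sym_entry, its fields
and find_symbol are hypothetical stand-ins rather than GDB's own types.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record; not GDB's minimal symbol structure. */
struct sym_entry
{
  const char *name;
  unsigned long address;
};

/* bsearch comparator: the key is a plain name, the element a record. */
static int
compare_name_to_entry (const void *key, const void *elem)
{
  const char *name = key;
  const struct sym_entry *entry = elem;

  return strcmp (name, entry->name);
}

/* Look up NAME in TABLE, which must hold COUNT entries sorted by name
   in strcmp order.  Returns the matching entry or NULL. */
static const struct sym_entry *
find_symbol (const struct sym_entry *table, size_t count, const char *name)
{
  return bsearch (name, table, count, sizeof table[0],
                  compare_name_to_entry);
}
```

A case-insensitive variant would need the array sorted with the same
case-folding comparison that the lookup uses.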

A hash table should still be faster than a binary search, but its size would
need to be adaptable.
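
One way an adaptable size could look, as a sketch only: pick the bucket count
from the number of symbols so that the average chain stays short.  The
function name, the target load of 4 symbols per bucket and the floor of 64
buckets are all assumptions made for this example, not values from GDB.

```c
#include <stddef.h>

/* Assumed tuning values, chosen only for illustration. */
#define TARGET_SYMBOLS_PER_BUCKET 4
#define MIN_BUCKETS 64

/* Return a bucket count (MIN_BUCKETS doubled as often as needed) such
   that n_symbols spread evenly would put at most
   TARGET_SYMBOLS_PER_BUCKET symbols in each bucket. */
static size_t
choose_bucket_count (size_t n_symbols)
{
  size_t buckets = MIN_BUCKETS;

  while (buckets * TARGET_SYMBOLS_PER_BUCKET < n_symbols)
    buckets *= 2;
  return buckets;
}
```

For the 54980 symbols mentioned above this yields 16384 buckets, i.e. fewer
than 4 symbols per bucket on average instead of 27.  Power-of-two sizes make
the result sensitive to the quality of the hash function, so a prime size
near the same value would be another option.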

But such optimizations reduce only the CPU load, which in my measurements was
2% during GDB startup (GDB is mostly waiting on disk).  We could instead, for
example, delay searching and loading the symbols of any objfiles we do not
need, which would give another major GDB startup time reduction like
.gdb_index did.  This is why I did not intend to spend any time on debatable
CPU optimizations; IMO they do not make sense given the current state of gdb
performance.


Thanks,
Jan

