This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: [rfa] symbol hashing, part 1/n - updates to hash functions



On Thursday, October 11, 2001, at 07:58  PM, Elena Zannoni wrote:

> Daniel Jacobowitz writes:
>> This patch still has two logical parts; if you strongly prefer I can 
>> break
>> it up further, but they are somewhat intertwined and I think neither 
>> should
>> be objectionable.  They are:
>>   - Fix a looping bug in msymbol_hash_iw.  It would not stop on '(' if 
>> there
>> was whitespace before it.
>>   - Update to use the identifier hash function that libiberty uses, and
>> more buckets.
>>
>> Is this OK?
>
> Looks ok to me in theory. Except that, why was the
>
>  '% MINIMAL_SYMBOL_HASH_SIZE;'
>
> bit moved outside of the msymbol_hash and msymbol_hash_iw functions?
So msymbol_hash and hash_iw could be used elsewhere.

> You still do the same operation with the results returned by the two
> functions anyway.
>
Except, now they are just hash functions, not hash functions that only 
work for the minsym hash tables.

> Also, where are these 2 functions used besides mynsyms.c?
In a further symbol hashing patch, unless he changed it.

> I think we
> should make them static and remove the extern from symtab.h.
>

> Can you give me an example where the '(' error comes up? (Just so I
> understand it better).  How did you come up with the number of
> buckets?

Averaging based on a large number of gnome, kde, and other real 
applications (ie emacs), compiled with debug info.
349 is way too small, we ended up with chains > length 100, all the time.

>  Is this also used in libiberty?

Which, the hash function?
> Can you fix it and resubmit?
>
> Thanks
> Elena
>
>>
>> --
>> Daniel Jacobowitz                           Carnegie Mellon University
>> MontaVista Software                         Debian GNU/Linux Developer
>>
>> 2001-10-01  Daniel Jacobowitz  <drow@mvista.com>
>>
>> 	* minsyms.c (msymbol_hash): Use better hash function.
>> 	Return hash value without taking modulus.
>> 	(msymbol_hash_iw): Likewise.  Terminate loop at '(' properly.
>> 	(add_minsym_to_hash_table): Take modulus of msymbol_hash's return
>> 	value.
>> 	(add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw.
>> 	(lookup_minimal_symbol): Likewise for both.
>>
>> 	* objfiles.h: Increase MINIMAL_SYMBOL_HASH_SIZE to match modern
>> 	binaries.
>>
>> Index: gdb/minsyms.c
>> ===================================================================
>> RCS file: /cvs/src/src/gdb/minsyms.c,v
>> retrieving revision 1.17
>> diff -u -p -r1.17 minsyms.c
>> --- minsyms.c	2001/05/29 10:45:10	1.17
>> +++ minsyms.c	2001/10/01 22:20:47
>> @@ -96,10 +96,12 @@ msymbol_hash_iw (const char *string)
>>        while (isspace (*string))
>>  	++string;
>>        if (*string && *string != '(')
>> -	hash = (31 * hash) + *string;
>> -      ++string;
>> +        {
>> +	  hash = hash * 67 + *string - 113;
>> +	  ++string;
>> +	}
>>      }
>> -  return hash % MINIMAL_SYMBOL_HASH_SIZE;
>> +  return hash;
>>  }
>>
>>  /* Compute a hash code for a string.  */
>> @@ -109,8 +111,8 @@ msymbol_hash (const char *string)
>>  {
>>    unsigned int hash = 0;
>>    for (; *string; ++string)
>> -    hash = (31 * hash) + *string;
>> -  return hash % MINIMAL_SYMBOL_HASH_SIZE;
>> +    hash = hash * 67 + *string - 113;
>> +  return hash;
>>  }
>>
>>  /* Add the minimal symbol SYM to an objfile's minsym hash table, 
>> TABLE.  */
>> @@ -120,7 +122,7 @@ add_minsym_to_hash_table (struct minimal
>>  {
>>    if (sym->hash_next == NULL)
>>      {
>> -      unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
>> +      unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % 
>> MINIMAL_SYMBOL_HASH_SIZE;
>>        sym->hash_next = table[hash];
>>        table[hash] = sym;
>>      }
>> @@ -134,7 +136,7 @@ add_minsym_to_demangled_hash_table (stru
>>  {
>>    if (sym->demangled_hash_next == NULL)
>>      {
>> -      unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME 
>> (sym));
>> +      unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME 
>> (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
>>        sym->demangled_hash_next = table[hash];
>>        table[hash] = sym;
>>      }
>> @@ -162,8 +164,8 @@ lookup_minimal_symbol (register const ch
>>    struct minimal_symbol *found_file_symbol = NULL;
>>    struct minimal_symbol *trampoline_symbol = NULL;
>>
>> -  unsigned int hash = msymbol_hash (name);
>> -  unsigned int dem_hash = msymbol_hash_iw (name);
>> +  unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
>> +  unsigned int dem_hash = msymbol_hash_iw (name) % 
>> MINIMAL_SYMBOL_HASH_SIZE;
>>
>>  #ifdef SOFUN_ADDRESS_MAYBE_MISSING
>>    if (sfile != NULL)
>>
>> Index: gdb/objfiles.h
>> ===================================================================
>> RCS file: /cvs/src/src/gdb/objfiles.h,v
>> retrieving revision 1.8
>> diff -u -p -r1.8 objfiles.h
>> --- objfiles.h	2001/03/06 08:21:11	1.8
>> +++ objfiles.h	2001/10/01 22:20:53
>> @@ -202,7 +202,7 @@ extern void print_objfile_statistics (vo
>>  extern void print_symbol_bcache_statistics (void);
>>
>>  /* Number of entries in the minimal symbol hash table.  */
>> -#define MINIMAL_SYMBOL_HASH_SIZE 349
>> +#define MINIMAL_SYMBOL_HASH_SIZE 2039
>>
>>  /* Master structure for keeping track of each file from which
>>     gdb reads symbols.  There are several ways these get allocated: 1.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]