This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

_bfd_link_section_stabs: hash value weakness


Forgive me if, as a newcomer, I heedlessly stumble over an over-
discussed issue.  I recently encountered a case (on Linux) in which
STABS data in several .o files for a certain header file were
improperly merged in the executable.  That is, the version of the
STABS data for one include file from one .o file was removed, and
replaced with an EXCL reference to the STABS data for the same include
file from another .o file, even though the two versions were not, in
fact identical (specifically, the type indices did not quite agree).

A little rooting around led quickly to _bfd_line_section_stabs, which uses
a simple hashing scheme to determine if two versions of data between
matching BINCL and EINCL entries are identical.  The problem is that the
hashing scheme is a little TOO simple: simply remove all file numbers
from the type references and then add all character values.  

Is mine really the ONLY instance of this sort of collision?  If so,
I'm a little surprised: that particular hashing algorithm is example
#1 of a bad hashing function that I use when teaching data structures
classes!  Admittedly, for this particular purpose, I'd expect it
usually to work, and it IS really simple.  On the other hand, when
there is a collision, the results are rather mysterious and annoying.
Has there been any discussion of using a slightly stronger hashing
function?  I don't think time is an issue; my workstation can do an
MD5 checksum of 18MBytes in 0.3 sec, and a MUCH simpler hash function
than that would do a fine job in this application.

Paul Hilfinger
Ada Core Technologies, Inc.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]