This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: Strange ld.gold segmentation error issues.


> Oh, I am passing the -gsplit-dwarf switch to gcc and g++, and
> passing -gdb-index to gold. Does it matter? (With the older version of
> GNU gold, it did not cause this segmentation fault.)

The crash is in the code that creates the .gdb_index section, so that
option is definitely significant.

> Initially, I suspected that it could be an OOM issue since there are
> many processes running under "make -j4 ...", but then I found that more
> than 2.5 GB of main memory was free at the time the GNU gold binary was
> invoked to produce a .so library, just before the segfault occurred. The
> library is NOT THAT big.
>
> Also, to my surprise, changing the "-j4" make switch to "-j3" did not
> change anything. Even with fewer processes invoked by make,
> the segfault still occurred. So OOM is unlikely, and come to think of
> it, if OOM had happened, the kernel should have recorded it, but I did
> not see such messages in the kernel logs.
>
> Anyway, I modified my local version of "ld" to invoke a shell
> script, which
> checks if the particular library (here libnspr4.so) is going to be
> created, and if so, invokes ld.new (the gold binary) under gdb to see
> what happens. Otherwise it simply invokes the gold binary with the passed
> arguments.
> (In the previous posting, I thought it was libmozalloc.so that caused
> the blowup, but as it turned out it is the next target, libnspr4.so)
>
>
> Funny, the first few times, it did not trip ?!??
> Maybe I was doing something wrong.
> On the third try, I could capture the stacktrace.

This is puzzling. I'd think a problem like this would be reproducible,
unless there's some sort of race going on with the .o files. Does the
problem go away if you change make to use "-j1"?

> [I obtained three dumps.
> One with my stock ~/.gdbinit tailored to mozilla thunderbird
> debugging. But it contained a set of spurious warnings related to
> files referenced in .gdbinit.
> The 2nd one was obtained after this .gdbinit file was renamed to .gdbinit.save
> to remove the spurious warnings.
> The 3rd one was obtained after I cleared ccache completely.
> I cleared ccache's cache to make sure
> that I am not using corrupt object files (for some mysterious
> reason). I use a version of ccache enhanced to support -gsplit-dwarf.
> https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support
> https://bugzilla.samba.org/show_bug.cgi?id=10005
>
> The second and third stack trace matched completely (except for the
> process ID that is printed at the end.) So I am sure ccache is not
> involved with the problem.
> So I am showing the 3rd dump below.
>
> Funny thing is that I can re-invoke top-most make -f client.mk with
> suitable environment variable setting, etc., and can create a working
> mozilla thunderbird (!?) I wonder in what condition the left over
> libnspr4.so is. Maybe the link/build system of mozilla thunderbird is
> clever enough to figure out that libnspr4.a is used instead(?), but I
> digress.

Since the linker is crashing early during the first pass, it will not
have even created the output file yet, so the libnspr4.so you have is
probably an older copy from a previous link that did not crash.

> Program received signal SIGSEGV, Segmentation fault.
> gold::Gdb_index::add_symbol (this=0x901e90, cu_index=3,
>     sym_name=0x2aaaaaaec000 <Address 0x2aaaaaaec000 out of bounds>,
>     flags=0 '\000') at gdb-index.cc:1128
> 1128          reinterpret_cast<const unsigned char*>(sym_name));
> (gdb) #0  gold::Gdb_index::add_symbol (this=0x901e90, cu_index=3,
>     sym_name=0x2aaaaaaec000 <Address 0x2aaaaaaec000 out of bounds>,
>     flags=0 '\000') at gdb-index.cc:1128
> #1  0x0000000000517602 in gold::Gdb_index_info_reader::read_pubtable (
>     this=0x7fffffff5a30, table=0x9022d0, offset=<optimized out>)
>     at gdb-index.cc:879

This is definitely helpful -- thanks for going through so much trouble
to get these stack traces. This shows that we are in the middle of
hashing a name from the .debug_pubnames (or .debug_gnu_pubnames)
table, but for some reason we have a name that runs off the end of the
table with no null-termination. That should not happen, and suggests a
corrupt .o file. It would be helpful to figure out which .o file we're
reading at this point, but I'll need you to do a bit more to collect
that...
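
To illustrate the failure mode described above, here is a minimal sketch in Python of a bounds-checked read of a null-terminated name from a table. This is not gold's code; the function name read_cstring and the sample byte strings are made up for illustration. The point is only that a reader which knows the table size can refuse a name with no terminator instead of running past the end.

```python
from typing import Optional


def read_cstring(table: bytes, offset: int) -> Optional[bytes]:
    """Return the NUL-terminated string that starts at offset.

    Returns None when the offset is outside the table or when the table
    ends before a terminating NUL byte is found (hypothetical helper,
    not part of gold).
    """
    if offset < 0 or offset >= len(table):
        return None
    end = table.find(b"\0", offset)
    if end == -1:
        # No terminator before the end of the table: the name is truncated.
        return None
    return table[offset:end]


# A well-formed table: both names are terminated.
good = b"foo\0bar\0"
# A truncated table: the second name has no terminator.
bad = b"foo\0bar"

print(read_cstring(good, 4))
print(read_cstring(bad, 4))
```

With a check of this kind, a truncated table in a corrupt .o file would surface as a reportable error rather than as a read past the end of the mapped section.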

> #2  0x00000000005176c9 in gold::Gdb_index_info_reader::read_pubnames_and_pubtypes (
>     this=0x7fffffff5a30, die=0x7fffffff5960) at gdb-index.cc:942
> #3  0x0000000000518009 in gold::Gdb_index_info_reader::visit_top_die (
>     this=0x7fffffff5a30, die=0x7fffffff5960) at gdb-index.cc:379
> #4  0x00000000005180d3 in gold::Gdb_index_info_reader::visit_compilation_unit (
>     this=0x7fffffff5a30, cu_offset=<optimized out>,
>     cu_length=<optimized out>, root_die=<optimized out>) at gdb-index.cc:326
> #5  0x000000000062a8f2 in gold::Dwarf_info_reader::do_parse<false> (
>     this=this@entry=0x7fffffff5a30) at dwarf_reader.cc:1363
> #6  0x000000000062746e in gold::Dwarf_info_reader::parse (
>     this=this@entry=0x7fffffff5a30) at dwarf_reader.cc:1234
> #7  0x00000000005187b1 in gold::Gdb_index::scan_debug_info (this=0x901e90,
>     is_type_unit=is_type_unit@entry=false, object=object@entry=0x946f90,
>     symbols=0x2aaaaaaeb150 "", symbols@entry=0xb <Address 0xb out of bounds>,
>     symbols_size=symbols_size@entry=504, shndx=<optimized out>,
>     reloc_shndx=9, reloc_type=4) at gdb-index.cc:1119
> #8  0x0000000000550939 in gold::Layout::add_to_gdb_index<64, false> (
>     this=this@entry=0x7fffffff6f30, is_type_unit=is_type_unit@entry=false,
>     object=object@entry=0x946f90, symbols=0xb <Address 0xb out of bounds>,
>     symbols@entry=0x2aaaaaaeb150 "", symbols_size=symbols_size@entry=504,
>     shndx=<optimized out>, reloc_shndx=9, reloc_type=4) at layout.cc:1569

In frame #8, the value of object->name_ would tell you which .o file
it's reading. If you can find this and send me a copy of that .o file,
I'd like to take a look at it. (Since you say this is actually a
fairly small link, you could just send me all the .o files.)

-cary

