This is the mail archive of the
gdb-prs@sourceware.org
mailing list for the GDB project.
Re: corefiles/2121: gdb crashes sometimes on huge segment mappings
- From: Lynn Kerby <lfk at kerbit dot net>
- To: nobody at sources dot redhat dot com
- Cc: gdb-prs at sources dot redhat dot com,
- Date: 3 May 2006 21:28:01 -0000
- Reply-to: Lynn Kerby <lfk at kerbit dot net>
The following reply was made to PR corefiles/2121; it has been noted by GNATS.
From: Lynn Kerby <lfk@kerbit.net>
To: Daniel Jacobowitz <drow@false.org>
Cc: gdb-gnats@sources.redhat.com
Subject: Re: corefiles/2121: gdb crashes sometimes on huge segment mappings
Date: Wed, 3 May 2006 14:26:20 -0700
On May 3, 2006, at 1:35 PM, Daniel Jacobowitz wrote:
> On Wed, May 03, 2006 at 08:18:37PM -0000, lfk@kerbit.net wrote:
>> I'm trying to debug core files (and live processes) that have large
>> address spaces (> 2 gig data) and have found that gdb is unable to
>> deal with some of the resulting core files. I had been using the
>> 6.1post packages from RedHat which contain fixes for largefile
>> support in the BFD library and other areas and recently decided to
>> try upgrading to the 6.4 release (the number of patches in 6.1post is
>> quite large) because I would occasionally come across a process/core
>> that GDB couldn't handle.
>>
>> There is a fundamental issue with the code that handles segment
>> mappings in the inferior process or core file when those mappings are
>> larger than SSIZE_MAX (2G - 1 on x86/32bit). The fread/fwrite stdio
>> calls simply return 0 if the size (after being munged into a signed
>> type) is < 0. This results in both unreadable core files and gdb
>> crashes at times.
>
>>> How-To-Repeat:
>
>> Create a program that mallocs a 2.2GB chunk of space (note a large
>> memory RedHat ES/AS config may be required), run it under gdb.
>
> That's not much information about how to reproduce it. What did you
> have to _do_ in GDB to cause a problem?
Sorry, I left out one step: generating a core file with the gcore
command. It really isn't terribly difficult to reproduce.
Here is the output using the base 6.4 code compiled for 64 bit BFD:
> [root@lumpref20 gdb]# ./gdb /usr/src/redhat/SPECS/x
> GNU gdb 6.4
> Copyright 2005 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols
> found)
> Using host libthread_db library "/lib/tls/libthread_db.so.1".
>
> Setting up the environment for debugging gdb.
> Function "internal_error" not defined.
> Function "info_command" not defined.
> .gdbinit:8: Error in sourced command file:
> No breakpoint number 0.
> (gdb) r
> Starting program: /usr/src/redhat/SPECS/x
> (no debugging symbols found)
> (no debugging symbols found)
> pid 9720
>
> Program received signal SIGINT, Interrupt.
> 0x008a97bb in __nanosleep_nocancel () from /lib/tls/libc.so.6
> (gdb) gcore
> Segmentation fault (core dumped)
I'm also attaching the source file x.c which does a large malloc,
prints its pid (for easy attach), and sleeps.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>  /* getpid, sleep */

int main(void)
{
    void *buf;

    buf = malloc(2300LL * 1024 * 1024); /* 2300 MiB, just over 2.2G */
    if (buf != NULL) {
        /* print the pid for easy attach, then wait */
        fprintf(stderr, "pid %d\n", (int)getpid());
        sleep(10000);
    } else {
        fprintf(stderr, "malloc of 2.2gb buffer failed\n");
    }
    exit(0);
}
> I think your patch is barking up the wrong tree; these interfaces are
> perfectly fine, they're just not supposed to be used for unboundedly
> large transfers. If GDB ever needs to malloc a 2.2GB buffer to read
> into, it's already doing something wrong.
I agree with you on the last part: GDB shouldn't need to malloc a 2.2GB
buffer to read or write into, but...
I changed the type of the length arg from int to bfd_signed_vma and
LONGEST on some of the calls because it initially looked like an
overflow/underflow caused by sign extension at various places.
I could probably be easily convinced that those changes are not
"required".
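To illustrate the overflow concern described above, here is a minimal sketch (not taken from the patch or from GDB) that checks whether a segment length survives being passed through a signed 32-bit length argument. The helper name len_fits_int32 is hypothetical, and the sketch assumes int is 32 bits wide, as on the i686 host in this report.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GDB or BFD.
   Returns 1 if LEN can be carried in a signed 32-bit length
   argument without changing value, 0 otherwise. */
static int len_fits_int32(uint64_t len)
{
    return len <= (uint64_t)INT32_MAX;
}
```

For the 2300 MiB mapping from x.c, len_fits_int32(2300ULL * 1024 * 1024) returns 0, since 2411724800 exceeds INT32_MAX (2147483647).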
As for barking up the wrong tree... You could be right, but I've
spent a fair bit of time debugging this issue and have found that the
patch works. I may be too close to the hardware here and seeing things
a little skewed as a result. If so, please feel free to point me at a
simpler solution (I'll even settle for more correct and complicated).
In fact, I think only the gdb/target.c change should be required to fix
this issue, but somehow the BFD code is getting involved (via a call
to bfd_set_section_contents).
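One way to read the point about unbounded transfers is that a large region can be moved in bounded pieces, so that no single call sees a length near the signed limit. The following is a rough sketch in plain C stdio, not GDB or BFD code; write_chunked and CHUNK_MAX are hypothetical names, and the 1 MiB cap is an arbitrary choice.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-call cap (1 MiB); any bound well below
   the signed 32-bit limit would do. */
#define CHUNK_MAX ((size_t)1 << 20)

/* Write LEN bytes from BUF to FP in pieces of at most CHUNK_MAX
   bytes.  Returns the number of bytes actually written, which is
   less than LEN if fwrite reports a short write. */
static size_t write_chunked(const unsigned char *buf, size_t len, FILE *fp)
{
    size_t done = 0;

    while (done < len) {
        size_t n = len - done;
        size_t w;

        if (n > CHUNK_MAX)
            n = CHUNK_MAX;
        w = fwrite(buf + done, 1, n, fp);
        done += w;
        if (w < n)
            break;  /* short write: stop and report progress so far */
    }
    return done;
}
```

The caller compares the return value against len to detect a partial write; the same loop shape would apply to reads.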
> --
> Daniel Jacobowitz
> CodeSourcery
Hopefully this will clear things up.
Lynn Kerby