This is the mail archive of the
gdb@sources.redhat.com
mailing list for the GDB project.
Re: GDB locks up -- Cannot find new threads: generic error
- From: Daniel Jacobowitz <drow at false dot org>
- To: David Lecomber <david at allinea dot com>
- Cc: Andreas Schwab <schwab at suse dot de>, gdb <gdb at sources dot redhat dot com>
- Date: Tue, 3 May 2005 10:48:44 -0400
- Subject: Re: GDB locks up -- Cannot find new threads: generic error
- References: <1114627357.31720.81.camel@cpc4-oxfd5-5-0-cust111.oxfd.cable.ntl.com> <20050427190108.GA28978@nevyn.them.org> <jefyxbvnwd.fsf@sykes.suse.de> <1115130086.1638.27.camel@delmo.priv.wark.uk.streamline-computing.com>
On Tue, May 03, 2005 at 03:21:25PM +0100, David Lecomber wrote:
> > >> The system is:
> > >> kernel-2.4.21-27.EL
> > >> glibc-2.3.2-95.30
> > >
> > > At a guess, your kernel is buggered. You really should never see that
> > > warning. The unexpected signal is SIGCHLD; your kernel has accepted
> > > the SETOPTIONS but obviously failed to stop when the test thread
> > > vforked.
> >
> > I think that can happen when the 32 bit ptrace emulation is incomplete,
> > especially if PTRACE_GETEVENTMSG is not properly emulated. That should be
> > fixed in recent (< 9 months) kernels.
>
> Hello Andreas,
>
> I can reproduce this on a SuSE Opteron machine - running 2.6.8-24.13 -
> and 2.6.8 came out 13th August (?). How - other than brokenness - can I
> test if this PTRACE_GETEVENTMSG is the problem?
I assume your GDB is built as a 32-bit application?
If it is broken, then the result will be 64-bit despite the fact that
GDB is a 32-bit binary. We could detect this and disable the feature,
but better still would be to detect and handle it. All relevant code
is in linux-nat.c.
Create a type:
  union event_msg {
    long l;
    long long ll;
  };
Initialize LL to zero. Pass that to ptrace instead of &second_pid.
Check the result. If L is non-zero, we can use that; if it isn't,
but LL is non-zero, we need to use LL. Save the result of this test in
a global variable and update all callers. This won't catch all cases,
depending on endianness, but it ought to work anyway.
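The check described above can be sketched in isolation, without calling ptrace at all. Everything named here is hypothetical (event_msg_probe, probe_init, probe_classify, fake_write32, fake_write64); none of it is in linux-nat.c. Fixed-width types stand in for the 32-bit long and 64-bit long long of a 32-bit GDB build, the two fake_write helpers only imitate a kernel that stores 4 or 8 bytes, and the sentinel in the second word is an assumed way of telling "untouched" apart from "overwritten with zero".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel placed in the second 32-bit word before the call. */
#define PROBE_SENTINEL 0x42000000u

/* Stand-in for the union: half[0] overlays a 32-bit long result,
   whole overlays a 64-bit long long result.  */
union event_msg_probe
{
  uint32_t half[2];
  uint64_t whole;
};

enum probe_result
{
  PROBE_NO_RESPONSE,       /* nothing was written */
  PROBE_WIDE,              /* first word zero, second word changed: use whole */
  PROBE_NARROW,            /* only the first word was written */
  PROBE_NARROW_CLOBBERED   /* first word usable, second word overwritten */
};

/* Prepare the buffer before handing its address to the (simulated) kernel. */
static void
probe_init (union event_msg_probe *p)
{
  assert (p != NULL);
  p->half[0] = 0;
  p->half[1] = PROBE_SENTINEL;
}

/* Decide which view of the buffer holds the PID and store it in *pid_out. */
static enum probe_result
probe_classify (const union event_msg_probe *p, uint64_t *pid_out)
{
  assert (p != NULL && pid_out != NULL);
  if (p->half[0] == 0 && p->half[1] == PROBE_SENTINEL)
    {
      *pid_out = 0;
      return PROBE_NO_RESPONSE;
    }
  if (p->half[0] == 0)
    {
      *pid_out = p->whole;
      return PROBE_WIDE;
    }
  *pid_out = p->half[0];
  return p->half[1] == PROBE_SENTINEL ? PROBE_NARROW : PROBE_NARROW_CLOBBERED;
}

/* Imitate a kernel that stores a 32-bit result. */
static void
fake_write32 (union event_msg_probe *p, uint32_t pid)
{
  memcpy (p, &pid, sizeof pid);
}

/* Imitate a kernel that stores a 64-bit result into the same address. */
static void
fake_write64 (union event_msg_probe *p, uint64_t pid)
{
  memcpy (p, &pid, sizeof pid);
}
```

With a simulated 64-bit store, a little-endian host lands in the "clobbered" case and a big-endian host in the "wide" case; either way the PID comes back intact, which is the sense in which the test does not catch every case but still works.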
I don't see how it's going to help x86 though. Little endian; the
worst that would happen is a couple bytes on the stack clobbered.
The PID should be OK.
Anyway, like the attached. Want to try it? I left it noisy for
testing. It seems to do the expected thing on i386.
--
Daniel Jacobowitz
CodeSourcery, LLC
Index: linux-nat.c
===================================================================
RCS file: /cvs/src/src/gdb/linux-nat.c,v
retrieving revision 1.27
diff -u -p -r1.27 linux-nat.c
--- linux-nat.c 6 Mar 2005 16:42:20 -0000 1.27
+++ linux-nat.c 3 May 2005 14:48:05 -0000
@@ -109,6 +109,20 @@ static int linux_supports_tracefork_flag
static int linux_supports_tracevforkdone_flag = -1;
+/* Normally PTRACE_GETEVENTMSG returns a long int. But on some 64-bit
+ systems, even with 32-bit long, it will return a long long. For
+ instance, some x86_64 kernels had broken 32-bit emulation for this
+ option. MIPS n32 also does this. */
+
+union ptrace_event_msg
+{
+ long l;
+ long long ll;
+ long la[2];
+};
+
+static int linux_geteventmsg_uses_long_long = 0;
+
/* Trivial list manipulation functions to keep track of a list of
new stopped processes. */
@@ -189,6 +203,7 @@ static void
linux_test_for_tracefork (int original_pid)
{
int child_pid, ret, status;
+ union ptrace_event_msg event;
long second_pid;
linux_supports_tracefork_flag = 0;
@@ -247,8 +262,30 @@ linux_test_for_tracefork (int original_p
if (ret == child_pid && WIFSTOPPED (status)
&& status >> 16 == PTRACE_EVENT_FORK)
{
- second_pid = 0;
- ret = ptrace (PTRACE_GETEVENTMSG, child_pid, 0, &second_pid);
+ event.la[0] = 0;
+ event.la[1] = 0x42000000;
+ ret = ptrace (PTRACE_GETEVENTMSG, child_pid, 0, &event);
+ if (event.la[0] == 0 && event.la[1] == 0x42000000)
+ {
+ second_pid = 0;
+ warning ("linux_test_for_tracefork: No response");
+ }
+ else if (event.la[0] == 0 && event.la[1] != 0x42000000)
+ {
+ linux_geteventmsg_uses_long_long = 1;
+ second_pid = event.ll;
+ warning ("linux_test_for_tracefork: Needed to use long long");
+ }
+ else if (event.la[0] != 0 && event.la[1] == 0x42000000)
+ {
+ second_pid = event.l;
+ warning ("linux_test_for_tracefork: Needed to use long, as expected");
+ }
+ else
+ {
+ second_pid = event.l;
+ warning ("linux_test_for_tracefork: Needed to use long, but second half was clobbered");
+ }
if (ret == 0 && second_pid != 0)
{
int second_status;
@@ -484,10 +521,12 @@ linux_handle_extended_wait (int pid, int
if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK
|| event == PTRACE_EVENT_CLONE)
{
+ union ptrace_event_msg event_msg;
unsigned long new_pid;
int ret;
- ptrace (PTRACE_GETEVENTMSG, pid, 0, &new_pid);
+ ptrace (PTRACE_GETEVENTMSG, pid, 0, &event_msg);
+ new_pid = linux_geteventmsg_uses_long_long ? event_msg.ll : event_msg.l;
/* If we haven't already seen the new PID stop, wait for it now. */
if (! pull_pid_from_list (&stopped_pids, new_pid))