This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: Should this be on the blocker list for the 7.10 release?
- From: Pedro Alves <palves at redhat dot com>
- To: Joel Brobecker <brobecker at adacore dot com>, Simon Marchi <simon dot marchi at ericsson dot com>
- Cc: gdb-patches <gdb-patches at sourceware dot org>
- Date: Tue, 07 Jul 2015 19:04:38 +0100
- Subject: Re: Should this be on the blocker list for the 7.10 release?
- References: <559AE482 dot 1010109 at ericsson dot com> <20150707132459 dot GA16734 at adacore dot com> <559BFBBD dot 4000303 at redhat dot com>
On 07/07/2015 05:18 PM, Pedro Alves wrote:
> On 07/07/2015 02:24 PM, Joel Brobecker wrote:
>
>> Not sure. I think Pedro would be in a better position to answer.
>> For now, I've put this issue as a "maybe" for 7.10; so we will not
>> release until this is fixed, or we explicitly decide it's OK for 7.10.
>>
>> Pedro?
>
> Let me take a look and understand this better.
OK, the issue is that the new clone thread is found while inside
the linux_stop_and_wait_all_lwps call in this new bit of
code in linux-thread-db.c:
  linux_stop_and_wait_all_lwps ();

  ALL_LWPS (lp)
    if (ptid_get_pid (lp->ptid) == pid)
      thread_from_lwp (lp->ptid);

  linux_unstop_all_lwps ();
We reach linux_handle_extended_wait with the "stopping"
parameter set to 1, and because of that we don't mark the
new lwp as resumed. As a consequence, the subsequent
resume_stopped_resumed_lwps (first called from that
linux_unstop_all_lwps call) never resumes the new LWP...
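
To make the sequence above concrete, here is a rough toy model of it.
It is only a sketch: struct toy_lwp and both toy_* functions are
made-up names for illustration, not GDB code, and they reduce the
real lwp_info bookkeeping to three flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few lwp_info fields discussed
   above; the real structure in linux-nat.h has many more.  */
struct toy_lwp
{
  bool stopped;        /* LWP is currently stopped.  */
  bool resumed;        /* The core wants this LWP running.  */
  bool status_pending; /* A wait status is queued for reporting.  */
};

/* Simplified model of the clone handling as it behaves before the
   fix: the new LWP starts out stopped, and is only marked resumed
   when STOPPING is 0.  */
static void
toy_handle_clone (struct toy_lwp *new_lp, int stopping)
{
  assert (new_lp != NULL);

  new_lp->stopped = true;
  new_lp->status_pending = false;
  new_lp->resumed = !stopping;
}

/* Simplified model of the resume_stopped_resumed_lwps check: only
   an LWP that is stopped, marked resumed, and has no pending status
   is set running again.  Returns true if the LWP was resumed.  */
static bool
toy_resume_if_stopped_resumed (struct toy_lwp *lp)
{
  assert (lp != NULL);

  if (lp->stopped && lp->resumed && !lp->status_pending)
    {
      lp->stopped = false;
      return true;
    }

  return false;
}
```

In this model, a clone seen with stopping == 1 leaves resumed at 0, so
the later resume pass skips the LWP and it stays stopped; with
stopping == 0 the same pass sets it running.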
There's lots of cruft in linux_handle_extended_wait that no
longer makes sense. The patch below seems to fix your github
test for me, and causes no testsuite regressions.
Did you try converting your test case to a proper
GDB test? That'd be much appreciated.
---
From a4f205a18dffaff3344b31e9b8009b1c0de8ba80 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 7 Jul 2015 17:42:52 +0100
Subject: [PATCH] fix
---
gdb/linux-nat.c | 91 +++++++++++++++++++++++++--------------------------------
1 file changed, 40 insertions(+), 51 deletions(-)
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index be429f8..ea38ebb 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2086,43 +2086,7 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
new_lp = add_lwp (ptid_build (ptid_get_pid (lp->ptid), new_pid, 0));
new_lp->cloned = 1;
new_lp->stopped = 1;
-
- if (WSTOPSIG (status) != SIGSTOP)
- {
- /* This can happen if someone starts sending signals to
- the new thread before it gets a chance to run, which
- have a lower number than SIGSTOP (e.g. SIGUSR1).
- This is an unlikely case, and harder to handle for
- fork / vfork than for clone, so we do not try - but
- we handle it for clone events here. We'll send
- the other signal on to the thread below. */
-
- new_lp->signalled = 1;
- }
- else
- {
- struct thread_info *tp;
-
- /* When we stop for an event in some other thread, and
- pull the thread list just as this thread has cloned,
- we'll have seen the new thread in the thread_db list
- before handling the CLONE event (glibc's
- pthread_create adds the new thread to the thread list
- before clone'ing, and has the kernel fill in the
- thread's tid on the clone call with
- CLONE_PARENT_SETTID). If that happened, and the core
- had requested the new thread to stop, we'll have
- killed it with SIGSTOP. But since SIGSTOP is not an
- RT signal, it can only be queued once. We need to be
- careful to not resume the LWP if we wanted it to
- stop. In that case, we'll leave the SIGSTOP pending.
- It will later be reported as GDB_SIGNAL_0. */
- tp = find_thread_ptid (new_lp->ptid);
- if (tp != NULL && tp->stop_requested)
- new_lp->last_resume_kind = resume_stop;
- else
- status = 0;
- }
+ new_lp->resumed = 1;
/* If the thread_db layer is active, let it record the user
level thread id and status, and add the thread to GDB's
@@ -2136,19 +2100,23 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
}
/* Even if we're stopping the thread for some reason
- internal to this module, from the user/frontend's
- perspective, this new thread is running. */
+ internal to this module, from the perspective of infrun
+ and the user/frontend, this new thread is running until
+ it next reports a stop. */
set_running (new_lp->ptid, 1);
- if (!stopping)
- {
- set_executing (new_lp->ptid, 1);
- /* thread_db_attach_lwp -> lin_lwp_attach_lwp forced
- resume_stop. */
- new_lp->last_resume_kind = resume_continue;
- }
+ set_executing (new_lp->ptid, 1);
- if (status != 0)
+ if (WSTOPSIG (status) != SIGSTOP)
{
+ /* This can happen if someone starts sending signals to
+ the new thread before it gets a chance to run, which
+ have a lower number than SIGSTOP (e.g. SIGUSR1).
+ This is an unlikely case, and harder to handle for
+ fork / vfork than for clone, so we do not try - but
+ we handle it for clone events here. */
+
+ new_lp->signalled = 1;
+
/* We created NEW_LP so it cannot yet contain STATUS. */
gdb_assert (new_lp->status == 0);
@@ -2162,7 +2130,6 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
new_lp->status = status;
}
- new_lp->resumed = !stopping;
return 1;
}
@@ -3673,9 +3640,31 @@ resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
{
ptid_t *wait_ptid_p = data;
- if (lp->stopped
- && lp->resumed
- && !lwp_status_pending_p (lp))
+ if (!lp->stopped)
+ {
+ if (debug_linux_nat)
+ fprintf_unfiltered (gdb_stdlog,
+ "RSRL: NOT resuming stopped-resumed LWP %s, "
+ "not stopped\n",
+ target_pid_to_str (lp->ptid));
+ }
+ else if (!lp->resumed)
+ {
+ if (debug_linux_nat)
+ fprintf_unfiltered (gdb_stdlog,
+ "RSRL: NOT resuming stopped-resumed LWP %s, "
+ "not resumed\n",
+ target_pid_to_str (lp->ptid));
+ }
+ else if (lwp_status_pending_p (lp))
+ {
+ if (debug_linux_nat)
+ fprintf_unfiltered (gdb_stdlog,
+ "RSRL: NOT resuming stopped-resumed LWP %s, "
+ "has pending status\n",
+ target_pid_to_str (lp->ptid));
+ }
+ else
{
struct regcache *regcache = get_thread_regcache (lp->ptid);
struct gdbarch *gdbarch = get_regcache_arch (regcache);
--
1.9.3