This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Should this be on the blocker list for the 7.10 release?

From: Pedro Alves <palves at redhat dot com>
To: Joel Brobecker <brobecker at adacore dot com>, Simon Marchi <simon dot marchi at ericsson dot com>
Cc: gdb-patches <gdb-patches at sourceware dot org>
Date: Tue, 07 Jul 2015 19:04:38 +0100
Subject: Re: Should this be on the blocker list for the 7.10 release?
Authentication-results: sourceware.org; auth=none
References: <559AE482 dot 1010109 at ericsson dot com> <20150707132459 dot GA16734 at adacore dot com> <559BFBBD dot 4000303 at redhat dot com>

On 07/07/2015 05:18 PM, Pedro Alves wrote:
> On 07/07/2015 02:24 PM, Joel Brobecker wrote:
> 
>> Not sure. I think Pedro would be in a better position to answer.
>> For now, I've put this issue as a "maybe" for 7.10; so we will not
>> release until this is fixed, or we explicitly decide it's OK for 7.10.
>>
>> Pedro?
> 
> Let me take a look and understand this better.

OK, the issue is that the new clone thread is found while inside
the linux_stop_and_wait_all_lwps call in this new bit of
code in linux-thread-db.c:

      linux_stop_and_wait_all_lwps ();

      ALL_LWPS (lp)
	if (ptid_get_pid (lp->ptid) == pid)
	  thread_from_lwp (lp->ptid);

      linux_unstop_all_lwps ();

We reach linux_handle_extended_wait with the "stopping"
parameter set to 1, and because of that we don't mark the
new lwp as resumed.  As consequence, the subsequent
resume_stopped_resumed_lwps (called first from that
linux_unstop_all_lwps) never resumes the new LWP...

There's lots of cruft in linux_handle_extended_wait that no
longer makes sense.  This seems to fix your github test
for me, and causes no testsuite regressions.

Did you try converting your test case to a proper
GDB test?  That'd be much appreciated.

---
>From a4f205a18dffaff3344b31e9b8009b1c0de8ba80 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 7 Jul 2015 17:42:52 +0100
Subject: [PATCH] fix

---
 gdb/linux-nat.c | 91 +++++++++++++++++++++++++--------------------------------
 1 file changed, 40 insertions(+), 51 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index be429f8..ea38ebb 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2086,43 +2086,7 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 	  new_lp = add_lwp (ptid_build (ptid_get_pid (lp->ptid), new_pid, 0));
 	  new_lp->cloned = 1;
 	  new_lp->stopped = 1;
-
-	  if (WSTOPSIG (status) != SIGSTOP)
-	    {
-	      /* This can happen if someone starts sending signals to
-		 the new thread before it gets a chance to run, which
-		 have a lower number than SIGSTOP (e.g. SIGUSR1).
-		 This is an unlikely case, and harder to handle for
-		 fork / vfork than for clone, so we do not try - but
-		 we handle it for clone events here.  We'll send
-		 the other signal on to the thread below.  */
-
-	      new_lp->signalled = 1;
-	    }
-	  else
-	    {
-	      struct thread_info *tp;
-
-	      /* When we stop for an event in some other thread, and
-		 pull the thread list just as this thread has cloned,
-		 we'll have seen the new thread in the thread_db list
-		 before handling the CLONE event (glibc's
-		 pthread_create adds the new thread to the thread list
-		 before clone'ing, and has the kernel fill in the
-		 thread's tid on the clone call with
-		 CLONE_PARENT_SETTID).  If that happened, and the core
-		 had requested the new thread to stop, we'll have
-		 killed it with SIGSTOP.  But since SIGSTOP is not an
-		 RT signal, it can only be queued once.  We need to be
-		 careful to not resume the LWP if we wanted it to
-		 stop.  In that case, we'll leave the SIGSTOP pending.
-		 It will later be reported as GDB_SIGNAL_0.  */
-	      tp = find_thread_ptid (new_lp->ptid);
-	      if (tp != NULL && tp->stop_requested)
-		new_lp->last_resume_kind = resume_stop;
-	      else
-		status = 0;
-	    }
+	  new_lp->resumed = 1;
 
 	  /* If the thread_db layer is active, let it record the user
 	     level thread id and status, and add the thread to GDB's
@@ -2136,19 +2100,23 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 	    }
 
 	  /* Even if we're stopping the thread for some reason
-	     internal to this module, from the user/frontend's
-	     perspective, this new thread is running.  */
+	     internal to this module, from the perspective of infrun
+	     and the user/frontend, this new thread is running until
+	     it next reports a stop.  */
 	  set_running (new_lp->ptid, 1);
-	  if (!stopping)
-	    {
-	      set_executing (new_lp->ptid, 1);
-	      /* thread_db_attach_lwp -> lin_lwp_attach_lwp forced
-		 resume_stop.  */
-	      new_lp->last_resume_kind = resume_continue;
-	    }
+	  set_executing (new_lp->ptid, 1);
 
-	  if (status != 0)
+	  if (WSTOPSIG (status) != SIGSTOP)
 	    {
+	      /* This can happen if someone starts sending signals to
+		 the new thread before it gets a chance to run, which
+		 have a lower number than SIGSTOP (e.g. SIGUSR1).
+		 This is an unlikely case, and harder to handle for
+		 fork / vfork than for clone, so we do not try - but
+		 we handle it for clone events here.  */
+
+	      new_lp->signalled = 1;
+
 	      /* We created NEW_LP so it cannot yet contain STATUS.  */
 	      gdb_assert (new_lp->status == 0);
 
@@ -2162,7 +2130,6 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 	      new_lp->status = status;
 	    }
 
-	  new_lp->resumed = !stopping;
 	  return 1;
 	}
 
@@ -3673,9 +3640,31 @@ resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
 {
   ptid_t *wait_ptid_p = data;
 
-  if (lp->stopped
-      && lp->resumed
-      && !lwp_status_pending_p (lp))
+  if (!lp->stopped)
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "RSRL: NOT resuming stopped-resumed LWP %s, "
+			    "not stopped\n",
+			    target_pid_to_str (lp->ptid));
+    }
+  else if (!lp->resumed)
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "RSRL: NOT resuming stopped-resumed LWP %s, "
+			    "not resumed\n",
+			    target_pid_to_str (lp->ptid));
+    }
+  else if (lwp_status_pending_p (lp))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "RSRL: NOT resuming stopped-resumed LWP %s, "
+			    "has pending status\n",
+			    target_pid_to_str (lp->ptid));
+    }
+  else
     {
       struct regcache *regcache = get_thread_regcache (lp->ptid);
       struct gdbarch *gdbarch = get_regcache_arch (regcache);
-- 
1.9.3

Follow-Ups:
- Re: Should this be on the blocker list for the 7.10 release?
  - From: Simon Marchi

References:
- Should this be on the blocker list for the 7.10 release?
  - From: Simon Marchi
- Re: Should this be on the blocker list for the 7.10 release?
  - From: Joel Brobecker
- Re: Should this be on the blocker list for the 7.10 release?
  - From: Pedro Alves

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]