This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


[PATCH] Fix spurious SIGTRAP when in non-stop mode


This spurious SIGTRAP was triggered when running pthreads.exp with the
backend in non-stop mode.

For a few weeks now, wait_for_inferior/fetch_inferior_event has no
longer flushed the register cache before calling target_wait; we do it
instead prior to resuming the target (within target_resume itself).

But, there are places in the linux-nat.c target that resume an LWP
without the core seeing the stop.  Those all call registers_changed
before resuming the LWP.  All, except one.  And that's the one causing
the trouble.

This is what I wrote when debugging the problem:

As can be seen below, LWP 23365 hits a breakpoint at 0x4007df, and has
its PC rewound, and the SIGTRAP discarded behind the core's back.
Later, we see that

 RSRL: resuming stopped-resumed LWP Thread 0x2aaaab90a700 (LWP 23365): step=0

triggers, and so LWP 23365 is re-resumed, which immediately re-hits
the breakpoint.  Thus its PC is advanced by 1 (this is x86_64), but
since the regcache hadn't been flushed, we still see the PC pointing
at the breakpoint address, and thus miss doing the decr_pc_after_break
adjustment in adjust_pc_after_break.  Shortly after, there's a
context-switch, which calls switch_to_thread, and flushes the
regcache.  From that point on, we see the thread's real, unadjusted
PC (0x4007e0), and because there's no breakpoint there, we can't
explain the SIGTRAP, and report it to the user as spurious.
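
To make the sequence above concrete, here is a minimal toy model of the
stale-regcache effect.  It is only a sketch: every toy_* name, the
struct, and the constants are made up for illustration and are not
GDB's real API; the only things taken from the report are the two
addresses and the 1-byte x86_64 adjustment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in GDB.  */

#define TOY_BREAKPOINT_ADDR 0x4007dfu
#define TOY_DECR_PC_AFTER_BREAK 1u	/* x86_64: int3 is one byte.  */

struct toy_thread
{
  uint64_t hw_pc;	/* PC actually held by the (simulated) LWP.  */
  uint64_t cached_pc;	/* PC as remembered by the register cache.  */
  bool cache_valid;
};

/* Read the PC through the cache, refetching only if it was flushed.  */
static uint64_t
toy_read_pc (struct toy_thread *t)
{
  if (!t->cache_valid)
    {
      t->cached_pc = t->hw_pc;
      t->cache_valid = true;
    }
  return t->cached_pc;
}

/* Stand-in for flushing the register cache.  */
static void
toy_registers_changed (struct toy_thread *t)
{
  t->cache_valid = false;
}

/* Resume a thread sitting on the breakpoint; it traps at once and
   the hardware PC ends up one byte past the breakpoint.  */
static void
toy_resume_and_trap (struct toy_thread *t, bool flush_first)
{
  if (flush_first)
    toy_registers_changed (t);
  t->hw_pc = TOY_BREAKPOINT_ADDR + TOY_DECR_PC_AFTER_BREAK;
}

/* Same idea as adjust_pc_after_break: rewind the PC if it sits just
   past a breakpoint.  Returns true if an adjustment was made.  */
static bool
toy_adjust_pc_after_break (struct toy_thread *t)
{
  uint64_t pc = toy_read_pc (t);

  if (pc - TOY_DECR_PC_AFTER_BREAK == TOY_BREAKPOINT_ADDR)
    {
      t->hw_pc = TOY_BREAKPOINT_ADDR;
      t->cached_pc = TOY_BREAKPOINT_ADDR;
      return true;
    }
  return false;
}

/* Run the scenario and return the PC seen after the context switch.  */
static uint64_t
toy_scenario (bool flush_before_resume)
{
  struct toy_thread t = { TOY_BREAKPOINT_ADDR, 0, false };

  (void) toy_read_pc (&t);	/* Resume path reads PC, filling the cache.  */
  toy_resume_and_trap (&t, flush_before_resume);
  (void) toy_adjust_pc_after_break (&t);
  toy_registers_changed (&t);	/* Context switch flushes the cache.  */
  return toy_read_pc (&t);
}
```

In this model, toy_scenario (false) ends at 0x4007e0, where no
breakpoint explains the trap, while toy_scenario (true) ends back at
0x4007df because the adjustment saw the fresh PC.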

infrun: Waiting for specific thread Thread 0x2aaaab709700 (LWP 23364).
linux_nat_wait: [Thread 0x2aaaab709700 (LWP 23364)]
LLW: enter
LLW: Waiting for specific LWP Thread 0x2aaaab709700 (LWP 23364).
LNW: waitpid(-1, ...) returned 23365, ERRNO-OK
LLW: waitpid 23365 received Trace/breakpoint trap (stopped)
LLTA: KILL(SIG0) Thread 0x2aaaab90a700 (LWP 23365) (OK)
CB: Push back breakpoint for Thread 0x2aaaab90a700 (LWP 23365)
LNW: waitpid(-1, ...) returned 23364, ERRNO-OK
LLW: waitpid 23364 received Trace/breakpoint trap (stopped)
LLTA: KILL(SIG0) Thread 0x2aaaab709700 (LWP 23364) (OK)
LLW: Candidate event Trace/breakpoint trap (stopped) in Thread 0x2aaaab709700 (LWP 23364).
LLW: trap ptid is Thread 0x2aaaab709700 (LWP 23364).
LLW: exit
infrun: target_wait (23361.23364.0 [Thread 0x2aaaab709700 (LWP 23364)], status) =
infrun:   23361.23364.0 [Thread 0x2aaaab709700 (LWP 23364)],
infrun:   status->kind = stopped, signal = SIGTRAP
infrun: not adjusted PC
infrun: infwait_thread_hop_state
infrun: TARGET_WAITKIND_STOPPED
displaced: restored Thread 0x2aaaab709700 (LWP 23364) 0x4006f2
displaced: fixup (0x4007df, 0x4006f2), insn = 0x8b 0x81 ...
displaced: restoring reg 2 to 0x2aaaab21f5ad
displaced: relocated %rip from 0x4006f8 to 0x4007e5
infrun: stop_pc = 0x4007e5
infrun: no stepping, continue
infrun: resume (step=0, signal=0), trap_expected=0, current thread [Thread 0x2aaaab709700 (LWP 23364)] at 0x4007e5
infrun: do_target_resume: executing
infrun: do_target_resume: executing
infrun: resuming threads individually
infrun: Not resuming Thread 0x2aaaab90a700 (LWP 23365) (executing)
infrun: resuming Thread 0x2aaaab709700 (LWP 23364) (no pending status)
LLR: Preparing to resume Thread 0x2aaaab709700 (LWP 23364), 0, inferior_ptid Thread 0x2aaaab709700 (LWP 23364)
LLR: PTRACE_CONT process 23364, 0 (resume event thread)
infrun: Not resuming Thread 0x2aaaab506b40 (LWP 23361) (executing)
infrun: prepare_to_wait
linux_nat_wait: [process -1]
RSRL: resuming stopped-resumed LWP Thread 0x2aaaab90a700 (LWP 23365): step=0
LLW: enter
LNW: waitpid(-1, ...) returned 0, ERRNO-OK
LLW: exit (ignore)
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: TARGET_WAITKIND_IGNORE
infrun: prepare_to_wait
sigchld
sigchld
linux_nat_wait: [process -1]
LLW: enter
LNW: waitpid(-1, ...) returned 23365, ERRNO-OK
LLW: waitpid 23365 received Trace/breakpoint trap (stopped)
LLTA: KILL(SIG0) Thread 0x2aaaab90a700 (LWP 23365) (OK)
LLW: Candidate event Trace/breakpoint trap (stopped) in Thread 0x2aaaab90a700 (LWP 23365).
LLW: trap ptid is Thread 0x2aaaab90a700 (LWP 23365).
LLW: exit
infrun: target_wait (-1.0.0, status) =
infrun:   23361.23365.0 [Thread 0x2aaaab90a700 (LWP 23365)],
infrun:   status->kind = stopped, signal = SIGTRAP
infrun: not adjusted PC
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x4007df
infrun: context switch
infrun: Switching context from Thread 0x2aaaab709700 (LWP 23364) to Thread 0x2aaaab90a700 (LWP 23365)
infrun: random signal 5

Program received signal SIGTRAP, Trace/breakpoint trap.

...

[Switching to Thread 0x2aaaab90a700 (LWP 23365)]
0x00000000004007e0 in common_routine (arg=2) at ../../../gdb/gdb/testsuite/gdb.threads/pthreads.c:52
52	  if (verbose) printf("common_routine (%d)\n", arg);


Tested on x86_64-linux.  Will apply in a bit.

2011-12-05  Pedro Alves  <pedro@codesourcery.com>

	gdb/
	* linux-nat.c (resume_stopped_resumed_lwps): Call registers_changed.
---
 gdb/linux-nat.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index d54f303..19b4b57 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -3921,24 +3921,28 @@ resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
       && lp->status == 0
       && lp->waitstatus.kind == TARGET_WAITKIND_IGNORE)
     {
+      struct regcache *regcache = get_thread_regcache (lp->ptid);
+      struct gdbarch *gdbarch = get_regcache_arch (regcache);
+      CORE_ADDR pc = regcache_read_pc (regcache);
+
       gdb_assert (is_executing (lp->ptid));
 
       /* Don't bother if there's a breakpoint at PC that we'd hit
 	 immediately, and we're not waiting for this LWP.  */
       if (!ptid_match (lp->ptid, *wait_ptid_p))
 	{
-	  struct regcache *regcache = get_thread_regcache (lp->ptid);
-	  CORE_ADDR pc = regcache_read_pc (regcache);
-
 	  if (breakpoint_inserted_here_p (get_regcache_aspace (regcache), pc))
 	    return 0;
 	}
 
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
-			    "RSRL: resuming stopped-resumed LWP %s\n",
-			    target_pid_to_str (lp->ptid));
+			    "RSRL: resuming stopped-resumed LWP %s at %s: step=%d\n",
+			    target_pid_to_str (lp->ptid),
+			    paddress (gdbarch, pc),
+			    lp->step);
 
+      registers_changed ();
       linux_ops->to_resume (linux_ops, pid_to_ptid (GET_LWP (lp->ptid)),
 			    lp->step, TARGET_SIGNAL_0);
       lp->stopped = 0;

