This is the mail archive of the gdb-prs@sourceware.org mailing list for the GDB project.

[Bug gdb/10833] GDB crashes on debugging multithreaded program on ARM SMP dual core system

From: "pedro at codesourcery dot com" <sourceware-bugzilla at sourceware dot org>
To: gdb-prs at sourceware dot org
Date: Sat, 10 Sep 2011 17:42:55 +0000
Subject: [Bug gdb/10833] GDB crashes on debugging multithreaded program on ARM SMP dual core system
Auto-submitted: auto-generated
References: <bug-10833-4717@http.sourceware.org/bugzilla/>

http://sourceware.org/bugzilla/show_bug.cgi?id=10833

--- Comment #11 from Pedro Alves <pedro at codesourcery dot com> 2011-09-10 17:42:55 UTC ---
Yes, I'd suspect something like that.  If there's a flush missing, it'll be a
bug on the kernel side: ptrace does that for us.  (It's a common-ish kernel
bug on new ports.)

In this particular case, the crash happens while stepping over a magic
breakpoint gdb sets in libc, to be able to track DSO loads/unloads.  We call
this the shlib event breakpoint.

> infrun: stop_pc = 0x40059d64
> bpstat_what: bp_shlib_event

Here it is being hit.

> infrun: BPSTAT_WHAT_SINGLE

GDB decides it should single-step past this breakpoint (remove the
breakpoint, single-step, reinsert, resume).

At this point, the breakpoint is removed.
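
The step-over sequence described above (remove, single-step, reinsert,
resume) can be modeled roughly as follows.  This is only a toy sketch, not
GDB's code: the ToyTarget class, the BREAK marker and the instruction names
are all made up for illustration, and the "program" is straight-line code.

```python
# Toy model of stepping over an inserted breakpoint.  Not GDB code:
# ToyTarget, BREAK and the instruction names are hypothetical.

BREAK = "BREAK"  # stands in for a breakpoint instruction


class ToyTarget:
    def __init__(self, program):
        self.mem = dict(program)   # address -> instruction name
        self.shadow = {}           # address -> original instruction
        self.pc = min(self.mem)    # start at the lowest address
        self.log = []              # instructions actually executed

    def insert_breakpoint(self, addr):
        # Save the original instruction and overwrite it.
        self.shadow[addr] = self.mem[addr]
        self.mem[addr] = BREAK

    def remove_breakpoint(self, addr):
        # Restore the original instruction.
        self.mem[addr] = self.shadow.pop(addr)

    def single_step(self):
        insn = self.mem[self.pc]
        if insn == BREAK:
            raise RuntimeError("trapped at 0x%x" % self.pc)
        self.log.append(insn)
        self.pc += 4  # straight-line code only, for simplicity


def step_over_breakpoint(target, addr):
    target.remove_breakpoint(addr)   # 1. remove the breakpoint
    target.single_step()             # 2. execute the original instruction
    target.insert_breakpoint(addr)   # 3. reinsert the breakpoint
    # 4. the caller resumes the thread normally from here


target = ToyTarget({0x1000: "insn_a", 0x1004: "insn_b"})
target.insert_breakpoint(0x1000)
step_over_breakpoint(target, 0x1000)
```

After the call, the original instruction has run once, the pc has moved past
it, and the breakpoint is back in place for the next hit.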

> infrun: no stepping, continue
> infrun: resume (step=1, signal=0), trap_expected=1

Single-step.  But arm-linux doesn't actually support single-stepping.
So instead, GDB computes the possible destinations the instruction could take,
places magic breakpoints there, and lets the thread run until one of the
possible destinations is hit.
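
To give an idea of what "computing the possible destinations" means, here is
a heavily simplified sketch.  It is not GDB's implementation: the function
name possible_next_pcs is made up, it only understands the ARM-mode B/BL
encoding, and it assumes every other instruction simply falls through to
pc + 4.  A real implementation has to cover many more cases (anything that
can write the pc, Thumb mode, and so on).

```python
def possible_next_pcs(pc, insn):
    """Return the addresses a 32-bit ARM-mode instruction may go to next.

    Simplifying assumption: only B/BL are treated as branches; every
    other instruction is assumed to fall through to pc + 4.
    """
    cond = (insn >> 28) & 0xF
    # B/BL: bits 27..25 == 0b101.  cond == 0xF is a different encoding
    # space, which this sketch does not decode.
    is_branch = ((insn >> 25) & 0x7) == 0b101 and cond != 0xF
    if not is_branch:
        return [pc + 4]

    # Sign-extend the 24-bit immediate.
    imm24 = insn & 0xFFFFFF
    if imm24 & 0x800000:
        imm24 -= 0x1000000

    # In ARM mode the pc reads as the instruction address + 8.
    target = (pc + 8 + (imm24 << 2)) & 0xFFFFFFFF

    if cond == 0xE:                    # "always": only the target
        return [target]
    return sorted({pc + 4, target})    # conditional: taken or not taken


# One temporary breakpoint would be placed at each returned address.
branch_to_self = possible_next_pcs(0x1000, 0xEAFFFFFE)   # b .
conditional = possible_next_pcs(0x1000, 0x0A000000)      # beq pc+8
fall_through = possible_next_pcs(0x1000, 0xE1A00000)     # mov r0, r0
```

An unconditional branch yields a single destination, a conditional one
yields two, and the thread is then resumed with PTRACE_CONT until it traps
at one of them.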

> LLR: Preparing to resume process 827, 0, inferior_ptid process 827
> LLR: PTRACE_CONT process 827, 0 (resume event thread)

Here's the resume.

> infrun: prepare_to_wait
> linux_nat_wait: [process -1]
> LLW: waitpid 827 received Trace/breakpoint trap (stopped)
> LLW: Candidate event Trace/breakpoint trap (stopped) in process 827.
> LLW: trap ptid is process 827.
> infrun: target_wait (-1, status) =
> infrun:   827 [process 827],
> infrun:   status->kind = stopped, signal = SIGTRAP
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x4004e11c
> infrun: software single step trap for process 827

Alright, "software single-step" complete.  The program moved like:

  0x40059d64 -> 0x4004e11c

> infrun: no stepping, continue
> infrun: resume (step=0, signal=0), trap_expected=0

And here we let the program run free again, since we've successfully
moved past the breakpoint we wanted to step over.

> LLR: Preparing to resume process 827, 0, inferior_ptid process 827
> RC: Not resuming sibling process 827 (not stopped)
> LLR: PTRACE_CONT process 827, 0 (resume event thread)
> infrun: prepare_to_wait
> linux_nat_wait: [process -1]
> LLW: waitpid 827 received Illegal instruction (stopped)
> LLW: Candidate event Illegal instruction (stopped) in process 827.
> infrun: target_wait (-1, status) =
> infrun:   827 [process 827],
> infrun:   status->kind = stopped, signal = SIGILL
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x4004e11c
> infrun: random signal 4

Yet the instruction at 0x4004e11c doesn't execute: the kernel claims it's
an illegal instruction.

I suggest disassembling 0x40059d64 and making sure it looks like the next
instruction to execute would really be the one at 0x4004e11c.  Then
disassemble 0x4004e11c and try to work out whether it really should be a
valid instruction, or whether we're pointing somewhere in the middle of
garbage.
