This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: ttrace: Protocal error


> Note, I know nothing about ttrace and HP-UX.

That makes us equal.

> On Friday 08 August 2008 19:33:06, John David Anglin wrote:
> > While we're on the subject of threads, it seems we are still not in
> > a position to debug the vla6.f90 failure:
> 
> What's this test doing different?

It's not entirely clear.  However, it is using emulated TLS support
and multiple lwp threads.  This support may be initialized by a constructor
run directly by the dynamic loader.  There's a timing or some other
random effect associated with the failure (could be some variable is
being randomly initialized).

> > #4  0x000a3390 in target_resume (ptid=
> >     {pid = 1953788513, lwp = 1667563520, tid = 774778670}, step=0,
> >     signal=TARGET_SIGNAL_0) at ../../src/gdb/target.c:1789
> 
>              ^^^^^^^^^^        ^^^^^^^^^^        ^^^^^^^^^
> 
> I assume this ptid is GDB getting bogus info, right?

That's pretty common for optimized code. 

> This should be setting the dying flag on the thread, but
> it is still listed in gdb's thread table.

Yes.

>    case TTEVT_LWP_EXIT:
>       if (print_thread_events)
> 	printf_unfiltered (_("[%s exited]\n"), target_pid_to_str (ptid));
>       ti = find_thread_pid (ptid);
>       gdb_assert (ti != NULL);
>       ((struct inf_ttrace_private_thread_info *)ti->private)->dying = 1;
>       inf_ttrace_num_lwps--;
>       ttrace (TT_LWP_CONTINUE, ptid_get_pid (ptid),
>               ptid_get_lwp (ptid), TT_NOPC, 0, 0);
>       /* If we don't return -1 here, core GDB will re-add the thread.  */
>       ptid = minus_one_ptid;
>       break;

The dying flag is set when the resume is attempted.

> inf_ttrace_resume:
> 
>   if (ptid_equal (ptid, minus_one_ptid))
>     {
>       /* Let all the other threads run too.  */
>       iterate_over_threads (inf_ttrace_resume_callback, NULL);
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
>     }
> 
> Is this the first resume after that "exit" notification?
> Any chance we're trying to resume a dead thread here then?

Yes.  That's what I think is happening.

> What happens when you delete the dying threads before resuming?
> 
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
>       iterate_over_threads (inf_ttrace_resume_callback, NULL);
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
> 
> Hmmm, I assume not, if my sources match yours, your program is stopped
> at a syscall event:
> 
>       /* Be careful not to try to gather much state about a thread
>          that's in a syscall.  It's frequently a losing proposition.  */
>     case TARGET_WAITKIND_SYSCALL_ENTRY:
>       if (debug_infrun)
>         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_ENTRY\n");
>       resume (0, TARGET_SIGNAL_0);
>       prepare_to_wait (ecs);
>       return;
> 
> So, there should have already been a resume in between.
> 
> Could you check which thread got the syscall event?  Is it the same
> thread we fail to resume?  Is it possible to disable syscall events,
> just for checking if it is related?

I don't know how to disable syscall events.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

