This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce


http://sourceware.org/bugzilla/show_bug.cgi?id=12642

           Summary: utrace: taskfinder misses events when main thread does
                    not go through at least one quiesce
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
        AssignedTo: systemtap@sourceware.org
        ReportedBy: raysonlogin@gmail.com


If the main thread of a multi-threaded program does not go through at least one
quiesce after systemtap/uprobes tracing starts, SystemTap is not able to detect
the events.

Example:

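#include <pthread.h>
#include <stdio.h>
#include <unistd.h>   /* sleep(), getpid() */

void spool_write_script(int jobid) __attribute__ ((noinline));

void spool_write_script(int jobid)
{
 int secs = 2 + jobid % 2;

 printf("sleeping... %d sec\n", secs);
 sleep(secs);
}

/* Thread entry point; the signature must match what pthread_create() expects. */
void *mythread(void *arg)
{
 int i;

 (void)arg;
 for (i=0;i<30;i++)
  spool_write_script(i);
 return NULL;
}

int main()
{
 pthread_t tid;

 printf("pid = %d\n", (int)getpid());

#ifdef MAINWAITS
 getchar();   /* wait for a key press so that stap can be started first */
#endif

 pthread_create(&tid, NULL, mythread, NULL);

 /* The main thread blocks here until the worker thread finishes. */
 pthread_join(tid, NULL);

 return 0;
}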


Reproduce with:
stap appears to hang (no events are reported) when the running process is
traced with:
# stap -e 'probe process("./a.out").function("spool_write_script")
{printf("called")}' -x <PID>

However, when the testcase is compiled with -DMAINWAITS and stap is started
before a key is entered (while the main thread is still in getchar()), stap is
able to get the events.
(Starting the program from stap with -c also makes it work.)


Analysis:
In the systemtap runtime, __stp_utrace_task_finder_target_quiesce() checks if
it is encountering the main thread:

        /* Call the callbacks.  Assume that if the thread is a
         * thread group leader, it is a process. */
        __stp_call_callbacks(tgt, tsk, 1, (tsk->pid == tsk->tgid));

For slave threads, tsk->pid == tsk->tgid is false, and the callback function in
uprobes only handles the main thread, as the address space is shared by all the
threads. In stap_uprobe_process_found():

  if (! process_p) return 0; /* ignore threads */

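Thus stap_uprobe_change_plus() is not called for the threads. Since the main
thread never quiesces, it is not called for the process either, so the probes
never fire.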

(As a hack) Simply passing 1 instead of (tsk->pid == tsk->tgid) would make it
work for the testcase above.

Severity:
Normal - The testcase is from a real application (daemon) that creates slave
threads, while the main thread waits until clean-up time (i.e. daemon restart,
exit, etc.).

A workaround is to force the main thread to quiesce after stap runs, e.g. by
attaching gdb.


