This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: stpd issues
- From: Martin Hunt <hunt at redhat dot com>
- To: Hien Nguyen <hien at us dot ibm dot com>
- Cc: Thomas Zanussi <trz at us dot ibm dot com>, SystemTAP <systemtap at sources dot redhat dot com>
- Date: Sun, 21 Aug 2005 14:36:28 -0700
- Subject: Re: stpd issues
- Organization: Red Hat Inc.
- References: <43065321.1060902@us.ibm.com>
On Fri, 2005-08-19 at 14:46 -0700, Hien Nguyen wrote:
> Hi Martin, Tom,
>
> I observed some issues with the new transport with the attached module
> (insert jprobes and kretprobes on all system calls). Without #define
> STP_RELAYFS
> No problem loading the module, but when I pressed CTRL-C, stpd appears
> to hang.
>
> With #define STP_RELAYFS.
> No problem loading the module, but there is no stdout until terminating
> stpd with Ctrl-C. It looks like data is buffered but not flushed out to
> stdout.
Thank you for trying out the latest code so quickly.
The problem here is that your code probes sys_read() which is now used
by stpd to read the output from the probes. Each time sys_read() is
called, jprobes prints some data and flushes it, which causes stpd to
call sys_read(), which causes more data to be printed, etc.
The same problem would be hit with relayfs if you put a probe on some
relayfs functions. (The same goes for netlink.) With relayfs it doesn't
happen for sys_read because relayfs collects data into large per-cpu
buffers, so each sys_read gets the results from many stp_prints.
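The difference between the two transports can be illustrated with a toy
user-space model. This is only a sketch under simplifying assumptions,
not runtime code: simulate_daemon_reads() and its parameters are
hypothetical names, the daemon is assumed to read only when at least
`batch` records are waiting, and every daemon read is assumed to add
exactly one new record because sys_read itself is probed.

```c
#include <assert.h>

/*
 * Toy model of the feedback loop (hypothetical, not stpd code).
 *
 * pending   - records waiting to be read by the daemon
 * batch     - records delivered per daemon read (1 = flush per print)
 * max_reads - safety cap so the simulation always terminates
 *
 * Returns the number of reads the daemon performed.
 */
static int simulate_daemon_reads(int pending, int batch, int max_reads)
{
    int reads = 0;

    assert(batch >= 1);
    while (pending >= batch && reads < max_reads) {
        pending -= batch;  /* the read drains one batch of records */
        pending += 1;      /* the probed sys_read prints one more record */
        reads++;
    }
    return reads;
}
```

With batch set to 1, every read replaces the record it consumed, so the
loop only stops at the safety cap. With batch set to 64 and 1000 records
pending, each read removes 63 records net and the loop stops after 15
reads.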
I don't have a simple solution. In its main loop, stpd only uses read()
and write(). Maybe the best thing to do would be to encourage the use of
a simple check on sys_read() and sys_write() to filter out all probes
caused by stpd. The pid for stpd is always _stp_pid.
For example,
ssize_t inst_sys_read(unsigned int fd, char __user *buf, size_t count)
{
	if (current->pid != _stp_pid) {
		_stp_printf("sys_read :executable : %s pid=%d, cpu=%d\n",
			    current->comm, current->pid, smp_processor_id());
		_stp_printf("Args sys_read : \n");
		_stp_counter_add (read, 1);
		_stp_print_flush();
	}
	jprobe_return();
	return 0;
}
An alternative would be to have _stp_print_flush() do the check on the
pid and refuse to print for pid == _stp_pid.
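A minimal user-space sketch of that alternative follows. Everything in
it is a stand-in chosen for illustration: stpd_pid plays the role of
_stp_pid, current_pid mocks current->pid, and sketch_print() and
sketch_print_flush() are hypothetical names, not the runtime's
functions. The `sends` counter stands in for handing data to the
transport.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_BUF_SIZE 256

static int stpd_pid = 100;      /* stand-in for _stp_pid */
static int current_pid = 0;     /* mock of current->pid */
static char print_buf[SKETCH_BUF_SIZE];
static size_t print_len = 0;
static int sends = 0;           /* times data was handed to the transport */

/* Append a string to the print buffer, truncating if it would overflow. */
static void sketch_print(const char *s)
{
    size_t n = strlen(s);

    assert(print_len <= SKETCH_BUF_SIZE);
    if (n > SKETCH_BUF_SIZE - print_len)
        n = SKETCH_BUF_SIZE - print_len;
    memcpy(print_buf + print_len, s, n);
    print_len += n;
}

/* Flush the buffer, but silently drop output produced in stpd's context. */
static void sketch_print_flush(void)
{
    if (current_pid == stpd_pid) {
        print_len = 0;          /* discard: never feed stpd its own reads */
        return;
    }
    if (print_len > 0) {
        sends++;                /* a real transport would send print_buf here */
        print_len = 0;
    }
}
```

The check lives in one place, so individual probe handlers would not
need their own pid test, at the cost of still formatting output that is
then thrown away.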
Or I could change the new transport to be like relayfs and aggregate IO
into large buffers, which would prevent every _stp_print_flush() from
resulting in a sys_read(), resulting in another _stp_print_flush(), etc.
However, we would still see a lot of data generated by stpd, which we
probably don't want, and we would lose the simplicity of having a simple
stream, since the new transport would then have to put sequence numbers
into per-cpu buffers, etc.
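To show what sequence numbers in per-cpu buffers would cost on the
reading side, here is a rough sketch of merging such buffers back into
one ordered stream. The struct layouts and merge_by_seq() are
assumptions made up for this example, not relayfs or stpd code, and
sequence-number wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* One printed record, tagged with a global sequence number (hypothetical). */
struct record {
    unsigned int seq;
    const char *text;
};

/* Records from one CPU, already in increasing seq order. */
struct cpu_buf {
    const struct record *recs;
    size_t count;
    size_t next;    /* index of the next unmerged record */
};

/*
 * Merge per-cpu buffers into `out` in global sequence order.
 * Returns the number of records written (at most out_cap).
 */
static size_t merge_by_seq(struct cpu_buf *bufs, size_t ncpus,
                           const struct record **out, size_t out_cap)
{
    size_t written = 0;

    assert(bufs != NULL && out != NULL);
    while (written < out_cap) {
        struct cpu_buf *best = NULL;

        /* Pick the buffer whose next record has the lowest seq. */
        for (size_t i = 0; i < ncpus; i++) {
            struct cpu_buf *b = &bufs[i];

            if (b->next == b->count)
                continue;       /* this CPU's buffer is drained */
            if (best == NULL ||
                b->recs[b->next].seq < best->recs[best->next].seq)
                best = b;
        }
        if (best == NULL)
            break;              /* all buffers drained */
        out[written++] = &best->recs[best->next++];
    }
    return written;
}
```

A single stream needs none of this, which is the simplicity that would
be lost.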
I'll think some more on this, but I'm leaning towards the first
solution. systemtap will need to be changed to handle this too.
What do you think?
Martin