This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Updating from 0303 to 0505 has problems


On Thu, May 10, 2007 at 01:28:44AM -0400, Martin Hunt wrote:
On Wed, 2007-05-09 at 22:22 -0500, Quentin Barnes wrote:


The problem is the "poll warning: Interrupted system call" diagnostic message from staprun that's throwing off expect. Below is a patch that fixes this problem for this message. Now whether this is a fix or a workaround I don't know. There are two possible fixes: 1) change stap_run.exp to account for the extra diagnostic, 2) change staprun's reader_thread() not to generate the diagnostic in the first place. I chose the former for this patch. Which approach would be the better fix?

I think I checked in a fix to this a few days ago. Certainly that error
message no longer is in the sources. Of course, there could be new
problems now... Could you give either the current CVS version a try or a
snapshot from May 8 or later?

I tried it with the 0512 snapshot and it still has problems.


Running it with add.stp:
====================
# staprun .systemtap/cache/51/stap_51971f530865fabd17b2f297433cc8f6_473.ko systemtap starting probe
[pressed ^C here]
#
====================


After I pressed ^C there should have been the additional messages:
====================
systemtap ending probe
systemtap test success
====================

But there was nothing.  It just exited.  This is now also causing
a lot of "eof" failures with the test suite since those messages
are lost.


Rerunning now with -vvv: ==================== # staprun -vvv .systemtap/cache/51/stap_51971f530865fabd17b2f297433cc8f6_473.ko main:141 modpath=".systemtap/cache/51/stap_51971f530865fabd17b2f297433cc8f6_473.ko", modname="stap_51971f530865fabd17b2f297433cc8f6_473" run_stp_check:191 executing /usr/local/libexec/systemtap/stp_check init_ctl_channel:59 Opening /sys/kernel/debug/systemtap/stap_51971f530865fabd17b2f297433cc8f6_473/cmd stp_main_loop:335 in main loop stp_main_loop:423 STP_SYMBOLS request received stp_main_loop:416 STP_MODULES request received init_relayfs:104 initializing relayfs init_relayfs:120 attempting to open /sys/kernel/debug/systemtap/stap_51971f530865fabd17b2f297433cc8f6_473/trace0 init_relayfs:120 attempting to open /sys/kernel/debug/systemtap/stap_51971f530865fabd17b2f297433cc8f6_473/trace1 init_relayfs:126 ncpus=1 init_relayfs:164 starting threads stp_main_loop:382 probe_start() returned 0 systemtap starting probe [pressed ^C here] sigproc:33 sigproc 2 (Interrupt) reader_thread:74 poll=-1 errno=4 # stp_main_loop:375 got STP_EXIT, closed=0 cleanup_and_exit:288 CLEANUP AND EXIT closed=0 close_relayfs:183 closing close_relayfs:190 sent SIGUSR2 close_relayfs:204 done cleanup_and_exit:301 closing control channel cleanup_and_exit:305 removing module sigproc:33 sigproc 17 (Child exited) ====================

I think what is going on is that the ^C forces the child in
reader_thread() to asynchronously send the STP_EXIT command making
for an early abort prior to it getting down to its read/write loop
to send the rest of the final messages first.

My guess at what you need to do is not have sigproc() actually make
the send_request(STP_EXIT, NULL, 0) call itself, but to just set
the "stop_threads" state to 1.  Only after letting reader_thread()
finish up its inner while and its outer do/while loops, have it only
then send the STP_EXIT directive.

I don't know the code well enough to know how complex this is to do
this tweak with its current multi-threaded design.

Martin

Quentin



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]