This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: User application hangs with systemtap 2.3


On 09/27/2013 06:42 AM, Martin Martin wrote:
> Thanks for your help.  I tried it on a VM where I had installed Ubuntu
> from scratch, and it works fine there.  The machine where it's failing
> is our continuous integration server, running on AWS.  It was created
> from scratch maybe 6 months ago, but originally had an earlier kernel
> and I upgraded the kernel in the hopes of getting systemtap to work.

How similar is your Ubuntu VM to that AWS machine?

And what virtualization software are you using locally?  I believe AWS
is running on Xen, which has had some unique issues in the past too...

> Thanks for the note about -g.  In our actual application we need it,
> but I just tried without it, and got the same problem.

Right, -g shouldn't hurt anything - I just try to discourage it for
scripts that don't need it.  (a sort of least-privilege policy)

> Here's the dmesg output on the problem machine:
> 
> stap_1f8d3a6e59c6857e0a55d3689cfb95_15281: systemtap: 2.3/0.152, base:
> ffffffffa0156000, memory: 54data/40text/0ctx/2058net/9alloc kb,
> probes: 1
> 
> and on the working machine:
> 
> stap_f91baab8de4703e9d5e2594fc982b45_3593: systemtap: 2.3/0.152, base:
> ffffffffa02d9000, memory: 54data/40text/1ctx/2058net/17alloc kb,
> probes: 1
> 
> Is the "1ctx" vs "0ctx" significant?  What about "9alloc kb" vs "17alloc kb"?

Ok, those lines are usual, but there was no other error message from the
kernel?

The ctx number is num_online_cpus()*sizeof(struct context)/1024.  That
struct size varies with each script, but it's minimally around 300 bytes
on 32-bit, 600 bytes on 64-bit.  It should be identical for the same
script and bitsize, so I guess your two machines just have different cpu
counts.

That probably explains the alloc difference too, as most things are
allocated per-cpu.

> Where do I give the -DSTP_ALIBI?  When compiling systemtap, when
> compiling my user code, or on the stap command line?  From poking
> around, it seems like on the stap command line.  Here's the result on
> the broken system:

Sorry, yes, -DSTP_ALIBI is for the stap command line.

> $ g++-4.7 -Wall -Wextra ./test.cpp -o test && sudo stap -DSTP_ALIBI -c
> './test' temp.stp
> About to hit probe.
> Back from probe.
> ^CWARNING: Child process exited with signal 2 (Interrupt)
> ----- probe hit report:
> WARNING: /usr/bin/staprun exited with status: 1
> Pass 5: run failed.  [man error::pass5]

OK, so it appears to fail in exactly the same way with the alibi.  That
seems to implicate the kernel itself, but systemtap isn't completely
exonerated yet.

> and on the working system:
> 
> $ g++-4.7 -Wall -Werror ./test.cpp -o test && sudo stap -DSTP_ALIBI=1
> -c ./test temp.stp
> About to hit probe.
> Back from probe.
> This output doesn't show up.
> ----- probe hit report:
> process("/home/likewise-open/SILVERLINING/martin/test").statement(0x40077f),
> (temp.stp:1:1), hits: 1, from: process("./test").statement(0x40077f)
> from: process("./test").provider("nfs").mark("writeBackendFailed")
> from: process("./test").provider("nfs").mark("writeBackendFailed"),
> index: 0
> 
> I just tried uninstalling & reinstalling systemtap on the broken
> system, no luck.  Any suggestions for what I should try next?

It would be good to try separating this issue completely from systemtap,
using perf or debugfs tracing.  You may have seen the other threads here
talking about new SDT support in perf, but you can do this manually too.

(Note, I'm using a pretty new perf on Fedora 19, so I'm not sure every
command below works the same with your older kernel...)

First get the sdt note from the file:
> $ readelf -n ./test
[...]
> Notes at offset 0x00001078 with length 0x00000044:
>   Owner                 Data size	Description
>   stapsdt              0x00000030	NT_STAPSDT (SystemTap probe descriptors)
>     Provider: nfs
>     Name: writeBackendFailed
>     Location: 0x00000000004007b3, Base: 0x0000000000400905, Semaphore: 0x0000000000000000
>     Arguments: 

With perf probe, you specify FUNC+OFF, so we have to find that address.
> $ nm --numeric-sort --defined-only ./test
> 0000000000400618 T _init
> 00000000004006b0 T _start
> 00000000004006e0 t deregister_tm_clones
> 0000000000400710 t register_tm_clones
> 0000000000400750 t __do_global_dtors_aux
> 0000000000400770 t frame_dummy
> 00000000004007a0 T main
> 00000000004007d9 t _Z41__static_initialization_and_destruction_0ii
[...]

So my SDT probe at 4007b3 is main+0x13.  Now add it:
> $ sudo perf probe -x ./test --add nfs_writeBackendFailed=main+0x13
> Added new event:
>   probe_test:nfs_writeBackendFailed (on 0x7b3)
> 
> You can now use it in all perf tools, such as:
> 
> 	perf record -e probe_test:nfs_writeBackendFailed -aR sleep 1

And use it:
> $ sudo perf record -e probe_test:nfs_writeBackendFailed ./test
> About to hit probe.
> Back from probe.
> This output doesn't show up.
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.012 MB perf.data (~511 samples) ]

If this fails too, then we've successfully eliminated stap as the
problem.  Which doesn't help you a whole lot, but at least you can start
looking at the kernel and/or virtualization system to blame.

If it worked, "sudo perf annotate main" should show the probe hit on the
SDT nop.  And we'll have to look back at what stap is doing wrong.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]