This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: User-space probes: Plan B+


systemtap-owner@sourceware.org wrote on 25/08/2006 21:14:44:

> On 25 Aug 2006 11:22:51 -0700, Jim Keniston <jkenisto@us.ibm.com> wrote:
> > On Fri, 2006-08-25 at 01:11, James Dickens wrote:
> > > On 24 Aug 2006 18:13:24 -0700, Jim Keniston <jkenisto@us.ibm.com> wrote:
> > ...
> > > >
> > > > I tried an approach based on ptrace, with no kernel enhancements, but
> > > > it lacked certain necessary features (e.g., #2-5 below), probe overhead
> > > > was 12-15x worse than Prasanna's approach, and I couldn't get it to
> > > > work when probing multiple processes.  (Frank Eigler independently
> > > > suggested this approach and termed it "Plan B from outer space.")
> > > >
> > >
> > > is 12-15x worse than the current solution used in strace?
> >
> > Slightly worse.  When just counting the occurrences of 1 system call, I
> > clocked strace at about 10 usec/hit.  See
> > http://sourceware.org/ml/systemtap/2006-q2/msg00572.html
> > And some folks reportedly consider strace too slow.
> >
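
For context on what is being timed here: an approach like this stops the
tracee and switches to the tracer on every event, so the per-hit cost is
dominated by those round trips. A minimal sketch of such a ptrace-based
counting loop, stopping at every system call, might look like the following
(illustrative only; this is not the harness behind the numbers quoted above):

/* Illustrative only: count syscall stops of a traced child via ptrace.
 * Build: gcc -O2 -o count count.c ; run: ./count /bin/ls
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    long stops = 0;
    pid_t pid;

    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 1;
    }
    pid = fork();
    if (pid == 0) {                 /* child: ask to be traced, then exec */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[1], &argv[1]);
        _exit(127);
    }
    waitpid(pid, NULL, 0);          /* initial stop after exec */
    for (;;) {
        int status;

        ptrace(PTRACE_SYSCALL, pid, NULL, NULL); /* run to next syscall entry/exit */
        if (waitpid(pid, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        stops++;                    /* each stop is one round trip to the tracer */
    }
    printf("syscall stops: %ld (about %ld calls)\n", stops, stops / 2);
    return 0;
}

Each hit costs a pair of context switches plus wait/resume round trips, which
is the kind of per-event overhead the figures above are measuring.
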
> > ...
> > > > 1. Instrumentation can be coded entirely as a user-space app...
> > >
> > > sounds like a nightmare waiting to happen; if I want to trace
> > > something from userland into the kernel and back, I start writing
> > > userland code, then into kernel code, and quite possibly having kernel
> > > code access variables and statistics stored in userland, meaning lots
> > > of checks that the user remembers to call the routines that safely
> > > move data back and forth between the two?
> >
> > Well, sure, users could get confused and do things wrong.  And your
> > scenario below where you migrate a piece of instrumentation from user
> > space to kernel space would have to be managed carefully, just like any
> > other design change.
> >
> if you do it entirely in the kernel, then you don't have to deal with
> design changes based on how busy the target system is, so we can use
> the same script the developer used to analyze during debugging even
> when it's in production with 1000 times the workload.
>
> Probing a function that is called often would be a major slowdown, as

Possibly, but not necessarily. It depends on the execution time of the
probe handler compared with the mean time for execution to return to the
probe. The fact that some cases might not suit this technique is no reason
to deny its use in other cases.
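
To put illustrative numbers on it (assumed figures, not measurements): a
handler costing 2 microseconds at a probe point that execution revisits every
2 milliseconds adds roughly 0.1% overhead, whereas the same handler at a
point revisited every 10 microseconds would add around 20%. Whether the
technique is suitable is a property of the workload, not of the mechanism
itself.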


> soon as you fire a probe the entire application stops; instrumenting
> something like malloc creates a huge slowdown as your process goes

Show us your measurements.


> to the kernel, then back to userland to run the script, and then back
> even if the probe wasn't even interested in the particular event.
>
> It gets worse with a multithreaded task, not only do you have the

Not necessarily. Originally locking was global and would serialize all
probes across all processors, and that of course would slow things down a bit
when a second probe fired before the current handler had ended. But the code
has been enhanced quite considerably to make the locking granular, and there
are further improvements that can be applied.

> probe firing more often, the application becomes serialized, so the whole
> process slows down tremendously, making it not usable in a production
> environment; it would also eliminate races. So users will either say
> that once they turned on the probes performance died, or that the problem
> disappears, the race is gone. The more scalable the application, the

Example?

> worse the slowdown.
>
>
>
> > But I think it's better to provide a feature for which a need has been
> > identified -- even if the feature requires careful use and a few minutes
> > to understand -- than to withhold the feature to protect people from
> > failing.  (I consider asm statements in gcc an extreme example of this
> > philosophy. :-))
> >
>
> it's better to design the system with safety and security in mind. This

Not necessarily, but as it happens, we have done.

> can and has been done. They ended up with a solution that works for
> the expert programmer and overworked system administrator, as well as
> the weekend home user just hoping to help out a project find a
> bottleneck.
>
> > >
> > > how is this better than just enhancing a debugger such as gdb?
> >
> > Among other things, gdb -batch is relatively slow (I measured 111 usec
> > per hit just to count breakpoint hits) and has no facility for
> > interacting with kernel-space instrumentation.
> >
> > > how are
> > > stacks dealt with, since you quite possibly have one process
> > > investigating another; if you don't get everything perfect, the program
> > > being watched can corrupt the data of the second?
> >
> > Well, somebody with root privileges could register a handler that
> > scribbles just about anywhere, as is the case currently with kprobes.
> > But there's no reason to expect that there's any danger of the
> > particular problems you mention.
> >
> > >
> > > >
> > > > 2. ... but in situations where performance is critical, uprobes can
> > > > run a named kernel handler without waking up the tracer process.
> > > >
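To make requirement 2 concrete: this is the model the existing kprobes
interface already uses for kernel-space probe points, where the handler runs
entirely in the kernel and counting a hit involves no switch to a user-space
tracer. A minimal counting module along those lines might look like the
sketch below (illustrative only; the probed symbol is an arbitrary example,
registration details vary by kernel version, and the proposed uprobes
registration for user-space addresses would be analogous but is not shown):

/* Sketch: count hits on a kernel function entirely in kernel space. */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kprobes.h>

static atomic_t hits = ATOMIC_INIT(0);

static struct kprobe kp = {
    .symbol_name = "do_fork",          /* assumed example symbol */
};

static int count_pre(struct kprobe *p, struct pt_regs *regs)
{
    atomic_inc(&hits);                 /* no switch to any user-space tracer */
    return 0;
}

static int __init counter_init(void)
{
    kp.pre_handler = count_pre;
    return register_kprobe(&kp);
}

static void __exit counter_exit(void)
{
    unregister_kprobe(&kp);
    printk(KERN_INFO "probe hits: %d\n", atomic_read(&hits));
}

module_init(counter_init);
module_exit(counter_exit);
MODULE_LICENSE("GPL");
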
>
> To avoid the aforementioned multithreaded problem,  we have to resort
> to counting probe fires without any intelligence about when we record

If you're happy with such a significant limitation then fine. But I'm not.
Furthermore I distinguish two major needs:

1) application debugging - which is of less personal interest to me but
nonetheless important;
2) system debugging - which may necessitate reference to user-space data or
indeed necessitate a probe to be triggered when code executes in
user-space. This is very much of interest to me and something that the
original design catered well for.

> the information and what information to store when we are called, it
> may be beneficial to do time-expensive things like a stack trace if
> we meet certain criteria, or to slow down one thread occasionally to
> look for races.
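
It is worth noting that an in-kernel handler need not be limited to blind
counting; it can apply its own criteria before doing anything expensive. A
sketch of a handler that takes a (kernel) stack trace only on every 1000th
hit, which would plug into a registration like the kprobes example above
(illustrative only):

#include <linux/kernel.h>
#include <linux/kprobes.h>

static atomic_t hits = ATOMIC_INIT(0);

/* Pre-handler: cheap count on every hit, expensive action only when a
 * condition is met (here, every 1000th hit). Illustrative sketch. */
static int sampled_pre(struct kprobe *p, struct pt_regs *regs)
{
    if (atomic_inc_return(&hits) % 1000 == 0)
        dump_stack();                  /* expensive action, done rarely */
    return 0;
}
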
>
> James Dickens
> uadmin.blogspot.com
>
>
> > > now if we start out coding our script to only work in userland, then
> > > all of a sudden we decide we need better performance, we have to go
> > > back and recode parts to work in kernel land and quite possibly break
> > > our algorithms that were talking to kernel land, or probes in the
> > > kernel that accessed userland data that just moved back into the
> > > kernel?
> >
> > See above.
> >
> > >
> > > > 3. A user-mode tracer can invoke a previously registered kernel-mode
> > > > handler, so we have simple and efficient communication between user-
> > > > and kernel-mode instrumentation.
> > >
> > > how do you keep a userland program from exploiting systemtap's
> > > architecture and executing kernel probes from other active systemtap
> > > scripts, isn't this a huge back door for rootkits especially once
> > > people start using systemtaps methods for monitoring systems
> > > continuously?

No more a back door than allowing any system tool to be used by the
non-privileged user.

> >
> > I've certainly thought about the potential for abuse via
> > uprobe_run_khandler().  If you had the connivance of somebody with root
> > privileges who installed a pernicious handler, you could do all sorts of
> > bad stuff (and make it relatively hard to track).  That's a big if,
> > though.  If a bad guy has root privileges, you're toast anyway.

You don't need SystemTap to do bad things if you have an untrustworthy
root user.


> >
> > And if you're worried about the handler reading/writing the wrong
> > process's address space, you can specify when you register the handler

Isn't this scenario fantasy-land?


> > that it can apply only to the process in the caller-provided uprobe
> > object -- and only when the caller has permission to trace that process.
> >
> > ...
> > > >
> > > > 8. Handlers run in process context -- the tracee's context (see
> > > > requirement 2) or the tracer's context while the tracee is stopped
> > > > (see requirement 3).
> > > >
> > >
> > > stack corruption or even slight stack placement differences would
> > > severely limit the usefulness of the solution,
> >
> > Well, yes, both we and the user will have to be careful.  That's the
> > nature of programming.
> >
> > > it will have the same
> > > effect as debugging an app in gdb, the app only breaks when the
> > > userland debugger is not running.
> >
> > That (minimizing probe overhead) is one of the points of being able to
> > avoid unnecessary context switches, by just running a handler in the
> > kernel.  (See requirement #2.)
> >
> > >
> > >
> > > James Dickens
> > > uadmin.blogspot.com
> >
> > Thanks.
> > Jim
> >
> >

--
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072

