This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: new static user probe types

From: Roland McGrath <roland at redhat dot com>
To: Mark Wielaard <mjw at redhat dot com>
Cc: Stan Cox <scox at redhat dot com>, systemtap at sourceware dot org
Date: Wed, 22 Jul 2009 20:06:43 -0700 (PDT)
Subject: Re: new static user probe types
References: <4A453D09.60600@redhat.com> <4A5E0195.5080803@redhat.com> <4A64B8AF.6030304@redhat.com> <1248259327.7890.29.camel@springer.wildebeest.org>

> So uprobes is the clear winner for almost zero-overhead when disabled.

Well, sure.  It's just a nop or three before it gets a breakpoint inserted
(aside from argument packing overhead).

> But it has the largest overhead when enabled. Clearly we want the uprobe
> mechanism when probes are disabled, but the utrace mechanism when probes
> are enabled!

:-)

> Would it be possible to change the uprobes mechanism for inserting trap
> instructions to insert any instructions (sequence)?

That's what jump-optimized probes et al are about, using instruction
analysis.  But here you don't have to do the general case, just a
precisely-chosen code patch of an exactly-known nop sequence the sdt.h
macros generate.

> Then we could have the best of both worlds, or could even decide at
> runtime what to insert to when the probe gets enabled.

Indeed.

> You would make sure that there are enough nops in the place of the probe
> point for the instruction sequence you want to replace and then the
> uprobes insert instruction mechanism would (after checking it had enough
> nop space) insert the instruction sequence (preferably the one used by
> the utrace mechanism).

It can be tailored more precisely than that; you don't need to think of it
as being a "uprobes method" at all.  It's very simple hard-wired code patching:
the macro produces one long nop and you patch that to a relative call.
You can make it a call to a stock function we provide in some .a you link
with, or to a stub generated directly in an alternate section by the macros.
(If you don't need different stubs, it could be in a linkonce section.)
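
As a rough sketch of that patch, assuming x86 and assuming the macro emits
the standard 5-byte nop (the real sdt.h sequence may differ), the rewrite
amounts to checking for the known nop bytes and storing "call rel32" over
them.  The names patch_nop_to_call and probe_nop5 are made up for
illustration:

```c
#include <stdint.h>
#include <string.h>

/* Assumed probe-site filler: the 5-byte x86 nop, nopl 0x0(%rax,%rax,1). */
static const uint8_t probe_nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Rewrite the nop held in `site` (which lives at address `site_addr` in the
   traced program) into "call rel32" targeting `stub_addr`.
   Returns 0 on success, -1 if the bytes are not the expected nop or the
   stub is out of rel32 range. */
static int patch_nop_to_call(uint8_t *site, uint64_t site_addr,
                             uint64_t stub_addr)
{
    if (memcmp(site, probe_nop5, sizeof probe_nop5) != 0)
        return -1;              /* not a site we know how to patch */

    /* The displacement is relative to the end of the 5-byte call. */
    int64_t disp = (int64_t)(stub_addr - (site_addr + 5));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    site[0] = 0xe8;             /* call rel32 opcode */
    site[1] = (uint8_t)(rel & 0xff);          /* little-endian rel32 */
    site[2] = (uint8_t)((rel >> 8) & 0xff);
    site[3] = (uint8_t)((rel >> 16) & 0xff);
    site[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}
```

This only computes the bytes in a buffer.  Actually storing them into a
running process safely (other threads may be executing the site) is a
separate problem that the sketch does not address.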

> It would also help with implementing the idea for the ENABLED mechanism

That's just another variant of code-patching for the same purpose.

> So, it might be a bit like what Srikar posted to utrace-devel: [...]

By which you just mean it's another kind of code-patching.

> Or how about this.  We could expand STAP_PROBE(...) to
> 
>    { extern char stap_probe_NNNN_enabled_p;
>      if (unlikely(stap_probe_NNNN_enabled_p)) {
>         /* current inline-asm stuff, but adding
>            &enabled_p to the descriptor struct. */
>      }
>    }

The point of this is to skip any argument-packing work generated by the
compiler, which would be inside the "if unlikely" block, right?
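
To make that concrete, here is a minimal sketch of the guarded form, with
an ordinary function call standing in for the inline-asm probe site.  All
the names (DEMO_PROBE, demo_probe_enabled_p, expensive_arg, and so on) are
hypothetical, and __builtin_expect assumes GCC or a compatible compiler:

```c
/* Hypothetical per-probe flag; a tracer would set it when enabling. */
static volatile char demo_probe_enabled_p;

static int pack_calls;      /* how many times argument setup ran */
static long last_arg;       /* last value delivered to the "probe" */

/* Stand-in for argument-packing work the compiler would generate. */
static long expensive_arg(long x)
{
    pack_calls++;
    return x * 2 + 1;
}

/* Stand-in for the inline-asm probe site. */
static void demo_probe_fire(long arg)
{
    last_arg = arg;
}

#define unlikely(x) __builtin_expect(!!(x), 0)

/* The argument expression is evaluated only inside the guarded block,
   so it costs nothing beyond the test and branch when disabled. */
#define DEMO_PROBE(argexpr)                     \
    do {                                        \
        if (unlikely(demo_probe_enabled_p))     \
            demo_probe_fire(argexpr);           \
    } while (0)

static void traced_work(long x)
{
    DEMO_PROBE(expensive_arg(x));
}
```

With the flag clear, traced_work() never calls expensive_arg(); once the
flag is set, each call evaluates the argument and fires the probe.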

> The inline-asm inside could be the fastest enabled variant, probably
> the kprobe-based one.  (This would make user-space sdt.h usable
> without utrace & uprobes.)

Presumably the really fastest would be an ill-used syscall that has a
tracepoint in it.  I have also been thinking about vDSO ways of doing that.

> O, I like it. It is probably not as zero-overhead as the "pure nop"

Less for some and more for others, I would guess.  That is, in the base
case the test and branch not taken would be slower than the nop.  But if
just a few more instructions are required to set up the tracepoint
arguments and those are skipped in the disabled case, perhaps that tips it.

> Certainly warrants a try and benchmark.

I think this is a lot like some things Mathieu already experimented with
and measured in the kernel context.  I think he pursued a code-patching
flavor that patched an immediate operand because that was measured as
faster than having the actual extra load of a simple enabled_p variable.

> BTW. For storing changeable variables the .probes section should become
> alloc, rw now always (it currently is only for relocatable objects). 

It doesn't make sense that it should differ in relocatable objects.
I don't understand that.

As to the general question: eh, maybe.  From inside the kernel it's no
problem to poke a r/o text word just like a user-mutable data page.  Either
way the file-mapped page from the executable/DSO gets COW'd so you can
change it.  If you throw it into general data, it's more likely you will
share a page with other random data and so won't actually add a page's COW
overhead.  But the flip side is that keeping it in read-only means that a
program going haywire can't (without mprotect calls) accidentally set/clear
the flag, which always winds up being lots of fun in the truly weirdest
debugging scenarios.
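
A user-space illustration of that last point, as a sketch only: the flag
sits on a read-only page, so a stray store faults, and changing it takes a
deliberate mprotect round trip.  This uses an anonymous mapping rather than
a file-mapped section of the executable, and the helper names are invented
for the example; the kernel-side poke described above needs none of this.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate one zero-filled, read-only page and use its first byte as the
   flag.  Returns NULL on failure. */
static volatile char *alloc_ro_flag(void)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, pagesz, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (volatile char *)p;
}

/* Deliberately change the flag: open the page for writing, store, and
   make it read-only again.  `flag` must come from alloc_ro_flag(), so it
   is page-aligned.  Returns 0 on success, -1 on failure. */
static int set_ro_flag(volatile char *flag, char value)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    void *page = (void *)flag;

    if (mprotect(page, pagesz, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *flag = value;
    return mprotect(page, pagesz, PROT_READ);
}
```

Reading *flag works at any time; writing it anywhere other than through
set_ro_flag() would fault instead of silently flipping the probe state.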


Thanks,
Roland
