This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [RFC] SystemTap future direction


Hi Mark,

Mark Wielaard wrote:
> Hi Masami,
> 
> On Wed, 2010-08-04 at 14:19 +0900, Masami Hiramatsu wrote:
>> As you may know (of course, I Cc'd the discussion on LKML), Ingo and
>> Christoph said that (at least) uprobes (but also kprobes) should
>> not support out-of-tree modules.
> 
> I thought there were already modules using kprobes. And I think module
> support for uprobes will be beneficial too.

Yeah, but they can be (and Christoph said, should be) replaced by
tracepoints.

>> This means that if we succeed in merging uprobes into the kernel,
>> SystemTap itself can't use uprobes.
> 
> :) So helping push things upstream means not using them yourself.
> If that happens we can always do what we do now of course, ship our own
> version. But it would be ideal if we could reuse the upstreamed code of
> course.

Hmm, that just makes things worse... Kernel developers might just think
of us as rogues :(.

>>  Even worse, if someone tries
>> to remove kprobes' module support, that could shake the foundation
>> of SystemTap.
> 
> kprobes are just one event source. An important one. But there are
> others and people do write scripts that never touch kprobes. They are
> very nice to have though. Especially if you want cross kernel/user space
> observability.

Yeah, but without dynamic tracing, SystemTap loses an advantage.

>> At least, to add support for kmodules to uprobes, I think we have two
>> options: one is pushing systemtap itself and useful scripts into the
>> kernel tree; the other is finding a very useful use case of *probes
>> that requires an out-of-tree module. (But the first one is hard because
>> Linus hates C++, and systemtap is too huge to push into the kernel.)
> 
> That would be nice. The C++ part is just the user space translator
> anyway. So that doesn't have to be pushed (and doesn't really make sense
> IMHO) in the kernel sources. But maybe it can sit next to the user space
> perf tools if that is a nicer repository to hack in.

Yeah, maybe under tools/systemtap/.

>> Anyway, I think it's time to discuss together how we can get over this
>> situation and what the future direction of SystemTap should be.
>> Since we already have many users, we are responsible for supporting them.
> 
> Yes. I was at GUADEC last week and was happily surprised to meet
> multiple Gnome hackers who were happy systemtap users. glib and gobject
> have their own static markers (dtrace compatible) and tapsets now.

That's good news. Is it possible for perf to support static markers too?

>> I'd like to suggest some directions here:
>>
>> - Merge the runtime and module-source generator into the Linux kernel.
>>  This will require rewriting the whole of the systemtap code from C++ to
>>  C or another lightweight language (Perl or Python).
> 
> If that requires rewriting the whole translator that seems very
> unattractive. The translator is just the script parser and translator,
> so I don't see why it matters what language it is written in.

Because that's the policy of the kernel majority. :P

> But
> merging some of the runtime, specifically the utrace/task-finder code so
> it can be reused by others to get better user space task/process
> observability seems like a nice thing to have.

Yes, that will be the next step for uprobes. Christoph already argued
that pid-only uprobes are hard to use widely.

>> - Port SystemTap onto perf/ftrace and extend perf/ftrace to support
>>  extended handlers which are provided by modules.
> 
> That would be nice. If we can attach systemtap probe handlers to
> perf/ftrace events in kernel then those would be really nice event
> sources.
> 
>> - Port SystemTap onto perf/ftrace but drop embedded-C support.
>>  This will enhance perf/ftrace to support sufficiently flexible data
>>  filters/modifiers (including a fault injection feature). In this case,
>>  SystemTap scripts will handle the data in user space (not on-line).
> 
> I think the "not on-line" part is a bit of a showstopper. Since that
> kills the main idea of having powerful scriptable observability. Simple
> filters are too restrictive IMHO. It might be enough for simple
> profiling, where you analyze the data off-line afterwards. But that
> isn't an option for everybody (you need to store/push the data
> somewhere), and not very efficient in some cases.

Efficiency is the key, and perf and systemtap aim at different
kinds of efficiency. SystemTap focuses on the efficiency of
transporting data, but perf focuses on the efficiency of
probing time. What they are trying to do is reduce the overhead
of recording data to buffers, because that disturbs the
performance of target processes less.

> But we could try translating to something not-C for the runtime. That is
> the approach that the fish project seems to be going with extended GDB
> agent expressions (see the archer and utrace mailing lists for the
> discussion).

Ah, that's a good idea. Linux already has a gdb command parser in kgdb.
So we can reuse it (or share a new one with kgdb).

>> - Or, just do nothing and wait for the kernel maintainers to choke
>>  our necks...
> 
> The kernel maintainers can make our lives easier by letting us upstream
> more stuff that we can then reuse. But if not, we can upstream and still
> carry our own copy if necessary. That is far from ideal, but if it is
> the only option, at least the user experience wouldn't be worse than
> what we have now. But I hope we can convince them otherwise of course.

Anyway, it is important that we show our effort to move things forward.

>> I don't think the last one is the best one.
>> What do you think about that?
> 
> Personally I would like to push for an in-kernel interpreter/jit that
> our translator can translate to. And make it powerful enough so that it
> cannot just be used for systemtap probe handlers, but also for
> perf/ftrace/gdb-agent-expressions. But that is a lot of work. It is the
> most flexible one though.

Agreed, that would greatly help us, and it is a good way to go.

> I do realize that the current SystemTap design comes from the fact that
> years ago the kernel maintainers rejected such an interpreter out of
> hand. But now that we have so many alternative observability techniques
> that can use kernel support I hope they will now be more accepting.
> 
>> BTW, is no one attending LinuxCon 2010 in Boston?
>> I'll be there next week...
> 
> Sorry, wrong continent for me. I currently live in Europe.

Thank you!

> 
> Cheers,
> 
> Mark
> 


-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

