This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Fw: [ltc-interlock] [RFC] draft NFS trace hooks (fwd)


Tony Reix wrote:
Hi Li,

On Monday 14 August 2006 at 11:11 +0800, Li Guanglei wrote:
Hi, Tony

...

  SystemTap provides an infrastructure for dynamic probing, which
means that no kernel patch, no recompile and no reboot are required.

I think such features will be very useful in the field, when customers encounter critical problems and stopping the machine or updating the kernel to a new version is not an option.

Yes. We are now building a tracing tool named LKET, based on SystemTap. One of its usage scenarios is exactly that: the customer runs the tracing tool and gives the analyst the raw trace data for post-processing to find the problem.




  Currently we don't have a separate design document but you could
refer to the following links to get an overall picture of what we are
doing:
 http://sourceware.org/systemtap/overview.html
 http://sourceware.org/systemtap/man5/lket.5.html
 http://sourceware.org/systemtap/man5/stapprobes.5.html

These links do not give me high-level information.
However, this one:
http://sourceware.org/systemtap/documentation.html
contains very useful documents to refresh and extend my knowledge of
SystemTap (I studied the features of DTrace some years ago ...).



 And I attached a list of what we want to trace for NFS. But we are
not working on NFS itself and don't have much knowledge about it.

We have been working on NFSv4 for some years, mainly testing it and adding IPv6 support. So we have some understanding of the protocol, but we certainly have not mastered either the protocol or the code.


This is why we are sending out the NFS trace hooks for review, to make sure
that we didn't miss any important trace points.

Up to now, I've seen only 2 answers on the nfs mailing list, both from Chuck Lever. But I do not see any answer to Chuck's 2nd email. Maybe I've missed it.

Is it the reply that Jose Santos sent on 07/28 with the subject "Re: [NFS] [ltc-perf] draft of nfs event hook"? I am not subscribed to the nfs mailing list, so I may have missed that mail if I was not on the To/Cc list.




I put some answers to your questions below.

Please tell me if we missed any important functions that should be
probed for NFSv4. Note that what Xuepeng sent out is only the first step of
our NFS trace hooks. The full list of what we plan is in the attachment.

As I said before, we have not mastered the NFSv4 internals, which are VERY complicated. I think only Trond Myklebust, Bruce Fields and Chuck Lever could really help you check that these hooks are appropriate. But I know that Trond, Bruce and Chuck are very busy ... So they probably have no time to spend on understanding SystemTap's goals and basics and on checking that the hooks you designed are appropriate.

Chuck asked for a design document. It seems you do not have one. And
building such a document would require a deep knowledge of NFS/NFSv4
internals. So this design document should be written either 1) through
collaboration between SystemTap and NFSv4 experts, or 2) by
someone who could spend time building skills in both technologies, and
then discuss it with the SystemTap and NFSv4 experts.

Yes. I don't have a dedicated design document for NFS trace hooks.
LKET is mainly meant for system-level tracing, so I chose only some NFS functions to instrument. It could also be used for detailed diagnosis, but that is not its focus.




 > I know very little about SystemTap. However, I have questions:
 > - Do you have a design document describing what must be traced in NFS ?
See the attachment.

Hmm. This looks more like a conclusion than a design explaining the rationale behind the choices.

Yes. I only listed the functions that I think are important, especially for performance analysis. One consideration in choosing these functions is that they can be correlated with the other trace hooks available in LKET, such as the VFS and I/O syscall hooks.




> - How do you plan to submit your code to NFS or Kernel maintainers ?
We don't plan to. The trace hooks written in tapsets are for dynamic tracing, which
means that no kernel patches are needed.
> - How much of your code must be included in NFS code, as patches ?
None. This is the powerful aspect of SystemTap: you can probe the
kernel without touching the kernel source code.


Ohh. I did not know. Probably this is explained in the OLS'06 paper. Do you have a link to a technical explanation of how that works ?

Kprobes is its foundation. You can refer to Documentation/kprobes.txt in the kernel source tree.




> - Have you worked with them (I saw nothing on mailing list) ?
No. We currently work with the SystemTap community. NFS is only a part of what
we are doing.

I do not remember anyone talking about SystemTap on the nfs and nfsv4 mailing lists. A quick search through the 3200 messages I've kept in my folders seems to show that no one talked about SystemTap before you did.

Do you know what OSDL's position on SystemTap is?

Oh, I don't know. Vara, do you know?




> What are your plans about tapsets for NFSv4 ?
Only the NFSv4 procedure stub functions on the client and server sides.

> Do you have resources to do that in 2006 or in 2007 ?
Oh, I am not sure. We are assigned to work on SystemTap and LKET :-)

I usually have students helping on some projects (this year: NPTL Trace Tool, and NFSv4 Administration/Security). Do you think a student (from the best French computer science university, 5-6 months of work) could help?

Of course. Although we have done some work on the NFS trace hooks, there are still a lot of places in NFS that need to be instrumented.




 > A first step could be to check whether the NFSv4 developers are already
 > enthusiastic about SystemTap. If not yet, one should discuss it with them. It seems
 > very important to get their comments and approval on which features to
 > trace. The problem is that the NFSv4 server developer is overloaded ... so
 > it may take some time.
 > Also, there is already some trace code in NFSv4.
It seems to me that a lot of developers still have not realized that
SystemTap exists, or how it could facilitate their
kernel development and debugging. We need more advertising for
SystemTap :-)

Yes. I've talked with 2 guys here working on ext3 and they have no clear idea of how SystemTap could help them ... One of them (an ext3 and Xen expert) did not even know about SystemTap.

I know that proposing debugging/tracing tools to expert/guru developers
is difficult: they are often more efficient without such tools, but
they often forget that such tools will greatly help with future
maintenance and with understanding problems in the field.

What do Linux gurus (Linus, ...) think about SystemTap?
For now, it seems that SystemTap is only an initiative by Red Hat, IBM, Intel
and Hitachi: people close to the customers.

Actually, I don't know. Vara, could you give us some news about this?




The SystemTap wiki is a good place to share experience:

http://sources.redhat.com/systemtap/wiki/HomePage

Yes. I've found useful documents. Now, I need to find time to read them ...


Feel free to let me know your suggestions and questions. Thanks.

What about a student working on tapsets for NFSv4 in 2007? That would of course come after we have the NFSv4 developers' position on SystemTap. I'll ask them about it.

Of course, that would be great. What we have done is only a start, and a lot of additional work is needed on the NFS tapsets. We are not dedicated to working on the NFS tapsets, so it would help if we could find someone to continue the work.




Regards,


Tony


- Guanglei


