This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Fw: [ltc-interlock] [RFC] draft NFS trace hooks (fwd)


Hi, Tony

  The bigger picture is that our team is working on the Linux Kernel
Event Trace (LKET) tool, which is an extension to SystemTap's tapset
library. LKET has already been incorporated into SystemTap and will be
used to trace various events inside the Kernel, such as SCSI, the I/O
scheduler and page faults. The NFS trace hooks that we are working on
are only one part of it. LKET is meant for system-wide tracing, so it
won't trace the fine details of each kind of operation.

  SystemTap provides an infrastructure for dynamic probing, which
means that no Kernel patches, no recompile and no reboot are required.
Each probe definition specifies where inside the Kernel to probe and
what action to execute when that probe point is hit. For example:

probe kernel.function("sys_read")
{
	printf("fd: %d, count: %d, buf: %p\n", $fd, $count, $buf)
}

  Currently we don't have a separate design document but you could
refer to the following links to get an overall picture of what we are
doing:
 http://sourceware.org/systemtap/overview.html
 http://sourceware.org/systemtap/man5/lket.5.html
 http://sourceware.org/systemtap/man5/stapprobes.5.html

 I have also attached a list of what we want to trace for NFS. We are
not working on NFS itself and don't know much about it, which is why
we are sending out the NFS trace hooks for review: we want to make
sure that we didn't miss any important trace points.

 I put some answers to your questions below.


Xue Peng Li wrote:
> Thanks.
> 
> ----- Forwarded by Xue Peng Li/China/Contr/IBM on 2006-08-14 09:47 -----
> 
> Tony Reix <tony.reix@bull.net> wrote on 2006-08-11 18:23:51:
> 
>  > Hi Xue Peng Li,
>  >
>  > Gerrit has warned me that you are working on Systemtap tapsets for NFS.
>  > It seems a good idea to enable the tracing and probing of NFS !
>  >
>  > My team is contributing to testing and stabilizing NFSv4 Server/Client.
>  > We are working with CITI and OSDL.
>  > See:   http://nfsv4.bullopensource.org/
>  >
>  > Based on the email Gerrit attached, it seems that these tapsets are
>  > ready for all NFS versions but more focused on NFSv3.
>  >
>  > Since NFSv4 shares a lot of code with NFSv3, part of your work will be
>  > useful for NFSv4. However NFSv4 protocol is very different from NFSv3
>  > protocol. So probably specific tapsets code are required for NFSv4.
>  > Also, main NFS activity now is around NFSv4, which is starting to appear
>  > in distribs (RHEL5, SLES10), and which has reached a good level of
>  > reliability and performance.

Please tell me if we missed any important functions that should be
probed for NFSv4. What Xuepeng sent out is only our first step for the
NFS trace hooks; the full list of what we plan is in the attachment.

>  >
>  > I know very little about SystemTap. However, I have questions:
>  > - Do you have a design document describing what must be traced in NFS ?

See the attachment.

>  > - How do you plan to submit your code to NFS or Kernel maintainers ?

We don't need to. The trace hooks written in tapsets are for dynamic
tracing, which means that no patches to the Kernel are needed.

>  > - Have you worked with them (I saw nothing on mailing list) ?

No. We currently work with the SystemTap community. NFS is only a part
of what we are doing.

>  > - How much of your code must be included in NFS code, as patches ?

None. This is the powerful aspect of SystemTap: you can probe the
Kernel without touching the Kernel source code.

>  >
>  > From my opinion, the main problem is to get your code accepted by NFS
>  > and Kernel maintainers. Talking with CITI guys could be easier than
>  > talking with Kernel guys. We already are in contact with CITI guys. (As
>  > an example, it took 13 months before our first IPv6 code for NFSv4
>  > client gets into the process to be studied and (hopefully) accepted in
>  > Kernel code.)
>  >
>  > Providing a design document would help my team to understand which parts
>  > of NFS are traced. (However, Linux people seem not to like reading such
>  > papers ...).
>  >
>  > What are your plans about tapsets for NFSv4 ?

Only the NFSv4 procedure stub functions for the client and server side.

>  > Do you have resources to do that in 2006 or in 2007 ?

Oh, I am not sure. We are assigned to work on SystemTap and LKET :-)

>  > First step could be to check that NFSv4 developers are already SystemTap
>  > enthusiastic. If not yet, one should discuss of that with them. It seems
>  > very important to get their comments and approval for which features to
>  > trace. The problem is that NFSv4 Server developer is overloaded ... so
>  > it may take some time.
>  > Also, there already are some trace code in NFSv4.

It seems to me that a lot of developers still haven't realized that
SystemTap exists, or how it could facilitate their development and
debugging of the Kernel. We need more advertisement for
SystemTap :-)

The SystemTap wiki is a good place to share experience:

http://sources.redhat.com/systemtap/wiki/HomePage

Feel free to let me know your suggestions and questions. Thanks.

- Guanglei

>  >
>  > Regards,
>  >
>  > Tony
>  >
>  >
>  >
>  > Aurélien, Aimé,
>  >
>  > You can find info about LKET at:
>  >    http://sourceware.org/systemtap/man5/lket.5.html
>  >
>  >
>  >
>  > --
>  > Cordialement/Regards,
>  >  
>  > Tony Reix
>  >                             Carpe Diem
>  >  
>  > ("Carpe diem quam minimum credula postero" - Horace
>  >   Mets à profit le jour présent sans croire au lendemain )
>  >  
>  > 
> **********************************************************************************
>  > Name/Company: Tony Reix                        Bull SAS - AIX/Linux R&D
>  > EMail:        Tony.Reix@bull.net    (From IBM: Tony.Reix@frec.bull.fr)
>  > Position:     Linux Projects Manager - NPTL - NFSv4
>  > Web-Sites:    http://www.bull.com      http://nfsv4.bullopensource.org/
>  > Address:      BULL, 1 rue de Provence, BP 208, 38432 Echirolles - France
>  > Phone         France: 04 76 29 72 67     International: 33 4 76 29 72 67
>  > Fax:          France: 04 76 29 76 00     International: 33 4 76 29 76 00
>  > Bull:         Phone: 229-7267  MailAddress: FREC B1-188   Office: B1-225
>  > 
> **********************************************************************************
>  >
>  > Bull, Architect of an Open World
>  >

   We chose some NFS-related functions to be instrumented. We will
trace the entry of these functions and, if necessary, their return as
well. The following is the list of these functions; please review it:

==================== Client Side ==========================

<1> nfs directory operations

      All functions from nfs_dir_operations:

       const struct file_operations nfs_dir_operations = {
         .llseek         = nfs_llseek_dir,
         .read           = generic_read_dir,
         .readdir        = nfs_readdir,
         .open           = nfs_opendir,
         .release        = nfs_release,
         .fsync          = nfs_fsync_dir,
       };

<2> nfs file operations

     All functions from nfs_file_operations:

       const struct file_operations nfs_file_operations = {
         .llseek         = nfs_file_llseek,
         .read           = do_sync_read,
         .write          = do_sync_write,
         .aio_read       = nfs_file_read,
         .aio_write      = nfs_file_write,
         .mmap           = nfs_file_mmap,
         .open           = nfs_file_open,
         .flush          = nfs_file_flush,
         .release        = nfs_file_release,
         .fsync          = nfs_fsync,
         .lock           = nfs_lock,
         .flock          = nfs_flock,
         .sendfile       = nfs_file_sendfile,
         .check_flags    = nfs_check_flags,
       };

<3> nfs address space operations:
     All functions from nfs_file_aops:

       struct address_space_operations nfs_file_aops = {
         .readpage = nfs_readpage,
         .readpages = nfs_readpages,
         .set_page_dirty = __set_page_dirty_nobuffers,
         .writepage = nfs_writepage,
         .writepages = nfs_writepages,
         .prepare_write = nfs_prepare_write,
         .commit_write = nfs_commit_write,
         .invalidatepage = nfs_invalidate_page,
         .releasepage = nfs_release_page,
#ifdef CONFIG_NFS_DIRECTIO
         .direct_IO = nfs_direct_IO,
#endif
      };

<4> NFS RPC procedures:

    All functions from nfs_v[2,3,4]_clientops.
     Only the nfs_v3 RPC procedures are listed here:
      struct nfs_rpc_ops      nfs_v3_clientops = {
         .version        = 3,                    /* protocol version */
         .dentry_ops     = &nfs_dentry_operations,
         .dir_inode_ops  = &nfs3_dir_inode_operations,
         .file_inode_ops = &nfs3_file_inode_operations,
         .getroot        = nfs3_proc_get_root,
         .getattr        = nfs3_proc_getattr,
         .setattr        = nfs3_proc_setattr,
         .lookup         = nfs3_proc_lookup,
         .access         = nfs3_proc_access,
         .readlink       = nfs3_proc_readlink,
         .read           = nfs3_proc_read,
         .write          = nfs3_proc_write,
         .commit         = nfs3_proc_commit,
         .create         = nfs3_proc_create,
         .remove         = nfs3_proc_remove,
         .unlink_setup   = nfs3_proc_unlink_setup,
         .unlink_done    = nfs3_proc_unlink_done,
         .rename         = nfs3_proc_rename,
         .link           = nfs3_proc_link,
         .symlink        = nfs3_proc_symlink,
         .mkdir          = nfs3_proc_mkdir,
         .rmdir          = nfs3_proc_rmdir,
         .readdir        = nfs3_proc_readdir,
         .mknod          = nfs3_proc_mknod,
         .statfs         = nfs3_proc_statfs,
         .fsinfo         = nfs3_proc_fsinfo,
         .pathconf       = nfs3_proc_pathconf,
         .decode_dirent  = nfs3_decode_dirent,
         .read_setup     = nfs3_proc_read_setup,
         .read_done      = nfs3_read_done,
         .write_setup    = nfs3_proc_write_setup,
         .write_done     = nfs3_write_done,
         .commit_setup   = nfs3_proc_commit_setup,
         .commit_done    = nfs3_commit_done,
         .file_open      = nfs_open,
         .file_release   = nfs_release,
         .lock           = nfs3_proc_lock,
         .clear_acl_cache = nfs3_forget_cached_acls,
     };

   LKET already has syscall and iosyscall trace hooks, so with the
above trace hooks LKET could trace the different layers of NFS operations:
    --> Syscall
       --> struct file_operations
           --> struct address_space_operations
                --> struct nfs_rpc_ops
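
   As a minimal sketch of what such hooks might look like (this is not
the actual LKET tapset code), the script below probes the entry of one
function from each of the three NFS layers listed above, plus the
return of one of them. It assumes that the NFS client is built as the
"nfs" module and that Kernel debuginfo is installed; the choice of
functions and the output format are for illustration only.

```systemtap
# Entry probes: one function from each client-side layer (illustrative choice).
probe module("nfs").function("nfs_file_read"),      # struct file_operations
      module("nfs").function("nfs_readpage"),       # struct address_space_operations
      module("nfs").function("nfs3_proc_getattr")   # struct nfs_rpc_ops (NFSv3)
{
	# timestamp in microseconds, process name, pid, probed function name
	printf("%d %s(%d) -> %s\n", gettimeofday_us(), execname(), pid(), probefunc())
}

# Return probe: $return holds the value returned by nfs_readpage.
probe module("nfs").function("nfs_readpage").return
{
	printf("%d %s(%d) <- nfs_readpage ret=%d\n",
	       gettimeofday_us(), execname(), pid(), $return)
}
```

   If NFS is compiled into the Kernel rather than built as a module,
kernel.function("...") would take the place of
module("nfs").function("...").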

======================= Server Side =============================

<1> nfsd_dispatch
    This is the NFS dispatching function that sits on top of RPC.

<2> NFS RPC procedures:

     For NFSv4, it will be nfsd4_proc_compound

     For NFSv2, NFSv3, it will be the functions from nfsd_procedures[2,3]

     Here is a list for NFSv3. NFSv2 is almost the same:
       nfsd3_proc_null,
       nfsd3_proc_getattr,
       nfsd3_proc_setattr,
       nfsd3_proc_lookup,
       nfsd3_proc_access,
       nfsd3_proc_readlink,
       nfsd3_proc_read,
       nfsd3_proc_write,
       nfsd3_proc_create,
       nfsd3_proc_mkdir,
       nfsd3_proc_symlink,
       nfsd3_proc_mknod,
       nfsd3_proc_remove,
       nfsd3_proc_rmdir,
       nfsd3_proc_rename,
       nfsd3_proc_link,
       nfsd3_proc_readdir,
       nfsd3_proc_readdirplus,
       nfsd3_proc_fsstat,
       nfsd3_proc_fsinfo,
       nfsd3_proc_pathconf,
       nfsd3_proc_commit,

<3> NFSD file VFS operations

      The functions nfsd_xxx from "fs/nfsd/vfs.c"

With the above server-side trace hooks, LKET could trace NFS
operations at different layers:

      nfsd_dispatch
         --> NFS RPC Procedures
            --> NFS VFS file operations
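
   The same idea could be sketched for the server side (again, this is
only an illustration and not the LKET tapset itself). It assumes that
the server is built as the "nfsd" module and that debuginfo is
available. The wildcard covers the NFSv3 procedures listed above;
nfsd_read and nfsd_write are picked only as two examples of the
nfsd_xxx functions in fs/nfsd/vfs.c.

```systemtap
# Top layer: the dispatcher that sits on top of RPC.
probe module("nfsd").function("nfsd_dispatch")
{
	printf("%d %s(%d) -> nfsd_dispatch\n", gettimeofday_us(), execname(), pid())
}

# Middle layer: all NFSv3 server procedures, matched by wildcard.
probe module("nfsd").function("nfsd3_proc_*")
{
	printf("%d %s(%d)   -> %s\n", gettimeofday_us(), execname(), pid(), probefunc())
}

# Bottom layer: two example functions from fs/nfsd/vfs.c.
probe module("nfsd").function("nfsd_read"),
      module("nfsd").function("nfsd_write")
{
	printf("%d %s(%d)     -> %s\n", gettimeofday_us(), execname(), pid(), probefunc())
}
```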


   What I didn't list includes authentication, the NFSv4 callback and
RPC (I prefer to use a separate set of trace hooks for RPC). I am not
sure whether these operations also need to be traced. If I missed some
important functions or listed some redundant ones, please feel free to
let me know. Any comments will be highly appreciated.
