This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Better handling of arguments/literals from the command-line?


Hi Ted,

On Sun, 2008-10-05 at 23:05 -0400, Theodore Tso wrote:
> On Sun, Oct 05, 2008 at 10:47:09AM +0200, Mark Wielaard wrote:
> > 
> > It is less than 10 seconds on my machine, but still some seconds. Which
> > does indeed seem somewhat long if we can find the cached files
> > eventually.
> 
> Really?  I wonder why it takes so long on my system.  (An X61s laptop,
> with an 1.6Ghz Intel Core 2 Duo, 2gigs memory, 7200rpm hard drive)
> What sort of system do you have?

Well, less than 10, but still 8 seconds (it felt less yesterday).
Workstation: 3.2Ghz, 2GB, 10.000rpm

>   Is there some way the initial
> compilation can be sped up?  I considered it rather disturbing that it
> was taking so long; at greater than 10-15 seconds, usability design
> really demands a progress bar, and some indication of when the
> systemtap script is actually executing as opposed to compiling.

stap -v gives you at least the timings of Pass 1-5. So you know where
the time is being spend.

This is almost completely Pass 2 ():
Pass 2: analyzed script: 792 probe(s), 118 function(s), 14 embed(s),
        1 global(s) in 7810usr/480sys/8309real ms.
Pass 3 and 4 used cached files after the first run and so don't take
much time.

The analyzing translates the syscall.*[.return] probes into almost 800
individual kernel.function("sys_*")[.return] probes (with -vv you can
see which). And then for all these individual probes it calculates how
to get the arguments from the dwarf info to create the argstr and
returnstr used in the script. I am somewhat surprised this takes so
long, but this seems indeed what it is mainly doing (opreport -l):

samples  %        image name           symbol name
41080    13.1230  stap                 query_module(Dwfl_Module*, void**, char const*, unsigned long, void*)
29392     9.3893  libdw.so.1           __libdw_find_attr
17950     5.7341  libdw.so.1           __libdw_form_val_len
17344     5.5406  libdw.so.1           lookup
11379     3.6350  libc-2.8.so          strlen
10429     3.3316  libstdc++.so.6.0.10  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
9987      3.1904  stap                 std::tr1::__detail::_Map_base<std::string, std::pair[.truncated.]>::operator[](std::string const&)
8064      2.5761  libc-2.8.so          _int_malloc
6993      2.2339  stap                 dwflpp::iterate_over_functions(int (*)(Dwarf_Die*, void*), void*)
6429      2.0538  libc-2.8.so          memcpy
6059      1.9356  libdw.so.1           dwarf_siblingof
5818      1.8586  libstdc++.so.6.0.10  std::string::assign(std::string const&)

One contributing factor here is that I have 1715 kernel modules
installed (standard Fedora it seems) and that for each probe point (800
in this script) query_module() is called for each of them (so that is
1.372.000 calls). Although query_module() does seem to cache its results
for each module, that is still a lot. Of course in this case all the
probes are for sys_* functions which are never in a module, but
systemtap doesn't know that.

It would be nice if we could simply cache the results of Pass 2 also
when the .stp and debuginfo files used haven't changed. Then rerunning a
script would be almost instant.

Cheers,

Mark


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]