This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Fw: systemtap application to find applications doing polling
- From: Vaidyanathan Srinivasan <svaidy at linux dot vnet dot ibm dot com>
- To: William Cohen <wcohen at redhat dot com>
- Cc: Maneesh Soni <maneesh at in dot ibm dot com>, dipankar at in dot ibm dot com, ananth at in dot ibm dot com, SystemTAP <systemtap at sources dot redhat dot com>, Ulrich Drepper <drepper at redhat dot com>
- Date: Thu, 5 Feb 2009 22:35:14 +0530
- Subject: Re: Fw: systemtap application to find applications doing polling
- References: <20090129164643.GA17621@in.ibm.com> <20090201132815.GA6346@dirshya.in.ibm.com> <4989CB77.20706@redhat.com>
- Reply-to: svaidy at linux dot vnet dot ibm dot com
* William Cohen <wcohen@redhat.com> [2009-02-04 12:08:07]:
> Vaidyanathan Srinivasan wrote:
> > * Maneesh Soni <maneesh@in.ibm.com> [2009-01-29 22:16:43]:
> >
> >> Is this something useful for energy management?
> >
> > Hi Maneesh,
> >
> > This would be useful for energy management as Ulrich has noted in his
> > blog. The rate of wakeups is reported by PowerTop using
> > /proc/timer_stats, where even device driver timers and in-kernel
> > offenders are identified.
> >
> > Once the userspace application is identified, then further details on
> > the type of polling loops and syscall and library APIs will definitely
> > help optimise the user applications.
> >
> > Will's script will help to identify the types of polling loops and the top
> > offenders at run time in a user-space application.
> >
> >> ----- Forwarded message from William Cohen <wcohen@redhat.com> -----
> >>
> >> Date: Wed, 28 Jan 2009 11:52:10 -0500
> >> From: William Cohen <wcohen@redhat.com>
> >> To: SystemTAP <systemtap@sources.redhat.com>
> >> CC: Ulrich Drepper <drepper@redhat.com>
> >> Subject: systemtap application to find applications doing polling
> >>
> >> Hi All,
> >>
> >> Uli Drepper mentions in a blog entry the need to "avoid unnecessary wakeups"
> >> and that a systemtap script to monitor this would be useful:
> >>
> >> http://udrepper.livejournal.com/19041.html
> >>
> >> I talked with Uli about developing a script that identifies the processes that
> >> are doing a lot of polling. The attached script, timeout.stp, monitors
> >> poll, epoll_wait, select, futex, nanosleep, and timers (it_real_fn). The poll and
> >> epoll calls are only recorded if the timeout value is greater than zero. The resulting
> >> output is displayed in a top-like format for the top twenty processes with the
> >> entries ordered from most problem calls to fewest. The columns indicate the
> >> count of each type. The output ends up like the following:
> >>
> >>   uid |   poll  select   epoll  itimer   futex nanosle  signal| process
> >>  2628 |      0     364       0       0       0       0       0| Xorg
> >>  3586 |     21       0       0       0     179       0       0| thunderbird-bin
> >>  3575 |     41       0       0       0       0      20       0| xchat
> >>  3454 |      0      60       0       0       0       0       0| emacs
> >>  3325 |     43       0       0       0       0       0       0| gnome-terminal
> >>  3082 |     11       0       0       0       0       0       0| gnome-panel
> >>  3068 |      7       0       0       0       0       0       0| metacity
> >>  3181 |      6       0       0       0       0       0       0| wnck-applet
> >>  3119 |      0       5       0       0       0       0       0| httpd
> >>  2135 |      4       0       0       0       0       0       0| hald
> >>  2307 |      4       0       0       0       0       0       0| NetworkManager
> >>  2362 |      4       0       0       0       0       0       0| setroubleshootd
> >>  2530 |      0       0       0       0       0       4       0| cups-polld
> >>  3084 |      3       0       0       0       0       0       0| nautilus
> >>  3616 |      0       0       0       0       3       0       0| firefox
> >>  3060 |      2       0       0       0       0       0       0| gnome-settings-
> >>  2304 |      2       0       0       0       0       0       0| hald-addon-stor
> >>     0 |      0       0       0       1       0       0       0| swapper
> >>
> >> I plan to check this into the systemtap.examples directory in the next day or so.
> >> I am just looking to see if people have additional suggestions.
> >>
> >> -Will
> >
> > The output information and format are good. I have the following
> > comments and suggestions:
> >
> > * Display the observation interval in the output and provide options
> > for, say, 1s or 10s sampling
>
> It is possible to have an optional argument in a systemtap script, as in
> para-callgraph.stp:
>
> http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=blob_plain;f=testsuite/systemtap.examples/general/para-callgraph.stp;hb=HEAD
>
> > * At a low wakeup rate, does the systemtap script itself add to the
> > wakeups?
>
> No effort is made to filter out the impact of the systemtap code from the
> output. I don't see the effect in the output of timeout.stp, but in powertop I can
> see some effect:
>
> 41.8% (1000.0) staprun : __mod_timer (__stp_time_timer_callback)
> 41.8% (1000.0) udevd : __mod_timer (__stp_time_timer_callback)
> 4.2% (100.0) staprun : __mod_timer (__utt_wakeup_timer)
> 4.2% ( 99.8) staprun : queue_delayed_work (delayed_work_timer_fn)
>
> This makes me wonder if there is some way to reduce staprun's effect.
This wakeup rate is very high, which implies that we should use the
stap script only for per-application wakeup tracing and should
not try to profile the overall system.
There is definitely some opportunity here for stap to reduce wakeups :)
But what is causing udevd to wake up so often?
> > * Do these values match closely with PowerTop?
>
> Powertop shows a rate and the current timeout script shows the total
> accumulation. If the timeout script is adjusted to print every 10 seconds and
> clear out the data then a more direct comparison can be made. I made that change
> and looked at the output. There appear to be some differences in what each is
> measuring. Powertop reads /proc/timer_stats; I need to check how that
> differs from what timeout.stp is probing.
Overall wakeup rate shown by powertop is averaged over nr_cpus. The
per application/thread wakeup count is accurate as far as I have
determined from experiments and I have also compared against
/proc/interrupts. (LOC is the local timers)
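
A rough sketch of the /proc/interrupts cross-check, in Python rather than stap. The helper names and the two sample snapshots are made up for illustration and are not taken from powertop or timeout.stp. The idea is to read the LOC line twice, a known interval apart, and turn the difference into a total rate and a per-CPU average:

```python
def loc_counts(interrupts_text):
    """Return the per-CPU local timer interrupt counts from /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            # The numeric fields after the label are per-CPU counts; the
            # trailing words are the description of the interrupt source.
            for field in fields[1:]:
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    raise ValueError("no LOC line found")


def timer_rates(before_text, after_text, interval_s):
    """Return (total wakeups/s, average wakeups/s per CPU) between two snapshots."""
    before = loc_counts(before_text)
    after = loc_counts(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas) / interval_s
    return total, total / len(deltas)


# Hypothetical snapshots taken 10 seconds apart on a two-CPU machine.
BEFORE = "           CPU0       CPU1\nLOC:       1000       2000   Local timer interrupts\n"
AFTER = "           CPU0       CPU1\nLOC:       3000       2500   Local timer interrupts\n"

print(timer_rates(BEFORE, AFTER, 10.0))
```

On a live system the two snapshots would come from reading /proc/interrupts before and after the observation interval. The per-CPU average is the figure to set against powertop's overall rate.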
> > * Can we aggregate these values for a group of PIDs (possibly
> > parent pid or tgid) so that we can collect results for a complete
> > application stack easily? I have tried doing this by manually
> > adding up wake-ups for a group of PIDs.
>
> There have been examples with PID filters that limit the scope to some
> subset of PIDs and their children. Put the PID and any children into an
> associative array and then check the associative array before doing the probing
> operation.
Yeah, this should be easy with stap.
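
The same roll-up can also be done as post-processing on the script's output. A minimal sketch in Python, where the function name, the pid-to-group mapping, and the numbers are illustrative assumptions only:

```python
from collections import defaultdict


def aggregate_by_group(wakeups, group_of):
    """Sum per-PID wakeup counts into per-group totals.

    wakeups:  dict mapping pid -> wakeup count
    group_of: dict mapping pid -> group id (for example tgid or parent pid)
    A PID with no entry in group_of is counted under its own pid.
    """
    totals = defaultdict(int)
    for pid, count in wakeups.items():
        totals[group_of.get(pid, pid)] += count
    return dict(totals)


# Hypothetical data: pid 3590 is assumed to belong to the group led by 3586.
wakeups = {3586: 200, 3590: 15, 3616: 3}
group_of = {3586: 3586, 3590: 3586}

print(aggregate_by_group(wakeups, group_of))
```

The mapping could be filled from whatever grouping is of interest (tgid, parent pid, or a hand-picked application stack). The summing step stays the same.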
> > * Another wishlist item would be the ability to add probes at various
> > locations in libraries and move closer to userspace code.
>
> There has been some work on userspace probing for systemtap. It isn't in a
> packaged distro yet, but there should be one for Fedora coming out soon.
> However, this needs utrace in the kernel.
Looking forward to this feature. This will bring statistics and
tracing closer to libraries where there may be better scope for
optimisations.
Thanks,
Vaidy