This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: health monitoring scripts


On 08/20/2009 02:45 PM, Frank Ch. Eigler wrote:
> Hi -
> 
> I was asked to share some snippets of an old idea regarding a possible
> compelling application for systemtap.  Here goes, from a few months
> back:
> 
> ------------------------------------------------------------------------
> 
> The technical gist of the idea would have several parts: to create a
> suite of systemtap script fragments (a new "health" tapset); and to
> build one or more front-ends for monitoring the ongoing probes graphically
> and via other tools.
> 
> Each tapset piece would represent a single subsystem or concern.  A
> variety of probes could act to represent a view of its health
> (tracking allocation counts, rates of activity, latency trends,
> whatever makes sense).  The barest sketch ...
> 

... sketch removed ...

I've been taking a stab at implementing this.  Here's what I've discovered.

At first, the basic framework idea itself didn't work correctly.
Whether something got included in the final module depended on which
tapset file it was in.  After driving me slightly (more) nuts, I
narrowed down the behavior into pr10568 ('translator issue with
tapsets').  From what I can tell, this has never worked correctly.  Josh
fixed the problem, and now the basic framework idea works correctly (and
it is pretty slick).

On IRC, Frank mentioned to me several (theoretically) simple things to
monitor: number of context switches, number of interrupts, and number of
network buffers.  Here's the status of each (based on looking at the
source to Fedora's 2.6.31-0.125.rc5 kernel):

- number of context switches:  You can see the current number of context
switches by looking at the 'ctxt' line in /proc/stat.  This information
comes from calling the nr_context_switches() function in kernel/sched.c.
nr_context_switches() sums the nr_switches field of each per-CPU
runqueue structure (which contains lots of interesting information).
Unfortunately, neither the nr_context_switches() function nor the
underlying runqueue data structure is exported.  The nr_switches field
of the runqueue structure gets incremented in schedule(), but it is
possible for schedule() to increment nr_switches more than once (and we
have no way to detect this).

- number of interrupts: You can see the current number of all of the
various types of interrupts for each cpu by looking in /proc/interrupts,
or you can look at the 'intr' line in /proc/stat for a sum of
interrupts.  Each architecture is responsible for implementing its own
interrupt counts (since each arch can have different types of
interrupts).  For x86, this is handled by
arch/x86/kernel/{irq.c,irq_64.c,irq_32.c}.  The arch_irq_stat_cpu()
function gets this information from a per-CPU irq_stat structure.
Unfortunately, neither the arch_irq_stat_cpu() function nor the
underlying irq_stat structure is exported.  If we wanted to just monitor
the number of NMI interrupts (for example), that gets incremented by
do_nmi() (in arch/x86/kernel/traps.c), but since that function is marked
as __kprobes, we can't put a kprobe on it.

- number of network buffers: The kernel doesn't really keep track of the
number of network buffers (skb's).  skb's are created using
__alloc_skb() and deleted using __kfree_skb() (both functions are in
net/core/skbuff.c).  It is possible to put a kprobe on both functions so
we can keep a running total of network buffers.  However, it doesn't
appear to be (easily) possible to get the correct initial value of
network buffers currently allocated.
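
For comparison, the two totals that /proc/stat does expose (the 'ctxt'
line and the first number on the 'intr' line) can be polled from user
space.  The following is only a rough sketch of that fallback, not a
systemtap probe, and it does nothing for the skb count.  The function
names parse_proc_stat(), read_counters() and counter_deltas() are made
up for illustration.

```python
def parse_proc_stat(text):
    """Extract the 'ctxt' and 'intr' totals from /proc/stat-style text.

    Returns a dict with whichever of the two keys were found.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "ctxt":
            # Total context switches since boot.
            counters["ctxt"] = int(fields[1])
        elif fields[0] == "intr":
            # The first number is the sum of all interrupts serviced;
            # the remaining fields are per-interrupt counts.
            counters["intr"] = int(fields[1])
    return counters


def read_counters(path="/proc/stat"):
    """Read and parse the counters from a Linux /proc/stat file."""
    with open(path) as f:
        return parse_proc_stat(f.read())


def counter_deltas(before, after):
    """Change in each counter present in both samples."""
    return {key: after[key] - before[key] for key in after if key in before}
```

Calling read_counters() twice, some interval apart, and passing both
results to counter_deltas() gives the activity over that interval, which
is probably more useful for health monitoring than the totals since
boot.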

So, basically I've struck out trying to implement these specific health
monitoring points.  If anyone knows a good way to get some of these
values, I'd be happy to listen.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

