
Re: [pcp] suitability of PCP for event tracing


Hi guys,

----- "Ken McDonell" <kenj@internode.on.net> wrote:

> On 16/09/2010 12:07 PM, Greg Banks wrote:
> > Frank Ch. Eigler wrote:
> 
> I think we should pursue the discussion of this approach a little
> further.  There is only one layer of buffering needed, at the PMDA.
> 
> So far, the obstacles to this approach would appear to be ...
> 
> 1. Need buffer management and per-client state in the PMDA (actually
> it is per PMAPI context, which is a little more complicated, but
> doable) ...
> I don't see either issue as a big deal, and together they are an order
> of magnitude simpler than supporting the sort of asynchronous callbacks
> from PMCD that have been suggested.
> 
> 2. Latency for event notification ... the client can control the
> polling interval (down to a few milliseconds demonstrably works), so
> I expect you'd be able to tune the latency to match the semantic
> demands.  If really low latency is needed then any TPC-based
> mechanism is probably

[sp. TCP]  :) ... local context mode (PM_CONTEXT_LOCAL) could be used
in that situation, which would map more closely to the current trace
tools and doesn't use TCP.  I haven't seen any reason why this scheme
won't work for our little-used local context friend; good thing we did
not remove that code, eh Ken?  ;)

> not going to work well, and PCP may be the wrong tool for that space.

Local context should be fine, and perhaps that should be the default
mode for any generic PCP tracing client tool (which, I imagine, we'll
soon be needing).
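
To make that concrete, a minimal local-context polling client would
look something like this (sketch only: the "trace.events" metric name
is invented, and error handling is trimmed):

    #include <stdio.h>
    #include <pcp/pmapi.h>

    int
    main(void)
    {
        char     *names[] = { "trace.events" };  /* hypothetical metric */
        pmID      pmid;
        pmResult *result;
        int       sts;

        /* PM_CONTEXT_LOCAL loads PMDAs as DSOs into this process,
         * so there is no pmcd and no TCP on the fetch path at all */
        if ((sts = pmNewContext(PM_CONTEXT_LOCAL, NULL)) < 0) {
            fprintf(stderr, "pmNewContext: %s\n", pmErrStr(sts));
            return 1;
        }
        if ((sts = pmLookupName(1, names, &pmid)) < 0) {
            fprintf(stderr, "pmLookupName: %s\n", pmErrStr(sts));
            return 1;
        }
        for (;;) {                  /* poll as fast as latency demands */
            if ((sts = pmFetch(1, &pmid, &result)) < 0)
                break;
            /* ... unpack and display the event array ... */
            pmFreeResult(result);
            /* a short sleep here sets the notification latency */
        }
        return 0;
    }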

> 
> c. does not break any existing PMDAs or PMAPI clients
> 

I guess it remains to be seen what (existing) tools will do with the
trace data ... I'm guessing for the most part they will ignore it, as
many of them already do for the STRING/AGGREGATE types (pmie, pmval,
etc.).  So there's still plenty of work to be done to do a good job of
adding support to the client tools - almost certainly a new
tracing-specific tool will be needed.

> d. be doable in a very short time ... for instance wrapping an array
> of events inside a "special" data aggregate is simple and isolated,
> and there is already the basis for the required PMCD-PMDA interaction
> to ensure the context id is known to the PMDA, and the existing
> context cleanup code in PMCD provides the place to notify PMDAs that
> a context will no longer be requesting events.
> 
> So, can anyone mount a convincing argument that the requirements
> would demand changes to allow asynchronous behaviour between PMAPI
> clients <---> PMCD <---> PMDAs?

My main concerns center on the PMDA buffering scheme ... things like,
how does a PMDA decide what a sensible timeframe for buffering data is?
(We'll probably need some kind of per-PMDA memory limit on buffer size,
rather than a time frame.)  Also, will the PMDA have to keep track of
which clients have been sent which (portions of?) buffered data?  (In
the case of multiple clients with different request frequencies this
might get a bit hairy.)
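
To sketch the memory-bounded idea (every name below is invented, there
is nothing like this in libpcp_pmda yet): the PMDA keeps one fixed-size
ring of events and a per-context cursor, so each client sees an event
at most once, and a slow client loses old events rather than growing
the buffer without bound.

    #define EVENT_RING_SIZE 4096        /* bound by memory, not time */

    typedef struct {
        unsigned long long serial;      /* monotonically increasing id */
        /* ... event payload ... */
    } trace_event_t;

    static trace_event_t      ring[EVENT_RING_SIZE];
    static unsigned long long next_serial;  /* next serial to write */

    /* one of these per PMAPI context that PMCD tells us about */
    typedef struct {
        int                ctx;         /* context id from PMCD */
        unsigned long long last_seen;   /* highest serial delivered */
    } client_cursor_t;

    /* On a fetch for context c: deliver events with serials from
     * c->last_seen + 1 up to next_serial - 1, clamped to the oldest
     * entry still in the ring, then advance c->last_seen. */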

Also, we've not really considered the additional requirements that we
have in archive mode.  Unlike the sampled data, traces have explicit
start and end points, which we will need to know about.  For example,
if I construct a chart with a starting offset (-S) at 10am and ending
(-T) at 10:15, and a trace started at 9:45 and completed at 10:10,
I'd expect to see that trace displayed, even though the trace data
would (AIUI, in this proposal) all be stored at the time the trace
was sampled.  Actually, I'm not sure how this will look - does a
trace have to end before a PMDA would see it?  That'd be a bit lame.
Or would we export start and end events separately?  Then we need a
way to tie them back together in the client tools.  Or, in this
example of a long-running trace (relative to the client sample time),
does the PMDA report "trace X is in-progress" on each sample?  That'd
be a bit wasteful of disk space ... hmm, it's not clear what the best
approach here will be.

We could extend the existing temporal index to cover the start/end
times of traces, so we can quickly find whether a client sample covers
a trace.  Either way, I suspect "trace start" and "trace end" may each
need to be a new metric semantic (in addition to the PM_SEM_COUNTER,
PM_SEM_INSTANT and PM_SEM_DISCRETE semantics we have now).
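
If we did export start and end separately, tying them back together
could be as simple as a shared trace id in both records; a purely
hypothetical sketch (none of these names exist anywhere):

    #include <sys/time.h>

    typedef enum {
        TRACE_EVENT_START,
        TRACE_EVENT_END
    } trace_event_type_t;

    typedef struct {
        unsigned long long trace_id;  /* same in start and end records */
        trace_event_type_t type;
        struct timeval     stamp;     /* when the boundary occurred */
    } trace_boundary_t;

A client replaying an archive would keep a table keyed on trace_id:
insert on TRACE_EVENT_START, match and remove on TRACE_EVENT_END, and
anything still in the table at the -T offset is an in-progress trace.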

> If not, I strongly suggest we work to flesh out the changes needed to
> make a variable-length array of structured event records available
> through the existing poll-based APIs.

I'm not far away from sending out some prototype JSON PDU support (I
got distracted after starting with XML, then tossing it and switching
to JSON); it adds a libpcp_json library that I think would be handy
here.

FWIW, the structured data approach should be just fine for capturing
the parent/child trace relationship, which I want us to tackle as well
(from those papers I forwarded); for traces that support this concept
we can add those as additional JSON maps (or XML elements, or ...), so
I am content there.
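
For instance (field names invented, just to show the shape), an event
that knows its parent might carry something like:

    {
      "trace": {
        "id":     42,
        "parent": 7,
        "start":  1284608700.0,
        "end":    1284610200.0
      }
    }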

A lot of work here, but it's all fascinating stuff and it's going to
be great fun to code!

cheers.

-- 
Nathan

