This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Proposed systemtap access to perfmon hardware


wcohen wrote:

> To try to get a feel on how the performance monitoring hardware
> support would work in SystemTap I wrote some simple examples. 

Nice work.  To flesh out the operational model (please correct me if
I'm wrong), here is how I understand it would all work:

- The systemtap translator would be linked with libpfm from perfmon2.
  (libpfm license is friendly.)

- This library would be used at translation time to map perfmon.* probe
  point specifications to PMC register descriptions (pfmlib_output_param_t).
  (This will require telling the system the exact target cpu type for
  cross-instrumentation.)

- These descriptions would be emitted into the C code, for actual
  installation during module initialization.  For our first cut, since
  there appears to exist no kernel-side management API at the moment,
  the C code would directly manipulate the PMC registers.  (This means
  no coexistence with oprofile or any other concurrent user of the
  performance counters.  C'est la vie.)

- The "sample" type perfmon probes would map to the same kind of
  dispatch/callback as the current "timer.profile": the probe handler
  should have valid pt_regs available.

- The free-running type perfmon probes, probably named
  "perfctr.SPEC.setup" (or ".start" or ".begin"), would map to a
  one-time initialization that passes a token (PMC counter number?) to
  the handler.  Other probe handlers can then query/manipulate the
  free-running counter using that token via the start/stop/query
  functions.
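
A rough sketch of the free-running model above, in plain C, may make
the token idea more concrete.  Everything in it is an assumption made
for illustration: struct pmc_desc is a simplified stand-in for whatever
the translator would emit from the libpfm output, the perfmon_* and
sim_events names are hypothetical, and the "hardware" is just a set of
arrays so the logic can be exercised in user space.  A real module
would program the actual registers instead.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 4  /* assumed number of counters; real CPUs vary */

/* Hypothetical stand-in for one register description computed at
   translation time and emitted into the generated C code. */
struct pmc_desc {
    unsigned reg_num;    /* which counter to program */
    uint64_t reg_value;  /* event-select bits (placeholder values) */
};

/* Simulated hardware state; a real module would touch the registers. */
static uint64_t sim_control[NUM_COUNTERS];
static uint64_t sim_count[NUM_COUNTERS];
static int sim_running[NUM_COUNTERS];

/* One-time initialization: install a description and hand back a
   token, here simply the counter number. */
int perfmon_setup(const struct pmc_desc *d)
{
    assert(d->reg_num < NUM_COUNTERS);
    sim_control[d->reg_num] = d->reg_value;
    sim_count[d->reg_num] = 0;
    sim_running[d->reg_num] = 0;
    return (int)d->reg_num;
}

/* Other probe handlers use the token to control and read the counter. */
void perfmon_start(int token)
{
    assert(token >= 0 && token < NUM_COUNTERS);
    sim_running[token] = 1;
}

void perfmon_stop(int token)
{
    assert(token >= 0 && token < NUM_COUNTERS);
    sim_running[token] = 0;
}

uint64_t perfmon_query(int token)
{
    assert(token >= 0 && token < NUM_COUNTERS);
    return sim_count[token];
}

/* Simulation hook: pretend n events occurred; only counters that are
   currently running advance. */
void sim_events(uint64_t n)
{
    for (int i = 0; i < NUM_COUNTERS; i++)
        if (sim_running[i])
            sim_count[i] += n;
}
```

In this sketch a counter that has been set up but not started stays at
zero, and events that occur after perfmon_stop are not counted, which
is the behaviour a handler would presumably expect from the
start/stop/query functions.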

Is that sufficiently detailed to begin an implementation?


> [...] print ("ipc is %d.%d \n", ipc/factor, ipc % factor);

(An aside: we should have a more compact notation for this.  We won't
support floating point numbers, but integers are commonly scaled
like this.  Maybe printf("%.Nf", value), where N implies a
power-of-ten scaling factor, and printf("%*f", value, scale) for
general factors.)
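
To illustrate what the power-of-ten case would have to do, here is a
small sketch in plain C.  The helper name format_scaled and its
signature are hypothetical, not an existing systemtap or libc function.
It also shows one detail the "%d.%d" idiom misses: the fractional part
needs zero padding, otherwise a value of 1005 with factor 1000 prints
as "1.5" rather than "1.005".

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format `value` as if divided by 10^digits,
   using integer arithmetic only.  Returns what snprintf returns. */
int format_scaled(char *buf, size_t len, int64_t value, unsigned digits)
{
    uint64_t factor = 1;

    assert(digits <= 18);  /* 10^18 still fits in 64 bits */
    for (unsigned i = 0; i < digits; i++)
        factor *= 10;

    /* Work on the magnitude so that the sign is printed only once. */
    uint64_t mag = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;
    const char *sign = value < 0 ? "-" : "";

    if (digits == 0)
        return snprintf(buf, len, "%s%" PRIu64, sign, mag);

    /* Zero-pad the fraction to exactly `digits` places. */
    return snprintf(buf, len, "%s%" PRIu64 ".%0*" PRIu64,
                    sign, mag / factor, (int)digits, mag % factor);
}
```

With this helper, format_scaled(buf, sizeof buf, 1005, 3) produces
"1.005" and format_scaled(buf, sizeof buf, -25, 1) produces "-2.5".
The general-factor variant would need a different rule for choosing
the number of fraction digits, which this sketch does not attempt.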


- FChE

