This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: [RFC] Systemtap translator support for hardware breakpoints on

From: Roland McGrath <roland at redhat dot com>
To: Prerna Saxena <prerna at linux dot vnet dot ibm dot com>
Cc: systemtap at sourceware dot org
Date: Thu, 7 Jan 2010 15:14:57 -0800 (PST)
Subject: Re: [RFC] Systemtap translator support for hardware breakpoints on
References: <4B459CC8.2030402@linux.vnet.ibm.com>

> probe kernel.data(ADDRESS).write
> probe kernel.data(ADDRESS).rw
> probe kernel.data(ADDRESS).length(LEN).write
> probe kernel.data(ADDRESS).length(LEN).rw
> probe kernel.data("SYMBOL_NAME").write
> probe kernel.data("SYMBOL_NAME").rw
> 
> The 'length' construct is at present only supported with an address, and
> not extended to symbol names. Wherever 'length' is not specified, the 
> translator requests a hardware breakpoint probe of length 1.
> If an invalid length is specified which is not supported by the 
> architecture, the translator skips registration for that probe with a 
> warning about incompatible length.

I don't understand why the literal vs symbolic selection of an address has
anything to do with .length().  The syntax should be consistent:

probe kernel.data({ADDRESS,"NAME"})[.length(LEN)].{read,write}

I don't think the specification of the script-language feature should have
that exact wording about the length.  On other machines (IIRC all but x86,
in fact), there is no option for length, it's always aligned-word-size.

On other machines there is a hard alignment constraint on the address.
e.g. on powerpc, it doesn't even store the low two bits of the address.

On x86, there is a length-based implicit alignment constraint.  That is,
for .length(2) the address is rounded down to a multiple of 2, for
.length(4) to a multiple of 4, and for .length(8) to a multiple of 8.

On x86, and I think also on powerpc and probably all others, the meaning of
the addr+length (i.e. fixed addr+8 on powerpc) range is that access to any
byte within that range triggers the watchpoint, not just an exact access
that specifies that address and length.

Nothing will complain about misalignment (unless maybe the hw_breakpoint
layer does, I don't recall), but the low bits will be ignored so
kernel.data(0x123).length(2) really means kernel.data(0x122).length(2).
The translator should prevent you from doing something that nonobvious.
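
To make the alignment rule concrete, here is a minimal Python sketch of the
masking described above and of the check the translator could apply.  The
function names and the table of lengths are illustrative assumptions, not
actual translator code.

```python
# Hypothetical helpers; they model the rule, not the real translator.
X86_LENGTHS = (1, 2, 4, 8)  # assumed set of x86 lengths (8 only on x86-64)

def effective_address(addr, length):
    """Address the hardware really watches: the low bits are ignored."""
    if length not in X86_LENGTHS:
        raise ValueError(f"unsupported length {length}")
    return addr & ~(length - 1)

def check_aligned(addr, length):
    """Refuse a misaligned request instead of silently moving it."""
    eff = effective_address(addr, length)
    if eff != addr:
        raise ValueError(
            f"kernel.data({addr:#x}).length({length}) is misaligned; "
            f"the hardware would watch {eff:#x}")
    return addr

print(hex(effective_address(0x123, 2)))  # 0x122
```

With this check, kernel.data(0x123).length(2) is rejected with a message
naming 0x122, rather than quietly becoming a probe on a different address.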


Next issue.  So that's the basic primitive feature.  I take this to mean
that "SYMBOL" is just a way to specify the address with an ELF symbol.
i.e. it means the same as ADDRESS aside from runtime address bias.

What I think you'd really want is:

	probe kernel.data($variable).write { ... }

That means "variable" gets looked up by $ rules, but in some idea of the
"global" scope for kernel (search all CUs' top-level scopes, I guess).
That could be a C++ $foo::bar::baz without doing the mangling yourself,
etc.  It uses the type information to decide the size of $variable and
implicitly set the length for the breakpoint.

Note you can use the full expressivity of $expr here,
i.e. $variable->field1->field2.  (IIRC our $ syntax uses only ->
and never . though in this case it's only valid to use what should
be $variable.field1.field2, i.e. static offset calculations rather
than pointer indirections.)

IOW, we would elaborate:

	probe kernel.data($var::iable->second_int_field)

as:

	probe kernel.data("_Zmangle_var_iable"+4).length(4)

Note that with $ syntax:

	probe module("foo").data($foovar)

makes sense to indicate which "global" context $foovar resolves in.
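
The elaboration of a $ expression into symbol+offset and length can be
sketched roughly as follows.  This is a Python model only: the LAYOUTS table
is a hypothetical stand-in for the DWARF type information, the struct layout
(two 4-byte ints) is assumed, and only a single -> step is handled.

```python
from dataclasses import dataclass

@dataclass
class Field:
    offset: int
    size: int

# Assumed layout: struct { int first_int_field; int second_int_field; }
LAYOUTS = {
    "var::iable": {
        "symbol": "_Zmangle_var_iable",  # placeholder mangled name from the text
        "size": 8,
        "fields": {
            "first_int_field": Field(0, 4),
            "second_int_field": Field(4, 4),
        },
    },
}

def elaborate(expr, scope="kernel"):
    """Turn '$var' or '$var->field' into a symbol+offset/length probe point."""
    name, _, field = expr.lstrip("$").partition("->")
    info = LAYOUTS[name]
    if field:
        f = info["fields"][field]
        offset, size = f.offset, f.size
    else:
        offset, size = 0, info["size"]
    target = f'"{info["symbol"]}"' + (f"+{offset}" if offset else "")
    return f"{scope}.data({target}).length({size})"

print(elaborate("$var::iable->second_int_field"))
```

Passing scope='module("foo")' models the module("foo").data($foovar) case,
where the scope decides which "global" context the name resolves in.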


Third issue.  So I mentioned that on powerpc, all you get is an aligned
8-byte watchpoint.  Consider:

	int a;	// 0x124
	int b;	// 0x128
	int c;	// 0x12c

If you wanted:

	probe kernel.data($c).write

i.e.:

	probe kernel.data(0x12c).length(4).write

you can't get it.  To start with, the translator says .length(8) is your
only option, you can't have it.  Then it says 0x12c is not aligned to 8,
you can't have it.  For real .write semantics, this is true, and so be it.
All you can do is probe kernel.data(0x128).write and tell that $b and/or $c
were touched.  For .read, you are just SOL and that's all you can do.
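
As a small worked check of the example addresses, the following Python
sketch computes which aligned 8-byte window covers a variable and which of
the three variables share that window.  The 8-byte window size is taken from
the text; the 4-byte int size and the helper names are assumptions.

```python
WORD = 8  # fixed, aligned watchpoint size assumed for powerpc

# name -> (address, size); addresses from the example, int assumed 4 bytes
VARS = {"a": (0x124, 4), "b": (0x128, 4), "c": (0x12c, 4)}

def covering_window(addr, size, word=WORD):
    """Base of the aligned window that contains [addr, addr+size)."""
    base = addr & ~(word - 1)
    if addr + size > base + word:
        raise ValueError("object straddles two aligned windows")
    return base

def vars_in_window(base, word=WORD):
    """Names of all variables overlapping [base, base+word)."""
    return sorted(name for name, (a, s) in VARS.items()
                  if a < base + word and a + s > base)

base = covering_window(*VARS["c"])
print(hex(base), vars_in_window(base))
```

For c at 0x12c the window starts at 0x128 and also contains b, so a hit on
that watchpoint cannot by itself say which of the two was accessed.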

But probably the common use is that you really wanted:

	probe kernel.data($c).change { ... }

For this, we can do it for you with some implicit magic.  
i.e., translate this into:

	probe begin { c_val = $c }
	probe kernel.data(0x128).length(8) {
	  new_c = $c;
	  if (new_c == c_val) next;
	  c_val = new_c;
	  ...
	}

but really do it with implicit runtime stuff rather than through scriptese.

When you have:

	probe kernel.data($b).change { some_stuff() }
	probe kernel.data($c).change { other_stuff() }

then you can turn this into:

	probe begin { b_val = $b; c_val = $c }
	probe kernel.data(0x128).length(8) {
	  new_b = $b; new_c = $c;
	  if (new_b != b_val) some_stuff();
	  if (new_c != c_val) other_stuff();
	  b_val = new_b;
	  c_val = new_c;
	}

Note that consistent .change semantics would require doing this even when
there isn't a size/alignment mismatch, just so that you distinguish "value
changed" from "same value written again".
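
The .change dispatch can be modelled in a few lines of Python.  This is only
a sketch of the idea: the ChangeWatch class and its method names are
hypothetical, and a plain dict stands in for kernel memory.

```python
class ChangeWatch:
    """Several .change probes sharing one hardware watchpoint."""

    def __init__(self, memory):
        self.memory = memory  # dict name -> value, stand-in for kernel memory
        self.handlers = {}    # name -> handler for that variable
        self.last = {}        # name -> last value seen

    def add(self, name, handler):
        self.handlers[name] = handler
        self.last[name] = self.memory[name]  # the "probe begin" snapshot

    def on_write_hit(self):
        """Called whenever the shared watchpoint fires."""
        for name, handler in self.handlers.items():
            new = self.memory[name]
            if new != self.last[name]:  # skip "same value written again"
                self.last[name] = new
                handler(new)

memory = {"b": 1, "c": 2}
fired = []
watch = ChangeWatch(memory)
watch.add("b", lambda v: fired.append(("b", v)))
watch.add("c", lambda v: fired.append(("c", v)))

memory["c"] = 5; watch.on_write_hit()  # c changed
memory["c"] = 5; watch.on_write_hit()  # same value rewritten: no handler
memory["b"] = 7; watch.on_write_hit()  # b changed
print(fired)  # [('c', 5), ('b', 7)]
```

Each hit re-reads every variable behind the watchpoint and runs only the
handlers whose value actually differs from the saved copy.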


Next, a related tangent.  For:

	probe kernel.data(0x120).length(8)

on i386 you can implicitly turn that into:

	probe kernel.data(0x120).length(4), kernel.data(0x124).length(4)

It is probably worth doing that for length 8 on i386, since "long long"
does get used there.  You could theoretically do it for larger things too,
but that is probably not worth bothering with.  On x86 you get to set up to
4 breakpoints, so x86-64 could cover up to 32 bytes this way.  But
e.g. powerpc has just the one, so you'll never actually win doing this even
for a 16-byte range.
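
The splitting rule can be sketched in Python as below.  The function name
and parameters are illustrative assumptions; the per-architecture limits
(4 slots on x86, 1 on powerpc, maximum lengths 4 and 8) come from the text.
For simplicity the sketch only handles ranges that are already aligned to
the maximum length.

```python
def split_range(addr, length, max_len, slots):
    """Split [addr, addr+length) into aligned max_len-sized watchpoints.

    Returns a list of (address, length) pairs, or None if the range is
    not aligned or needs more breakpoints than the hardware has.
    """
    if addr % max_len or length % max_len:
        return None  # sketch only covers aligned, whole-word ranges
    pieces = [(a, max_len) for a in range(addr, addr + length, max_len)]
    return pieces if len(pieces) <= slots else None

# i386: 8 bytes become two 4-byte watchpoints
print([(hex(a), n) for a, n in split_range(0x120, 8, max_len=4, slots=4)])
# x86-64: 32 bytes use all four 8-byte slots
print(len(split_range(0x100, 32, max_len=8, slots=4)))
# powerpc: a 16-byte range does not fit in the single 8-byte watchpoint
print(split_range(0x100, 16, max_len=8, slots=1))
```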


That's enough magic for one message, so I'll leave the really exciting
subject to a separate strand of the thread.


Thanks,
Roland

Follow-Ups:
- Re: [RFC] Systemtap translator support for hardware breakpoints on
  - From: Frank Ch. Eigler

References:
- [RFC] Systemtap translator support for hardware breakpoints on
  - From: Prerna Saxena
