This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: Getting user-space stack backtraces in more probe contexts


> > 1. Work with what you got.
[...]
> I like it as a fallback / heuristic.  (Plus we should be able to fall
> back to frame-pointer heuristics and/or the kernel's guesswork.)

I wouldn't call this a "fallback".  Rather, it seems like the natural
thing to try first and then only fall back to other (or additional)
means when you run into a need for a trap-state register that you
don't have.

So, you'd proceed with user-only unwinding as I described.  When you
come across a need for a register value and that register's state is
"undefined", then you pause to go off and do kernel-side unwinding
from your base probe state back to a user-mode state.  If you succeed
in producing what you think are the user-mode registers at the
boundary crossing, then you can return to the user-mode unwinding
state and replace every user register there in "undefined" or
"same-value" state with the value just recovered via kernel unwinding.

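To make the merge step concrete, here is a minimal Python sketch of
filling in the user-mode register state from the kernel-side unwind.
It is only an illustration under stated assumptions: the names Rule,
merge_recovered, user_state and recovered are hypothetical and do not
correspond to anything in the systemtap runtime.

```python
from enum import Enum


class Rule(Enum):
    """Per-register unwind state (hypothetical model, not stap's)."""
    UNDEFINED = "undefined"
    SAME_VALUE = "same-value"
    KNOWN = "known"


def merge_recovered(user_state, recovered):
    """Fill in user registers from values recovered by kernel unwinding.

    user_state maps register name -> (Rule, value or None).
    recovered maps register name -> value at the user/kernel boundary.
    Only registers in "undefined" or "same-value" state are replaced;
    registers the user-side unwinder already computed are left alone.
    """
    merged = {}
    for reg, (rule, value) in user_state.items():
        if rule in (Rule.UNDEFINED, Rule.SAME_VALUE) and reg in recovered:
            merged[reg] = (Rule.KNOWN, recovered[reg])
        else:
            merged[reg] = (rule, value)
    return merged


# Hypothetical example state: rip already known, rbx and r12 not.
user_state = {
    "rip": (Rule.KNOWN, 0x400123),
    "rbx": (Rule.SAME_VALUE, None),
    "r12": (Rule.UNDEFINED, None),
}
recovered = {"rip": 0xDEAD, "rbx": 7, "r12": 9}
merged = merge_recovered(user_state, recovered)
```

In this sketch the kernel-side unwind is assumed to have run only
because a needed register was found in "undefined" state; registers
the user-side unwinder had already computed keep their values.
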
Of course, if the same probe site also happens to use a kernel
backtrace, then you should just have it do the kernel backtrace
calculation beforehand (even if it's used later in the probe action
script code than the ubacktrace is) to prime the state for the user
backtrace.  (No need for laziness when you know you're going to do it
anyway.)

> > 2. Turtles all the way down!
[...]
> I like it.  This seems like the best first try.

Well, I'm not sure what you mean by "first".  This is little work on
the stap/runtime side and almost entirely just a big dependency on the
kernel compilation details being what you need.  So you can try it
just as soon as you are using a bleeding-edge kernel, as of this
writing not yet compiled anywhere by anyone except for my home build,
and only on x86.  (Or, go to town right away on ia64, where everything
is already peachy in its own way.)

The upstream x86 kernel, when built with a sufficiently recent
assembler, may well have the CFI for the important assembly layers by
2.6.35 or so.  Fedora x86 kernels will have it much sooner, but
probably only ever in updates for 12 and 13.

A further note about kernel CFI.  The current x86 kernels are built
with -fno-asynchronous-unwind-tables, so the compiler will sometimes
fudge the CFI state in between call sites.  This becomes relevant when
you want to unwind across an interrupt/trap frame where kernel-mode
code got interrupted.  The assembly CFI for the actual interrupt frame
will be correct, so you unwind through it to the exact state that got
interrupted.  But if what was interrupted was near code changing CFI
state between calls, there may be problems.  A likely example is if
the interrupted instruction was before pushing some arguments on the
stack for a call.  You may get CFI that thinks the SP is in the state
before any of the pushes, or after all of them, for a PC where that's
not right.  This could get you off by a word or more in judging the
CFA of that interrupted frame, which will lead you to wrong values for
its caller's PC, CFA, or other registers.
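
A small worked example may help show how an off-by-a-word CFA turns
into a wrong caller PC.  The Python sketch below uses made-up
addresses and a made-up 4-byte-word stack layout; unwind_frame and the
offsets are assumptions for illustration, not real kernel CFI.

```python
WORD = 4  # assumed 4-byte words, as on i386

# Hypothetical stack contents of the interrupted frame (address -> value).
stack = {
    0x0FFC: 0xC0100000,  # return address, stored at CFA - WORD
    0x0FF8: 0x1111,      # a saved callee register
    0x0FF4: 0x2222,      # first argument already pushed for the next call
}


def unwind_frame(sp, cfa_offset, memory):
    """Apply a rule of the form CFA = SP + cfa_offset, then read the
    caller's PC from the word just below the CFA."""
    cfa = sp + cfa_offset
    return cfa, memory[cfa - WORD]


sp_at_interrupt = 0x0FF4  # interrupted after one argument push

# CFI that accounts for the push gives the right CFA and return address.
good = unwind_frame(sp_at_interrupt, 12, stack)

# CFI still describing the state before the push is one word short,
# so the "return address" is read from the saved-register slot.
bad = unwind_frame(sp_at_interrupt, 8, stack)
```

Here good is (0x1000, 0xC0100000), while bad is (0x0FFC, 0x1111): the
CFA is one word too low, and the value taken for the caller's PC is
really a saved register.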

> > 3. Two phase with a safe point
[...]
> I don't like it as much.  It's far more complex, 

Agreed on complexity.  OTOH, it is in a larger sense a complexity
reducer when one takes advantage of it to abandon prepacking CFI from
userland binaries and instead do dynamic user CFI unwinding (just as
userland does).

> plus I would like not
> to sacrifice the ability to process backtraces as first class run-time
> objects.

I don't quite understand this.  Making backtraces a special type is
what I'd call a "first-class" object, contrary to the status quo.  I
suspect you mean "... process backtraces as normal run-time strings".

> > 4. Pre-collect via syscall-entry tracing
> > [...]
> 
> I like this, but doesn't appear to handle the interrupt / signal /
> preemption type involuntary jumps into kernel space.

I'm sorry I was not clear.  I referred to "non-system-call entries" or
"other entries" when talking about different kinds of kernel entry,
and these include all those kinds you mention.  You get complete
information for these and can use 'struct pt_regs' or user_regset
freely.  It is only the system call paths (and not even those on i386)
that save partial register information and could require any of these
complex techniques.


Thanks,
Roland

