This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Fw: kprobe fault handling





Looks like my response didn't get copied to the mailing list.
- -



systemtap-owner@sourceware.org wrote on 06/02/2006 19:50:23:

> I've been trying to understand how kprobes fault handling is supposed to
> work and why it isn't doing what I thought it did.


It seems that over time this capability has been broken or removed.

The original design in dprobes was as follows:

While the probe handler (RPN interpreter) was running, the interpreter
hooked page faults. If one occurred, it would simply pop the trap
stack frame off the stack and jump to probe back-end processing, which
started with the single-step of the original instruction.

A little later, an optional page-fault call-back to the RPN script was added.
If the current script had such a call-back, then on a page fault during
script interpretation we would pop off the trap stack frame
as before, but instead of jumping to back-end processing we would
"longjump" to the page-fault call-back and continue probe
interpretation from that point.
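
A minimal user-space sketch of that flow, using setjmp()/longjmp() to stand
in for "pop the trap stack frame and jump". Every name here (struct probe,
run_probe, simulated_page_fault and so on) is hypothetical and made up for
illustration; none of it is dprobes or kprobes code.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical model of a probe; not a dprobes or kprobes structure. */
struct probe {
    void (*handler)(struct probe *p);        /* the probe handler / script   */
    void (*fault_callback)(struct probe *p); /* optional, may be NULL        */
    int handler_finished;                    /* handler ran to completion    */
    int callback_ran;                        /* fault call-back was entered  */
    int single_stepped;                      /* back-end processing was done */
};

static jmp_buf fault_recovery; /* state saved before the handler runs */
static int in_handler;         /* non-zero while handler code is live */

/* Stand-in for the hooked page-fault path. */
static void simulated_page_fault(void)
{
    if (in_handler)
        longjmp(fault_recovery, 1); /* discard the handler's frames */
    /* Faults outside the handler are not modelled in this sketch. */
}

static void run_probe(struct probe *p)
{
    p->handler_finished = 0;
    p->callback_ran = 0;
    p->single_stepped = 0;

    in_handler = 1;
    if (setjmp(fault_recovery) == 0) {
        p->handler(p);
        p->handler_finished = 1;
    } else {
        /* Reached via longjmp(): the handler faulted and was abandoned.
         * Assumption: faults inside the call-back itself are ignored here
         * so that the sketch cannot loop. */
        in_handler = 0;
        if (p->fault_callback != NULL) {
            p->fault_callback(p);
            p->callback_ran = 1;
        }
    }
    in_handler = 0;

    /* Back-end processing: begins with single-stepping the original
     * instruction, represented here by a flag only. */
    p->single_stepped = 1;
}

/* Example handlers for trying the sketch out. */
static void faulting_handler(struct probe *p) { (void)p; simulated_page_fault(); }
static void clean_handler(struct probe *p)    { (void)p; }
static void note_fault(struct probe *p)       { (void)p; }
```

With faulting_handler and no call-back, run_probe() goes straight to the
back end; with note_fault installed, the call-back runs first. Either way
the fault never escapes run_probe().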

Then kprobes came along and the call-back became an entry-point into the
probe-handler module. It was supposed to be 'longjumped' to if present and
the trap stack frame was supposed to be discarded by kprobes before the
longjump.

In other words, a page fault would always be silently handled and, optionally,
the probe-handler could elect to continue from a specified call-back point.
It was never the intent to allow an unhandled page fault to surface to the
kernel, except when single-stepping the original instruction.
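
Put as a decision table, the intended policy might look like the sketch
below. The enum and function names are hypothetical, and treating a fault
that occurs outside any probe as "pass to the kernel" is my assumption
rather than something the original design documents spelled out.

```c
/* Hypothetical phases of probe processing; not kprobes status values. */
enum probe_phase {
    PHASE_NONE,        /* no probe is being processed          */
    PHASE_HANDLER,     /* the probe handler is running         */
    PHASE_SINGLE_STEP  /* the original instruction is stepping */
};

enum fault_action {
    FAULT_TO_KERNEL,   /* let normal page-fault handling proceed   */
    FAULT_ABSORBED,    /* drop the trap frame, go to the back end  */
    FAULT_TO_CALLBACK  /* drop the trap frame, enter the call-back */
};

/* Decide what to do with a page fault, given where it occurred. */
static enum fault_action classify_fault(enum probe_phase phase, int has_callback)
{
    switch (phase) {
    case PHASE_HANDLER:
        /* Never surfaces to the kernel; the call-back is optional. */
        return has_callback ? FAULT_TO_CALLBACK : FAULT_ABSORBED;
    case PHASE_SINGLE_STEP: /* the one case meant to reach the kernel */
    case PHASE_NONE:        /* assumption: unrelated to probes        */
    default:
        return FAULT_TO_KERNEL;
    }
}
```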

If this is no longer the case, then someone has thrown out the baby with
the bath water.

Richard


>
> When page faults happen, do_page_fault() almost immediately calls
> notify_die(DIE_PAGE_FAULT,...). This calls the notifier chain, which calls
> kprobe_exceptions_notify(). This calls kprobe_fault_handler().
>
> kprobe_fault_handler() checks to see if there is a specific fault
> handler for that kprobe, and if there is, it calls it.  Question: What
> do we imagine a probe-specific page fault handler would do?  Why is it
> useful?
>
> Then there is this code, which I don't understand:
>    if (kcb->kprobe_status & KPROBE_HIT_SS) {
>       resume_execution(cur, regs, kcb);
>       regs->eflags |= kcb->kprobe_old_eflags;
>
>       reset_current_kprobe();
>       preempt_enable_no_resched();
>    }
>
> And that's it. kprobe_fault_handler returns 0.  No call to
> fixup_exceptions()!  So do_page_fault() will have to do the fixups, but
> first it will print nasty might_sleep warnings and maybe actually sleep!
>
> I could have sworn this was not the case previously but it has been a
> very long time since I have looked at the code at this level.  Anyway,
> this MUST be fixed.
>
> Martin
>
>



- -
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072

