This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [PATCH] Kprobes- robust fault handling for i386


On Tue, Feb 28, 2006 at 12:25:26PM -0800, Keshavamurthy Anil S wrote:
> On Tue, Feb 28, 2006 at 06:38:36AM -0800, Prasanna S Panchamukhi wrote:
> > 
> >    Anil,
> > 
> >    Thanks for your review comments. Please see the updated patch
> >    below, this patch is only for i386 architecture and once
> >    we are ok with it, we will port it to other architectures.
> This version looks good with no new Kprobes states.
> Makes life easy to understand :-)
> 
> >    [..]The main reason to avoid post_handler execution in this
> >    case is to avoid any inconsistent data references between pre and post
> >    handlers.
> Okay, I got that point, but if the fault recovery in pre_handler
> is *successful*, then in this case you *should* permit calling
> post_handler. See my inline comments to address this issue.

Anil,

To skip post_handler execution after an unsuccessful fault recovery in the
pre_handler, we would need to take several things into account, such as
aggregate kprobe handlers and the same kprobe structure being used when the
same probe is hit on different CPUs at the same time. These constraints
prevent us from skipping the post_handler after an unsuccessful fault
recovery for now. Please find below the patch, which allows post_handler
execution in all cases as of now.

Thanks
Prasanna

This patch provides proper kprobes fault handling for the case where a
user-specified pre/post handler faults while accessing user address space
through copy_from_user(), get_user() etc. The user-specified fault handler
is called only if the fault occurs while executing a user-specified
handler. In that case the user-specified fault handler is allowed to fix it
first. If it is unsuccessful, we try to fix it by calling
fixup_exception(). The user-specified fault handler is not called if
the fault happened while single-stepping the original instruction;
instead we reset the current probe and allow the system page fault
handler to handle it.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>



 arch/i386/kernel/kprobes.c |   57 +++++++++++++++++++++++++++++++++++++++------
 1 files changed, 50 insertions(+), 7 deletions(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-03-01 19:05:01.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c	2006-03-01 19:07:17.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -554,15 +555,57 @@ static inline int kprobe_fault_handler(s
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
-
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+	case KPROBE_HIT_SSDONE:
+		/*
+		 * We increment the nmissed count for accounting;
+		 * we can also use npre/npostfault count for accounting
+		 * these specific fault cases.
+		 */
+		kprobes_inc_nmissed_count(cur);
+
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * In case the user-specified fault handler returned
+		 * zero, try to fix up.
+		 */
+		if (fixup_exception(regs))
+			return 1;
+
+		/*
+		 * fixup_exception() could not handle it,
+		 * Let do_page_fault() fix it.
+		 */
+		break;
+	default:
+		break;
 	}
 	return 0;
 }

_
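
The dispatch order in the patched kprobe_fault_handler() can be modeled
in user space roughly as below. This is only a sketch for reasoning about
the control flow, not kernel code: every name here (model_status,
model_outcome, model_probe, model_fault_dispatch, handler_fixes,
handler_declines) is hypothetical, and the fixup_ok flag merely stands in
for the return value of fixup_exception().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kprobe_status values used in the patch. */
enum model_status {
	MODEL_HIT_ACTIVE,	/* fault while the pre-handler runs */
	MODEL_HIT_SS,		/* fault while single-stepping the probed insn */
	MODEL_REENTER,		/* fault while single-stepping a reentered probe */
	MODEL_HIT_SSDONE,	/* fault while the post-handler runs */
	MODEL_OTHER
};

/* What the modeled handler decided; the comments give the real return value. */
enum model_outcome {
	OUT_USER_HANDLER_FIXED,	/* user fault handler fixed it: return 1 */
	OUT_FIXUP_FIXED,	/* fixup_exception() fixed it: return 1 */
	OUT_PROBE_RESET,	/* probe reset, normal page fault: return 0 */
	OUT_UNHANDLED		/* left to do_page_fault(): return 0 */
};

struct model_probe {
	int (*fault_handler)(void);	/* may be NULL */
	unsigned long nmissed;
};

/* Two sample user fault handlers: one claims the fault, one declines. */
int handler_fixes(void) { return 1; }
int handler_declines(void) { return 0; }

enum model_outcome model_fault_dispatch(struct model_probe *p,
					enum model_status status,
					int fixup_ok)
{
	assert(p != NULL);

	switch (status) {
	case MODEL_HIT_SS:
	case MODEL_REENTER:
		/* The user fault handler is never consulted here. */
		return OUT_PROBE_RESET;
	case MODEL_HIT_ACTIVE:
	case MODEL_HIT_SSDONE:
		p->nmissed++;
		if (p->fault_handler && p->fault_handler())
			return OUT_USER_HANDLER_FIXED;
		if (fixup_ok)
			return OUT_FIXUP_FIXED;
		return OUT_UNHANDLED;
	default:
		return OUT_UNHANDLED;
	}
}
```

In this model a fault in the pre- or post-handler always bumps nmissed,
even when the user fault handler or the fixup path recovers from it, while
a fault during single-stepping leaves the count untouched.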
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329

