This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: thoughts about exception-handling requirements for kprobes


On Mon, Mar 20, 2006 at 10:39:51AM -0800, Keshavamurthy Anil S wrote:
> On Sun, Mar 19, 2006 at 09:24:54AM -0800, Prasanna S Panchamukhi wrote:
> > 
> >    On Fri, Mar 17, 2006 at 01:50:57PM -0800, Keshavamurthy Anil S wrote:
> >    > On Thu, Mar 09, 2006 at 07:57:18AM -0800, Richard J Moore wrote:
> >    > >
> >    > >    I've been thinking about the need for exception-handling and how the
> >    > >    current implementation has become a little muddled.
> >    >
> >    > Here is my thinking on this kprobe fault handling...
> >    > Ideally we want the ability to recover from all
> >    > the page faults happening from either pre-handler
> >    > or happening from post-handler transparently in the
> >    > same way as the normal kernel would recover from
> >    > do_page_fault() function. In order for this to happen,
> >    > I think we should not be calling pre-handler/post-handler
> >    > with preempt disabled, which would be a major design change.
> >    > Also in the current code if fixup_exception() fails to
> >    > fixup the exception then falling back on the normal
> >    > do_page_fault() is a bad thing with preempt disabled.
> >    >
> >    > I was thinking on this issue for the past several days
> >    > and I believe that currently we are disabling preempt
> >    > before calling pre/post handler, because we don't
> >    > want the process to get migrated to different CPU
> >    > and we don't want another process to be scheduled
> >    > while we are servicing kprobe as the newly scheduled
> >    > process might trigger another probe and we don't
> >    > have space to save the kprobe control block (kprobe_ctlblk)
> >    > info, because we save kprobe_ctlblk in the per-CPU structure.
> >    >
> >    > If we move the kprobe_ctlblk save area to the task struct, then
> >    > I think we will have the ability to call pre/post-handler
> >    > without having to disable preempt, and thereby any faults
> >    > happening from either pre/post handler can recover transparently
> >    > in the same way as the normal kernel would recover.
> >    >
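The storage problem described above can be sketched in user-space C. This is only an illustration under stated assumptions: struct ctlblk_sketch, struct task_sketch, and the enter/resume helpers are hypothetical names invented here, not kernel APIs, and "preemption" is modelled simply by a second task entering a probe on the same CPU before the first one resumes.

```c
#include <assert.h>

/* Hypothetical stand-in for the saved kprobe control block. */
struct ctlblk_sketch {
    unsigned long probe_addr;   /* address of the probe being serviced */
};

/* Hypothetical task: the CPU it runs on plus a private save area. */
struct task_sketch {
    int cpu;
    struct ctlblk_sketch ctlblk;
};

#define NR_CPUS_SKETCH 2

/* One save slot per CPU, shared by every task that runs on that CPU. */
static struct ctlblk_sketch per_cpu_ctlblk[NR_CPUS_SKETCH];

/* Per-CPU scheme: a task entering a probe writes the slot of its CPU. */
static void enter_probe_percpu(const struct task_sketch *t, unsigned long addr)
{
    per_cpu_ctlblk[t->cpu].probe_addr = addr;
}

/* Per-CPU scheme: on resume, a task reads whatever is in the CPU slot. */
static unsigned long resume_probe_percpu(const struct task_sketch *t)
{
    return per_cpu_ctlblk[t->cpu].probe_addr;
}

/* Per-task scheme: each task saves into its own structure. */
static void enter_probe_pertask(struct task_sketch *t, unsigned long addr)
{
    t->ctlblk.probe_addr = addr;
}

/* Per-task scheme: on resume, a task reads back its own saved state. */
static unsigned long resume_probe_pertask(const struct task_sketch *t)
{
    return t->ctlblk.probe_addr;
}
```

If tasks A and B both run on CPU 0 and B enters a probe while A is still inside its handler, A reads back B's address from the shared per-CPU slot, while the per-task variant returns each task's own value.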
> > 
> >    Kprobes user-specified pre/post handlers are called within
> >    interrupt context, and if we allow page faults within a
> >    user-specified pre/post handler, then it might sleep.
> >    Is it OK to sleep while within the interrupt handler?
> Prasanna,
> 	I am not sure what you are asking here. If you are
> asking whether it is okay to sleep within an interrupt handler,
> then the answer is a BIG NO.

Anil,

> 
> What I am saying is that we should look into kprobes to see
> if we can support calling the user's pre/post handlers
> without having to disable preempt.
> 
> Currently we are calling the user's pre_handler() and post_handler()
> with preempt disabled. If the user has put a probe on
> syscalls, then when his pre/post handlers are called he is
> bound to call copy_from_user(), which has a might_sleep() check.
> might_sleep() calls in_atomic(), which checks preempt_count(),
> and if preempt_count() is greater than zero (in our case it is indeed greater
> than zero, since we are calling pre/post handlers with preempt disabled)
> the kernel prints an error message:
> printk(KERN_ERR "Debug: sleeping function called from invalid"
>                                 " context at %s:%d\n", file, line);
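
The check quoted above can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel code: the preempt counter is a single global here (an assumption made for illustration), the real might_sleep() also looks at other conditions such as whether interrupts are disabled, and user_pre_handler() and run_handler() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's preempt counter (one global). */
static int preempt_count_val;

/* Counts how many times the debug message was emitted. */
static int warnings;

static void preempt_disable(void) { preempt_count_val++; }
static void preempt_enable(void)  { preempt_count_val--; }

/* Atomic context is modelled as "preempt count is non-zero". */
static int in_atomic(void) { return preempt_count_val != 0; }

/* Emit the debug message if called while in_atomic() is true. */
static void might_sleep_at(const char *file, int line)
{
    if (in_atomic()) {
        warnings++;
        fprintf(stderr, "Debug: sleeping function called from invalid"
                        " context at %s:%d\n", file, line);
    }
}
#define might_sleep() might_sleep_at(__FILE__, __LINE__)

/* Hypothetical handler; in the kernel it would call copy_from_user(). */
static void user_pre_handler(void)
{
    might_sleep();
}

/* Run the handler with or without preempt disabled and return the
 * number of warnings it triggered. */
static int run_handler(int disable_preempt)
{
    int before = warnings;

    if (disable_preempt)
        preempt_disable();
    user_pre_handler();
    if (disable_preempt)
        preempt_enable();

    return warnings - before;
}
```

Calling run_handler(1) triggers the message once, while run_handler(0) triggers none, which is the behaviour difference being discussed.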

Are you saying that by allowing preemption in the kprobes
handlers, the debug message above can be avoided?

> 
> Also, if we want to fall back on do_page_fault() from kprobe_fault_handler()
> to recover the page, then we should not be in a preempt-disabled state.

We actually do not want to fall back on the system do_page_fault(), because
it might sleep. When a pre/post handler page faults, we can just try
calling fixup_exception() (on non-ia64 architectures) and so avoid calling
the actual do_page_fault(), which might sleep.
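
The fixup-only approach can be sketched in user-space C as a lookup in a table of (faulting instruction, fixup address) pairs. Everything here is a hypothetical model: the table contents, struct fake_regs, and the *_sketch function names are invented for illustration, and the real fixup_exception() is architecture-specific. What should happen when no fixup entry exists is left open, so the sketch only reports that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exception-table entry: where a fault may occur and
 * where execution should continue if it does. */
struct extable_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Minimal stand-in for the saved register state. */
struct fake_regs {
    unsigned long ip;
};

/* Made-up table contents, for illustration only. */
static const struct extable_entry extable[] = {
    { 0x1000, 0x2000 },
    { 0x1010, 0x2010 },
};

/* Return 1 and redirect ip if the faulting address has a fixup,
 * otherwise return 0 and leave the registers untouched. */
static int fixup_exception_sketch(struct fake_regs *regs)
{
    size_t i;

    for (i = 0; i < sizeof(extable) / sizeof(extable[0]); i++) {
        if (extable[i].insn == regs->ip) {
            regs->ip = extable[i].fixup;
            return 1;
        }
    }
    return 0;
}

enum fault_result { FAULT_FIXED, FAULT_NOT_HANDLED };

/* Fault path for a handler that must not sleep: try the fixup table
 * only, and never call a page-fault routine that might sleep. */
static enum fault_result kprobe_fault_sketch(struct fake_regs *regs)
{
    if (fixup_exception_sketch(regs))
        return FAULT_FIXED;
    return FAULT_NOT_HANDLED;
}
```

A fault at a listed address resumes at its fixup address; a fault at any other address comes back as FAULT_NOT_HANDLED with the registers unchanged.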

Thanks
Prasanna
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329

