This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Questions about kprobes implementation on SMP and preemptible kernels
- From: Quentin Barnes <qbarnes at urbana dot css dot mot dot com>
- To: Ananth N Mavinakayanahalli <ananth at in dot ibm dot com>
- Cc: systemtap at sources dot redhat dot com, dwilder at us dot ibm dot com, grundym at us dot ibm dot com
- Date: Wed, 17 Jan 2007 22:29:21 -0600
- Subject: Re: Questions about kprobes implementation on SMP and preemptible kernels
- References: <20070116034826.GA22002@urbana.css.mot.com> <20070116053955.GA14386@in.ibm.com>
One of two things must be true. Either: 1) There is a
guaranteed unbroken chain of execution on the processor from
the point of its exception occurring all the way through to
kprobe_exceptions_notify() and onward, or 2) There is no
guarantee of an unbroken chain -- execution is allowed to
swap to another processor between the exception and executing
kprobe_exceptions_notify().
If #1 is true, preempt_disable()/preempt_enable() is superfluous and
calling kprobe_running() is safe without the protection. If #2 is
true, then calling preempt_disable()/preempt_enable() is pointless
since there is no guarantee we're on the same processor as the
exception so calling kprobe_running() is an unreliable way to
determine what processor the probe that triggered the exception was
on. One way to solve that would be to save the ID of the processor
that triggered the exception as part of the exception context, and
to consult that saved state to determine which processor the kprobe
triggered on instead of just calling kprobe_running().
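
To illustrate the "save the processor ID in the exception context" idea,
here is a small user-space model in C. It is only a sketch and not kernel
code: every name in it (model_exc_ctx, model_current_kprobe,
model_kprobe_running, and so on) is hypothetical, and the per-CPU storage
is modelled as a plain array indexed by CPU number.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for this model */

/* Stand-in for a registered probe; only the address matters here. */
struct model_kprobe {
    unsigned long addr;
};

/* Hypothetical exception context that records the trapping CPU. */
struct model_exc_ctx {
    int cpu;              /* CPU that took the exception */
    unsigned long pc;     /* address that raised the exception */
};

/* Per-CPU "currently running probe" slots, indexed by CPU number. */
static struct model_kprobe *model_current_kprobe[MODEL_NR_CPUS];

/*
 * Runs at exception entry, before anything could move execution to
 * another CPU, and records which CPU actually trapped.
 */
static void model_exception_entry(struct model_exc_ctx *ctx,
                                  int this_cpu, unsigned long pc)
{
    assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);
    ctx->cpu = this_cpu;
    ctx->pc = pc;
}

/*
 * Looks up the running probe through the CPU saved in the context,
 * not through whatever CPU the caller happens to be running on now.
 */
static struct model_kprobe *
model_kprobe_running(const struct model_exc_ctx *ctx)
{
    if (ctx->cpu < 0 || ctx->cpu >= MODEL_NR_CPUS)
        return NULL;
    return model_current_kprobe[ctx->cpu];
}
```

With this shape the lookup stays correct even if the handler later runs
on a different processor, because the answer depends only on the saved
ctx->cpu value.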
Can someone explain which is true? (Or is neither true and my
understanding of the Linux kernel is just wrong?)
#1 is true. The notify_page_fault() call in do_page_fault() happens early
enough, and there isn't anything in the page fault handling code up to
this point that could break the chain of execution. The reason we have a
preempt_(en/dis)able in there is 'cos the Linux kernel has multiple
interfaces to get smp processor id and there are caveats for usage of
each variant. kprobe_running() calls __get_cpu_var() which requires
preemption to be disabled and on some archs, ends up calling
smp_processor_id() which in turn, is not implicitly preemption safe.
Having __get_cpu_var() require preemption be disabled on invocation
makes perfect sense. If you're calling __get_cpu_var(), you'd
better have preemption disabled, either by incrementing the preempt
count or by disabling interrupts. Otherwise, the code calling
__get_cpu_var() in all likelihood has a very serious bug.
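
The rule being described can be modelled in a few lines of user-space C.
This is a sketch under stated assumptions, not the kernel's
implementation: the model_* names are hypothetical, and the state is kept
in two plain variables rather than in per-thread and per-CPU data. The
point it shows is that a per-CPU access is safe when either the preempt
count is non-zero or interrupts are disabled, so a sanity check has to
accept both conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Model state: a non-zero count means preemption is disabled. */
static int model_preempt_count;
/* Model state: true means interrupts are masked on this CPU. */
static bool model_irqs_disabled;

static void model_preempt_disable(void)
{
    model_preempt_count++;
}

static void model_preempt_enable(void)
{
    assert(model_preempt_count > 0);   /* catch unbalanced enables */
    model_preempt_count--;
}

/* The task can be preempted only if neither guard is in effect. */
static bool model_preemptible(void)
{
    return model_preempt_count == 0 && !model_irqs_disabled;
}

/*
 * A per-CPU access is safe exactly when the task cannot be preempted
 * (and therefore cannot be moved to another CPU) in the middle of it.
 */
static bool model_percpu_access_ok(void)
{
    return !model_preemptible();
}
```

A check that looked only at model_preempt_count would wrongly flag a
caller that runs with interrupts disabled, which is the kind of busted
check speculated about in the next paragraph.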
It seems to me that the preempt_disable()/preempt_enable() calls in
kprobe_exceptions_notify() are either at best inert, or possibly
hiding an existing bug. Either way, they shouldn't be there even on
the architectures you mentioned. If on those architectures invoking
__get_cpu_var() failed, it was because it was trying to point out
an existing bug in the code. (Or the checks themselves are busted
because they forgot to account for interrupts blocking preemption
and hence they need to be corrected.)
I'm wondering if I found a bug in the ARM implementation of the prefetch
abort and data abort exception handlers for SMP platforms with kernel
preemption enabled. Immediately after switching to SVC mode in
__pabt_svc and __dabt_svc, the handler re-enables IRQs (interrupts)
if they were enabled prior to the exception. If it re-enables
interrupts at this point, it seems to me that a preemptive kernel
(CONFIG_PREEMPT defined) could switch execution to another processor
breaking the chain of execution before it has a chance to note which
processor triggered the exception. Any ARM Linux kernel experts on
this list to comment, or do I need to bounce this to another list?
In the __irq_svc path (IRQ interrupt handling), the handler
increments the preempt count before reenabling IRQs (and later
restores its previous value after servicing the interrupt) when
built for a preemptive kernel.
Quentin