This is the mail archive of the
systemtap@sources.redhat.com
mailing list for the systemtap project.
RE: [KPROBE][RFC] Tweak to the function return probe design
- From: "Lynch, Rusty" <rusty dot lynch at intel dot com>
- To: "Jim Keniston" <jkenisto at us dot ibm dot com>
- Cc: "SystemTAP" <systemtap at sources dot redhat dot com>
- Date: Wed, 8 Jun 2005 14:20:28 -0700
- Subject: RE: [KPROBE][RFC] Tweak to the function return probe design
From: Jim Keniston [mailto:jkenisto@us.ibm.com]
>Sent: Wednesday, June 08, 2005 1:51 PM
>To: Lynch, Rusty
>Cc: SystemTAP
>Subject: Re: [KPROBE][RFC] Tweak to the function return probe design
>
>On Wed, 2005-06-08 at 09:06, Rusty Lynch wrote:
>> From my experiences with adding return probes to x86_64 and ia64, and
the
>> feedback on LKML to those patches, I think we can simplify the design
>> for return probes.
>
>All in all, I like this a lot. Comments below.
>>
>...
>> If multiple return probes are registered for a given target
function,
>> then arch_prepare_kretprobe() will get called multiple times for
the
>same
>> task (since our kprobe implementation is able to handle multiple
>kprobes
>> at the same address.) Past the first call to
arch_prepare_kretprobe,
>> we end up with the original address stored in the return probe
instance
>> pointing to our trampoline function. (This is a significant
difference
>> from the original arch_prepare_kretprobe design.)
>
>I understand how it's necessary in your design that only the first
>(earliest hashed) instance contain the real return address. As I've
>mentioned before, this may occasionally be inconvenient for people
>writing handlers. If there's sufficient outcry, we can always add a
>function -- get_return_addr(struct kretprobe_instance*) or some such --
>that iterates through the hash list 'til it finds the real RA.
True, I never considered a handler needing to know the real return
address, but I can image where a handler would like to know such info.
>
>...
>>
>> * trampoline_probe_handler() looks up the newest return probe
instance
>> associated with the current task and then:
>> - calls the registered handler function
>> - sets the pt_regs instruction pointer back to the original
>> return address
>> - marks the instance as free
>> - returns in a way that the single stepping of the original
>> instruction is skipped
>>
>> If there were multiple kretprobes registered for the target
function,
>> then the original return address would be
trampoline_probe_handler,
>causing
>> the next instance to be handled until finally the real return
address
>> is restored.
>>
>> (Huge change)
>
>Not for ia64. :-)
Ok, busted.
> Consider the possibility of iterating down the hash
>list rather than bouncing repeatedly on the trampoline, incurring the
>trap overhead each time. Presumably you would not have to hold the
list
>locked the whole time you're calling the handlers.
>
Ah... good idea, I'll do that.
>>
>> * If the task is killed with some left-over return probe instances
>(meaning
>> that a target function was entered, but never returned), then we
just
>> free any instances associated with the task. (Not much different
other
>> then we can handle this without calling architecture specific
>functions.)
>>
>> BUT... I just mark each of the instances as unused without
bothering to
>> restore their original return address. The original implementation
>would
>> restore the return address for each instance (I assume because it
was
>> thought that the target function could still be running and could
still
>> return.)
>>
>> On i386 we do this cleanup from process.c:exit_thread() and
>> process.c:flush_thread(). Let me know if I am wrong, but I do not
>think
>> at this point in the task lifecycle that it is possible for the the
>> target functions to continue after this point and get into trouble
>> because of a wrong return address.
>>
>> (Significant change)
>
>do_execve calls load*binary, which calls flush_old_exec, which calls
>flush_thread, which calls kprobe_flush_task. I think we would be in
>trouble if somebody set a return probe on flush_old_exec or
>flush_thread. As mentioned previously, we could avoid that by refusing
>to register return probes on those functions.
I like the idea of refusing to register a return probe on the
problematic functions.
>>
>> This patch applies to the 2.6.12-rc6-mm1 kernel, but only for the
i386
>> architecture. I know this approach will work on x86_64 and ia64, and
I
>> haven't the slightest clue if this is ok for ppc64.
>>
>> --rusty
>
>Good design, and good writeup.
>Jim