This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Djprobes questions

From: Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>
To: Mathieu Desnoyers <compudj at krystal dot dyndns dot org>
Cc: Masami Hiramatsu <hiramatu at sdl dot hitachi dot co dot jp>, linux-kernel at vger dot kernel dot org, systemtap at sources dot redhat dot com, mbligh at google dot com, Satoshi Oshima <soshima at redhat dot com>, Yumiko Sugita <yumiko dot sugita dot yf at hitachi dot com>, Hideo Aoki <haoki at redhat dot com>
Date: Tue, 13 Mar 2007 15:07:09 +0900
Subject: Re: Djprobes questions
Organization: Systems Development Lab., Hitachi, Ltd., Japan
References: <20070312223749.GA22280@Krystal>

Hi Mathieu,

Mathieu Desnoyers wrote:
> Hi Masami,
> 
> I recently had to add support for inline code patching on i386 to my
> marker infrastructure. Clearly, it looks like what is done in djprobes,
> with the main difference that I only patch the immediate value of a 2
> bytes "load immediate" instruction.

That's interesting.

> I think I found a solution to one of the main issues with djprobes : it
> currently has to wait for each CPU to hit the probe before being sure
> that it's safe to patch the code with something else than an int3. This
> is due to PIII errata 49, which says that a CPU much execute a
> serializing instruction before executing cross-modified code.

Hmm, djprobe already might not wait for each CPU to hit the probe
point. It just wait scheduler synchronization instead of that.
And after that, it issues cpuid for cache serialization before
executing cross-modified code.

The most difficult point of the djprobe is that it has to replace
"live" instructions. So we must check other processors not to run
those instructions carefully.

> Here is what I do : While I use a breakpoint to fall in a trap for the
> CPUs that hit the site currently being modified, I also send an IPI to
> all CPUs so they execute cpuid. Once it returns, I am sure that every
> CPU has executed a serializing instruction, which enables me to go on
> with the complete code modification, therefore removing the initial
> breakpoint.

I think its OK. That is the same way which I've done in djprobe.

> Here is my code :
> 
> http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-lttng.git;a=blob;f=arch/i386/kernel/marker.c;h=89b06f02f0966685be260d6364a0dd94c3d14456;hb=v2.6.20-lttng
> 
> (Comments are welcome)
> 
> On a second note, looking at the djprobes code triggered some question 
> in my mind about the safety of using a worker thread to "make sure"
> every interrupt context has returned (so there is no IP pointing into
> the modified code). The following scenario might be possible : an
> interrupt handler (or trap handler) reenables interrupts, does irq_exit()
> or nmi_exit() (which reenables preemption) but does not do iret yet. My
> understanding is that it could be scheduled and have a return IP
> pointing to the code that is being modified. Am I right ?

Same idea was already discussed. It might work on normal kernel,
but, unfortunately, it doesn't work on preemptive kernel. And actually,
that idea is same as synchronize_sched(). So, I've used it on normal
kernel. In the case of preemptive kernel, currently, I'm using
freeze_processes() suggested by Ingo.

Anyway, I and Satoshi are developing a static analysis tool to
check whether target instructions can be replaced by long jump.
I'd like to release djprobe patch against latest kernel after
developed it.

Best regards,

-- 
Masami HIRAMATSU
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

Follow-Ups:
- Re: Djprobes questions
  - From: Mathieu Desnoyers

References:
- Djprobes questions
  - From: Mathieu Desnoyers

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]