This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: Hitachi djprobe mechanism

From: Mathieu Desnoyers <compudj at krystal dot dyndns dot org>
To: Satoshi Oshima <soshima at redhat dot com>
Cc: karim at opersys dot com, Richard J Moore <richardj_moore at uk dot ibm dot com>, systemtap at sources dot redhat dot com, Andi Kleen <ak at suse dot de>, Masami Hiramatsu <hiramatu at sdl dot hitachi dot co dot jp>, Masami Hiramatsu <masami dot hiramatsu at gmail dot com>, michel dot dagenais at polymtl dot ca, Roland McGrath <roland at redhat dot com>, sugita at sdl dot hitachi dot co dot jp
Date: Wed, 3 Aug 2005 21:12:46 -0400
Subject: Re: Hitachi djprobe mechanism
References: <OF331D042E.CCADD212-ON41257050.002DC63A-41257050.002F8168@uk.ibm.com> <42EE7E97.7080501@redhat.com> <42EE86AD.609@opersys.com> <42EE9E4B.7060204@redhat.com> <42EEAAB0.3030902@opersys.com> <42EFBE56.9080403@redhat.com>

* Satoshi Oshima (soshima@redhat.com) wrote:
> I see.
> 
> We should add another limitation to djprobe limitation list.
> Current list is ...
> ------------------------------------------
> 
> limitation of djprobe
> 
> djprobe user must avoid inserting a probe into the place below:
> 
> code includes relative jmp instruction
> code includes call instruction
> code includes int instruction

Well... When you say "code includes int instruction", this is really not what I
mean by "code being interrupted".

Interrupts are asynchronous to the executing code; they may happen anywhere
interrupts are not disabled. You can still have an int instruction that
synchronously raises an interrupt, and yes, it's not safe to overwrite one. But
the primary problem is asynchronous interrupts.

> functions that preempt current process such as sched() or might_resched()
> 

Well, if you run a voluntarily preemptible kernel, those will be explicit
calls. On the other hand, on a fully preemptible kernel the scheduler can be
called from anywhere in your code (through an asynchronous interrupt).
Any place where neither interrupts nor preemption are disabled is at risk.

> >The only way you could limit that is if you did a static analysis
> >and forbade any insertion of probes on any instruction preceeding
> >a call that _may_ result in a process scheduling ... Surely you see
> >this can't scale.
> 
> I don't see why that analysis is required.
> We can simply suggest that user should avoid a call
> instruction.
> 
> The problem is EIPs on the stack that point into the code
> being replaced. So there is no problem when they don't
> try to replace a call instruction.
>

Asynchronous interrupts can return to any instruction that is not in a region
where interrupts are disabled. No call instruction is needed to have this
problem. In fact, it is even worse: non-maskable interrupts can return
_anywhere_, except in the fault handler code (a double fault is handled by an
abort, if I remember correctly).
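
To make the concern concrete, here is a minimal sketch of the condition that
matters for one saved return address. The helper name and the 5-byte constant
are my own assumptions for illustration (5 bytes being the size of an x86 near
jmp with a 32-bit displacement); this is not djprobe code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed patch size: x86 near jmp = opcode 0xE9 + 32-bit displacement. */
#define JMP_REL32_LEN 5u

/*
 * Hypothetical helper: true if a saved EIP points strictly inside the
 * bytes overwritten by the jmp. Resuming there would start executing in
 * the middle of the new instruction. A saved EIP equal to probe_addr is
 * not flagged, since it lands on the start of the int3/jmp.
 */
static bool eip_inside_patch(uintptr_t saved_eip, uintptr_t probe_addr)
{
    return saved_eip > probe_addr &&
           saved_eip < probe_addr + JMP_REL32_LEN;
}
```

Any interrupted context, not only a caller sitting behind a call instruction,
can hold such a saved EIP.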

> 
> >>In addition, all CPUs run on the bypass code after the int3 bypass
> >>is created. (In other words, once the int3 bypass is set,
> >>no CPU pushes the address of a replaced instruction onto its stack.)
> >>
> >>So we need to take care of EIPs in the current process of all CPUs
> >>and on the interrupt stack. We are now implementing this check code
> >>and will provide it soon.
> >
> >But you have no way to figure out whether what you've found on the
> >stack is an address to some piece of code or just some valid data ...
> 
> We are implementing two different way to check this.
> 
> First one:
> Each interrupt handler pushes the EIP on the stack into djprobe's
> per-CPU data structure before calling do_irq or something,
> and pops the EIP after returning. To check safety,
> djprobe looks through these pushed EIPs.
> 
> djprobe can easily check EIPs which are included on stacks.
> 
> But we are afraid that upstream would not accept this
> approach. So we are now trying another one.
>

Well, it will clearly have a performance cost on live systems, and I am not
sure many people will like that.

> 
> Second one:
> Simply look through the current stack and the interrupt stack.
> djprobe may find data that happens to equal an address to be replaced.
> When that happens, djprobe can simply postpone the replacement
> and wait for the next check.
> 
> This implementation adds some delay before int 3 is replaced with
> jmp. But the probe code is still valid through kprobe and there is
> no other side effect. The probe cost is the same as kprobe's.
> 

How do you plan to check all processors' stacks?
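
For a single stack, I assume the conservative scan you describe would look
roughly like the sketch below. The function name, the flat array standing in
for a stack, and the 5-byte window are all hypothetical. It treats every word
as a possible return address, so plain data that happens to match only causes a
postponement. It says nothing about how to reach the stacks of the other CPUs,
which is my question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the region overwritten by the jmp (hypothetical). */
#define PATCH_LEN 5u

/*
 * Conservative scan of one stack, modelled here as an array of words.
 * Returns true if any word could be a return address pointing inside
 * the patched region; the caller would then postpone the replacement
 * and retry at the next check.
 */
static bool stack_may_reference_patch(const uintptr_t *stack, size_t nwords,
                                      uintptr_t probe_addr)
{
    for (size_t i = 0; i < nwords; i++) {
        if (stack[i] > probe_addr && stack[i] < probe_addr + PATCH_LEN)
            return true;
    }
    return false;
}
```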

> Currently we have no plan to restrict djprobe to
> instructions of 5 bytes or more. But if we move to that,
> djprobe will not provide any check on the stack. There is no
> problem when a stack holds the address to be replaced
> if the candidate is more than 4 bytes, because a processor
> can run the jmp instruction instead of the replaced code or the int 3
> instruction.
> 

Instruction cache coherency might be a problem there, even if the instruction to
replace is at least 5 bytes long. You have to make sure the instruction cache of
each CPU is flushed before it goes back to this modified section.
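
For reference, the 5 bytes in question are the x86 near jmp: opcode 0xE9
followed by a 32-bit little-endian displacement relative to the end of the
jmp. A sketch of building them, with a function name and 32-bit addresses
chosen by me for illustration:

```c
#include <stdint.h>

/*
 * Encode "jmp rel32" placed at address `from` and targeting `to`.
 * The displacement is relative to the address of the next instruction,
 * i.e. from + 5. Unsigned wrap-around gives the correct two's-complement
 * value for backward jumps.
 */
static void encode_jmp_rel32(uint8_t out[5], uint32_t from, uint32_t to)
{
    uint32_t rel = to - (from + 5u);

    out[0] = 0xE9;                      /* near jmp opcode */
    out[1] = (uint8_t)(rel & 0xff);     /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
}
```

These are the bytes that every CPU has to see consistently before it executes
the modified section again.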

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
