This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] kprobes for s390 architecture
- From: Heiko Carstens <heiko dot carstens at de dot ibm dot com>
- To: Michael Grundy <grundym at us dot ibm dot com>
- Cc: Jan Glauber <jan dot glauber at de dot ibm dot com>, Martin Schwidefsky <schwidefsky at de dot ibm dot com>, linux-kernel at vger dot kernel dot org, systemtap at sources dot redhat dot com
- Date: Sat, 24 Jun 2006 13:36:41 +0200
- Subject: Re: [PATCH] kprobes for s390 architecture
- References: <20060623150344.GL9446@osiris.boeblingen.de.ibm.com> <OF44DB398C.F7A51098-ON88257196.007CD277-88257196.007DC8F0@us.ibm.com> <20060623222106.GA25410@osiris.ibm.com>
> At least this is something that could work... completely untested and might
> have some problems that I didn't think of ;)
>
> struct capture_data {
> atomic_t cpus;
> atomic_t done;
> };
>
> void capture_wait(void *data)
> {
> struct capture_data *cap = data;
>
> atomic_inc(&cap->cpus);
> while(!atomic_read(&cap->done))
> cpu_relax();
> atomic_dec(&cap->cpus);
> }
>
> void replace_instr(int *a)
> {
> struct capture_data cap;
>
> preempt_disable();
> atomic_set(&cap.cpus, 0);
> atomic_set(&cap.done, 0);
> smp_call_function(capture_wait, (void *)&cap, 0, 0);
> while (atomic_read(&cap.cpus) != num_online_cpus() - 1)
> cpu_relax();
> *a = 0x42;
> atomic_inc(&cap.done);
> while (atomic_read(&cap.cpus))
> cpu_relax();
> preempt_enable();
> }
Forget this crap. It can easily cause deadlocks with more than two cpus.
Just do a compare and swap operation on the instruction you want to replace,
then do an smp_call_function() with the wait parameter set to 1 and passing
a pointer to a function that does nothing but return.
The cs/csg instruction will make sure that your cpu has exclusive access
to the memory region in question and will invalidate the cache lines on all
other cpus.
With the following smp_call_function() you can make sure that all other
cpus discard everything they have prefetched. Hence there is only a small
window between the cs/csg and the return of smp_call_function() where you
do not know if other cpus are executing the old or the new instruction.