This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] Linux Kernel Markers
- From: Richard J Moore <richardj_moore at uk dot ibm dot com>
- To: Alan Cox <alan at lxorguk dot ukuu dot org dot uk>
- Cc: Andrew Morton <akpm at osdl dot org>, Mathieu Desnoyers <compudj at krystal dot dyndns dot org>, "Frank Ch. Eigler" <fche at redhat dot com>, Greg Kroah-Hartman <gregkh at suse dot de>, Christoph Hellwig <hch at infradead dot org>, Jes Sorensen <jes at sgi dot com>, Paul Mundt <lethal at linux-sh dot org>, linux-kernel <linux-kernel at vger dot kernel dot org>, ltt-dev at shafik dot org, Martin Bligh <mbligh at google dot com>, Michel Dagenais <michel dot dagenais at polymtl dot ca>, Ingo Molnar <mingo at elte dot hu>, prasanna at in dot ibm dot com, systemtap at sources dot redhat dot com, Thomas Gleixner <tglx at linutronix dot de>, William Cohen <wcohen at redhat dot com>, Tom Zanussi <zanussi at us dot ibm dot com>
- Date: Wed, 20 Sep 2006 14:45:25 +0100
- Subject: Re: [PATCH] Linux Kernel Markers
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote on 20/09/2006 11:32:07:
> Ar Mer, 2006-09-20 am 09:18 +0100, ysgrifennodd Richard J Moore:
> > > Are you referring to Intel erratum "unsynchronized cross-modifying
code"
> > > - where it refers to the practice of modifying code on one processor
> > > where another has prefetched the unmodified version of the code.
>
> > In the special case of replacing an opcode with int3 that erratum
doesn't
> > apply. I know that's not in the manuals but it has been confirmed by
the
> > Intel microarchitecture group. And it's not reasonable to it to be any
> > other way.
>
> Ok thats cool to know and I wish they'd documented it. Is the same true
> for AMD ?
>
> Alan
>
Not sure probably - I can ask.
Intel explained it to me thus:
When the i-fetch has been done and the micro-ops are in the trace cache
then there's no longer a direct correlation between the original machine
instruction boundaries and the micro ops. This is due to optimization. For
example (artificial one for illustrative purposes):
mov eax,ebx
mov memory,eax
mov eax,1
(using intel notation not ATT - force of habit)
In the trace cache there would be no micro ops to update eax with ebx.
Altering the "mov eax,ebx" to "mov ecx,ebx" on the fly invalidates the
optimized trace cache, hence the onlhy recourse is a GPF.
If the modification doens't invalidate the trace cache then no GPF. The
question is: "can we predict th circumstances when the trace cache has not
been invalidated", and the answer in general is no since the
microarchtecture is not public. But one can guess that modifying the single
byte opcode with in interrupting instruction - int3 - doesn't cause an
inconsistency that can't be handled. And that's what Intel confirmed. Go
ahead and store int3 without the need to synchronise (i.e. force the trace
cache to be flushed).
My guess is that AMD behaves exactly the same way. But I'll check.
Richard