This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.
Re: WIP: Register doco

From: Jim Blandy <jimb at redhat dot com>
To: Andrew Cagney <ac131313 at ges dot redhat dot com>
Cc: gdb at sources dot redhat dot com
Date: 20 Jul 2002 12:55:02 -0500
Subject: Re: WIP: Register doco
References: <3D38AF69.7020902@ges.redhat.com><np8z47ccv4.fsf@zwingli.cygnus.com> <3D39954D.1020306@ges.redhat.com>
Andrew Cagney <ac131313@ges.redhat.com> writes:
> > I'm not saying the questions below couldn't be worked out by examining
> > the whole document carefully.  But ideally, a document makes sense in
> > the first read-through.
> 
> So, you're saying that this document doesn't make sense on a first
> reading?  I'm not surprised! :-)

Oh, of course!  I'm just explaining what I'd like us to shoot for.

> >> @table @emph
> >> @item cooked
> >> @itemize @bullet
> >> @item
> >> manipulated by core @value{GDBN}
> >> @item
> >> correspond to user level, or abi registers
> > Don't you mean "ISA" here, not "ABI"?  If I disassemble some code and
> > see an instruction that refers to r12, then I should be able to see
> > that register's value by saying "print $r12".  The ABI in use has
> > nothing to do with it.
> 
> No, ABI.  For instance mipsIII and o32.  The o32 ABI thinks registers
> have 32 bits yet the real register has 64 bits.  This gives two views
> of the same register.  When o32 debug info indicates a value in two
> adjacent registers, it is referring to 32 bit and not 64 bit registers.
> 
> (Whether user visible registers should be displayed according to the
> underlying ISA or the ABI is an item for debate.  It has never been
> specified, I suspect in part because GDB, prior to
> gdbarch_register_read/write, couldn't handle both.)

I think the cooked registers should be ISA.  When the user is stepping
by instruction, disassembling code, she needs to be able to see the
values those instructions are really operating on.  If the compiler,
in accordance with a particular ABI, happens to be using those
registers in a limited way (say, that ignores their upper 32 bits),
that's not GDB's business.

For example, suppose the compiler has a bug, and although it intends
to be using the o32 ABI, it accidentally uses a 64-bit instruction.
(I've seen this happen, I think.)  GDB should certainly show the
reality, and not present a "view" that obscures the compiler's
mistake.

Usually the only reason I drop down to assembly language is because
I'm confused by what the optimizer has done, and I want to know what's
*really* happening.  Or I don't trust the compiler at all.  If GDB
decides to show me an "ABI view" of the processor, and not show me
what's really happening, then that's a major lose.

> >> @end itemize
> >> @item raw
> >> @itemize @bullet
> >> @item
> >> manipulated by target backends
> >> @item
> >> correspond to physical registers
> > I think you're introducing a new term here, "physical", which doesn't
> > do anything for you.  When I see "physical", I think of actual
> > flip-flops.  But that's clearly not what you're talking about: GDB has
> > no idea how many ports these registers have, whether they get renamed
> > for speculative execution, etc.
> 
> That was the intent.  The objective is to focus the reader on the
> target architecture's hardware and identify the registers that
> correspond to real hardware.  Often in GDB, people haven't focused on
> the hardware and its registers and instead stored cooked registers in
> the raw register cache.  See older SH and bank registers or d10v and
> its two stack pointers.
> 
> However, often, what GDB gets access to is actually the ``spill
> registers'' - the hardware registers saved to memory.  I guess I
> should refine this.
> 
> (Mind you, with a jtag target, it really is NAND and NOR gates :-)

(Well, but GDB doesn't clock the bits through the snaky JTAG digestive
tract itself.  It talks to some library which "cleans up" the view
from the JTAG.  Talking about `real hardware' or `flip-flops' is only
helpful if one holds naive Z80-era beliefs about hardware.  If one
knows about the amazing hair associated with registers on modern
processors, then there are all sorts of confusing questions that
brings up --- e.g., "How in the world would GDB get hold of the state
of the raw flip-flops on a native Linux system?")

Okay, I understand a little better.  What makes sense to me at the
moment is for the "cooked" registers to be what the instructions refer
to, and the "raw" registers to be the bits --- also ISA-level
constructs --- that make up those values.

The IA-32's MMX and FP registers are a great example of this.  The
MMX registers, MM0--MM7, and the FP registers,
ST(0)--ST(7), actually refer to the same set of eight eighty-bit
registers, R0--R7.  A reference to the floating-point register ST(i)
becomes a reference to R((TOP + i) % 8), where TOP is a three-bit
field in the FPU status register.  But a reference to the MMX register
MM(i) becomes a reference to the lower 64 bits of R(i) (which would be
the mantissa of some ST(i)).

So here I'd say that MM0--MM7 and ST(0)--ST(7) registers are the
"cooked" registers --- the ones instructions refer to --- and R0--R7
are the "raw" registers: the bits that they refer to.  
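
To make that mapping concrete, here is a rough sketch in C.  It is not
GDB code: the struct, the function names, and the choice to store each
80-bit register as ten little-endian bytes are all assumptions made up
for illustration.  It only models the two rules described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GDB code: the eight 80-bit raw registers
   R0..R7, each stored as 10 little-endian bytes (an assumption),
   plus the three-bit TOP field from the FPU status register.  */
struct x87_raw
{
  uint8_t r[8][10];
  unsigned top;                 /* 0..7 */
};

/* Index of the raw register that the cooked register ST(i) names.  */
static unsigned
st_to_raw (const struct x87_raw *raw, unsigned i)
{
  assert (i < 8 && raw->top < 8);
  return (raw->top + i) % 8;
}

/* Value of the cooked register MM(i): the lower 64 bits of R(i),
   independent of TOP.  */
static uint64_t
read_mm (const struct x87_raw *raw, unsigned i)
{
  uint64_t v = 0;
  int b;

  assert (i < 8);
  for (b = 7; b >= 0; b--)
    v = (v << 8) | raw->r[i][b];
  return v;
}
```

With TOP set to 6, st_to_raw says ST(3) names R1, while MM(2) always
reads the low eight bytes of R2 whatever TOP happens to be.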

But this whole circus is ISA-level stuff --- you have to know it to
generate correct code for the architecture.

On the SPARC, the analogous situation would be for g0--g7, i0--i7,
l0--l7, and o0--o7 to be the cooked registers (of which the i, l, and
o registers are affected by changes of the window), and the raw
registers to be the underlying set
of NWINDOWS * 16 registers.  The SPARC spec avoids giving those
underlying registers any name: it just calls them "register windows".
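
A sketch of that cooked-to-raw mapping, again in C and again not GDB
code.  The function name, the value chosen for NWINDOWS, and the layout
of the windowed file (outs, then locals, then ins, 16 slots per window)
are assumptions picked for illustration; the only property that matters
is that one window's ins share storage with the next window's outs.

```c
#include <assert.h>

/* Assumption: NWINDOWS is implementation-dependent; 8 is only an
   example value.  */
#define NWINDOWS 8

/* Index into a hypothetical windowed raw file of NWINDOWS * 16
   registers for cooked register N in window CWP.  N is the
   instruction-level number: 8..15 are o0-o7, 16..23 are l0-l7,
   24..31 are i0-i7.  The globals (0..7) are not windowed and are
   not handled here.  */
static unsigned
sparc_window_index (unsigned cwp, unsigned n)
{
  assert (cwp < NWINDOWS && n >= 8 && n < 32);
  return (cwp * 16 + (n - 8)) % (NWINDOWS * 16);
}
```

Under this layout i0 in window 3 and o0 in window 4 come out as the
same raw slot, and the file wraps around after the last window.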

The same thing follows here: the fact that the registers are windowed
is definitely part of the ISA.  We never leave ISA territory for
flip-flop territory.  Only JTAG drivers do that.


> > In effect, all that phrase does is define "raw" in terms of another
> > undefined term, "physical".
> > By "raw", do you really mean the registers as presented by the
> > underlying protocol GDB uses to examine the inferior (be it remote,
> > /proc, or ptrace)?  I dunno.
> 
> I'll likely change it to ``hardware'', I think I've been using that
> term more consistently elsewhere.

I think "hardware" is misleading for the same reasons "physical" is.
You're still just hinting, and not saying what you mean.

> >> @item
> >> For a 64 bit architecture that is running in 32 bit mode, the register
> >> cache and raw register space would contain the 64 bit hardware
> >> registers.  The raw register space would not include cut down 32 bit
> >> registers.
> > By "32 bit mode", do you mean that there's an actual bit on the
> > processor that makes shift, divide, etc. instructions behave as if
> > there were only 32 bits?  Or do you mean an ABI that simply only uses
> > the lower 32 bits of the registers?
> 
> Who knows --- target architecture dependent detail.  Some targets have
> a true 32 bit mode, some just run 32 bit ABI's on a 64 bit
> architecture. The MIPS floating point registers and their various
> modes show how tangled the web can get.

I see the former as an ISA thing --- the instructions execute
differently depending on the state of that bit --- and the latter as
an ABI thing --- how the compiler has chosen to use the resources the
ISA gives you.  The ISA level is visible to GDB's user, via
disassembly and register references; those need (and have always
needed) to show the user everything they need in order to know how the
instructions will behave when executed.

Consider, for example, some code compiled to the o32 ABI, but running
on a 64-bit processor.  There's no reason the user couldn't write some
inline assembly language that uses some 64-bit instructions for a bit,
as long as they get everything back into the right places by the time
they get back into C code.  That's not ABI-conformant, but who cares?
An embedded user knows what they're doing.  They might just be using
o32 for compatibility with legacy code or something.  In any case, GDB
should happily show the full 64-bit register values here.
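
The two views of one register can be sketched like this.  None of it
is GDB code; the function names are made up, and which register of an
adjacent pair holds the high half is left to the caller because it
depends on the target.

```c
#include <assert.h>
#include <stdint.h>

/* ISA view: what the instructions operate on, all 64 bits.  */
static uint64_t
isa_view (uint64_t raw)
{
  return raw;
}

/* o32-style ABI view: only the low 32 bits of the same register.  */
static uint32_t
abi32_view (uint64_t raw)
{
  return (uint32_t) raw;
}

/* A 64-bit value that 32-bit debug info places in two adjacent
   registers: each register contributes its low 32 bits.  */
static uint64_t
abi32_pair (uint64_t raw_hi, uint64_t raw_lo)
{
  return ((uint64_t) abi32_view (raw_hi) << 32) | abi32_view (raw_lo);
}
```

If a stray 64-bit instruction leaves 0xdeadbeef00000001 in a register,
the ABI view reports 1 and hides the upper half; the ISA view is the
one that shows what happened.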


> >> @item
> >> For an architecture that has memory mapped registers, those registers
> >> are not part of the register cache or raw register space (there is no
> >> corresponding hardware register).
> > I think it would be nice to use the IA-64's arrangement as an example
> > here.  I assume the raw registers would include the unrotated register
> > file, and the registers that determine how they're rotated at the
> > current point.  Whereas the cooked registers would be numbered the way
> > they appear in the machine instructions.
> 
> As a description of memory registers or bank registers?  Your
> description of the IA-64 sounds more like bank selectable registers
> and similar --- the FP register stack of the i386 is an example that
> more people might be familiar with.  (Remember we're talking theory
> here, neither the ia64 nor the i386 use this mechanism.)

Yeah, the FP register stack would be a much better example.
Especially if you can include an explanation of the interactions
between the MMX and FP registers, as I did above.

> Some architectures can refer to memory addresses using a register like
> notation.  In fact I know of one architecture where every memory
> location can also be referred to using a register notation.  Such an
> architecture would have zero raw registers.

Makes sense.

> >> @itemize @bullet
> >> @item
> >> registers referred to by debug information
> >> @item
> >> user visible registers (specified by name)
> >> @item
> >> mode dependent registers (e.g., a 64 bit architecture in 32 bit mode may
> >> need to manipulate the 32 bits of 64 bit registers)
> >> @item
> >> memory mapped registers
> >> space.
> >> @item
> >> state dependent registers (e.g., bank registers)
> >> @end itemize
> >> Architecture methods then map the @code{NUM_REGS + NUM_PSEUDO_REGS}
> >> cooked registers onto raw registers or memory.
> > So pseudo registers are different from cooked registers?  Or are they
> > the same?  If they're the same, then distinguishing NUM_REGS and
> > NUM_PSEUDO_REGS sort of implies that NUM_REGS corresponds to the
> > number of raw registers.  But that shouldn't be visible to the outside
> > at all.
> 
> Pseudo registers?  Beyond the constant NUM_PSEUDO_REGS, this section
> makes no reference to pseudo-registers.
> 
> I guess I should add an historical note to the opening section
> pointing out that the constant NUM_PSEUDO_REGS originated from an
> earlier mechanism called ``pseudo registers''.  While, for the moment,
> the constant remains, new targets use gdbarch register read/write and
> not pseudo registers.

Yeah, that would be helpful.  Just something that says, "for
historical reasons, the number of cooked registers is `NUM_REGS +
NUM_PSEUDO_REGS'; this section doesn't make any distinction between
pseudo-registers and other registers."



> 
> >
> >> cooked:   [0..NUM_REGS)
> >>                 |      \                   |       \
> >> |        \                 |         \
> >> raw:      RAW REGISTERS  MEMORY
> > In other words, given that we're allowing cooked registers to be
> > computed arbitrarily from raw registers and memory contents, why not
> > dispense with pseudo registers altogether?
> 
> The need for pseudo registers was dispensed with a year ago.  Someone
> just needs to clean up the old targets and clean up some edge cases
> .... As for NUM_REGS, that defines the number of registers in the raw
> register cache (note below).
> 
> I'll add a comment mentioning the rationale behind direct mapping the
> first [0..NUM_REGS) registers.  It was firstly a case of K.I.S.S. and
> secondly due to suspected limitations in the current GDB code - the
> constant NUM_REGS unfortunately still determines far more than the
> number of raw registers.

That would be great.
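
For the record, here is how I read the direct mapping, as a toy sketch
in C.  This is not the gdbarch interface: the struct, the function
name, the toy values of NUM_REGS and NUM_PSEUDO_REGS, and the rule used
for the upper range (low 32 bits of a raw register) are all assumptions
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy values; in GDB these come from the architecture.  */
enum { NUM_REGS = 4, NUM_PSEUDO_REGS = 2 };

/* Hypothetical raw register cache holding NUM_REGS raw registers.  */
struct toy_regcache
{
  uint64_t raw[NUM_REGS];
};

/* Read cooked register REGNUM.  Cooked registers [0..NUM_REGS) map
   one-to-one onto the raw registers.  The remaining NUM_PSEUDO_REGS
   cooked registers are computed by an architecture-specific rule;
   the rule here (low 32 bits of raw register REGNUM - NUM_REGS) is
   made up.  */
static uint64_t
toy_cooked_read (const struct toy_regcache *rc, int regnum)
{
  assert (regnum >= 0 && regnum < NUM_REGS + NUM_PSEUDO_REGS);
  if (regnum < NUM_REGS)
    return rc->raw[regnum];
  return (uint32_t) rc->raw[regnum - NUM_REGS];
}
```

Callers only ever see cooked numbers in [0..NUM_REGS + NUM_PSEUDO_REGS);
whether a given number is backed directly by the cache or computed is
hidden inside the read function.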
Follow-Ups:
- Re: WIP: Register doco
  - From: Andrew Cagney
References:
- WIP: Register doco
  - From: Andrew Cagney
- Re: WIP: Register doco
  - From: Jim Blandy
- Re: WIP: Register doco
  - From: Andrew Cagney