This is the mail archive of the cgen@sources.redhat.com mailing list for the CGEN project.



generalizing the delay rtx function


Hi -

As you may be aware, the delay rtx function is used, despite its
work-in-progress designation, to model delayed branches in
constructs like
	(delay 1
	   (set pc (add pc 42)))
The DELAY-SLOT insn attribute is inferred from this for use by
simulator mainlines.  That's the extent of the effect of the delay
rtx.

In order to model architectures with exposed pipelines (i.e., no
or limited pipeline interlocks), and related effects like delayed
loads, I'd like to take it beyond this, by coupling it to the
parallel-write mechanism.

Ports that have VLIW features tend to use the "parallel-write"
mechanism in their semantic blocks to queue updates to registers
(and possibly memory) until after all concurrently executed
instructions have been processed.  This lets multiple reader
instructions execute together with a writer instruction, without
having to worry about the evaluation sequence.

Anyway, how about a scheme such as this:

- Provide a clear definition for the DELAY rtx:
  The numeric argument is the number of instruction cycles
  after the current one, at which the enclosed set expressions
  take effect.
- Restrict the DELAY rtx so that it contains only SET expressions
  targeting hardware registers or memory.  Forbid other calculations.
- Possibly, force use of (DELAY 0 ....) to express VLIW concurrency,
  at least in new ports.
- Infer "parallel-write?" (or a new equivalent) from the presence of
  DELAY rtxs.
- Eliminate special treatment of PC by fitting delayed branches into
  this model.
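
To illustrate the last point, here is a minimal sketch (in Python, not
generated cgen code) of a delayed branch handled as an ordinary
delayed write to pc.  The function name, the tuple encoding of insns,
and the one-word-per-insn addressing are all assumptions made for the
illustration.  It also assumes the rvalue is evaluated when the insn
executes and only the write is deferred.

```python
def run(program, cycles):
    """Return the pc values executed over `cycles` cycles.

    `program` maps pc -> insn tuple (hypothetical encoding); a missing
    entry is a nop.  ("delayed-branch", N, OFF) stands in for
    (delay N (set pc (add pc OFF))).
    """
    pc = 0
    queue = []   # pending (cycles_left, target) writes to pc
    trace = []
    for _ in range(cycles):
        trace.append(pc)
        op = program.get(pc, ("nop",))
        if op[0] == "delayed-branch":
            _, delay, offset = op
            # The rvalue is computed now; only the write is deferred.
            queue.append((delay, pc + offset))
        pc += 1  # default sequential advance, an ordinary immediate set
        # End of cycle: commit the writes that are due, age the rest.
        still_pending = []
        for left, target in queue:
            if left == 0:
                pc = target
            else:
                still_pending.append((left - 1, target))
        queue = still_pending
    return trace


# Branch at pc 1 with one delay slot: the insn at pc 2 still executes.
print(run({1: ("delayed-branch", 1, 42)}, 4))   # [0, 1, 2, 43]
```

With a delay of 0 the same code behaves as an ordinary branch with no
delay slot, so pc needs no special case in this reading.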

Then, the generated simulator code would be changed, so that:

- Semantic functions, instead of taking a single parexec structure
  pointer (for write queueing), take an array of them.  Within
  (DELAY <N> RTX*) blocks, define OPRND to point to the appropriate
  elements in the parexec array.
- The insn evaluation loop would keep an array of parexec structs
  as a rotating buffer, always running the writeback code on the first
  one, then rotating the set, then passing it to the next insn.  cgen
  could compute the maximum index needed.

This way, code like
	(set reg1 1)
	(delay 0 (set reg1 3))
	(delay 1 (set reg2 5))
	(delay 2 (set reg1 6))
would be well-defined and useful.
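
The rotating-buffer scheme can be sketched as follows.  This is a
Python model of my reading of the proposal, not the C code cgen would
generate: the class and method names are hypothetical, each list in
`slots` stands in for one parexec struct, and a plain dict stands in
for the register file.

```python
from collections import deque


class DelayedWriteSim:
    """Hypothetical model: one write queue per delay distance."""

    def __init__(self, max_delay):
        # slots[0] holds the writes committed at the end of this cycle.
        self.slots = deque([] for _ in range(max_delay + 1))
        self.regs = {}

    def set_now(self, reg, value):
        # Plain (set reg value): visible immediately.
        self.regs[reg] = value

    def delay(self, n, reg, value):
        # (delay n (set reg value)): queue the write n cycles ahead.
        self.slots[n].append((reg, value))

    def step(self, insn):
        insn(self)
        # Write back the first queue, then rotate the buffer.
        for reg, value in self.slots[0]:
            self.regs[reg] = value
        self.slots.popleft()
        self.slots.append([])


def example(sim):
    sim.set_now("reg1", 1)     # seen only within this insn
    sim.delay(0, "reg1", 3)    # lands at the end of this cycle
    sim.delay(1, "reg2", 5)    # lands at the end of the next cycle
    sim.delay(2, "reg1", 6)    # lands one cycle after that


def nop(sim):
    pass


def trace_example():
    sim = DelayedWriteSim(max_delay=2)
    snapshots = []
    for insn in (example, nop, nop):
        sim.step(insn)
        snapshots.append(dict(sim.regs))
    return snapshots
```

Under these assumptions, trace_example() gives reg1 = 3 after the
first cycle, adds reg2 = 5 after the second, and changes reg1 to 6
after the third.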

An alternate cgen syntax possibility is to introduce a
	(delayed-set N lvalue rvalue)
rtx.

Any advice?


- FChE
