This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.


Thumb32 assembler (0/69)


I have just checked into the binutils-csl-arm-2005q1-branch an initial
implementation of assembler and disassembler support for the 32-bit
Thumb instruction set (variously known as "Thumb-2", "Thumb32", and
"32-bit Thumb instructions").  This gives Thumb mode almost all the
capabilities of ARM mode (and, in fairness, some capabilities not
available to ARM mode).  Along with this comes a new syntax for Thumb
mode, in which all the instructions do what they would have done if
you were assembling to ARM-format instructions -- for instance, a
new-mode Thumb "add" no longer sets the condition codes, and "adds" is
available for when you do want them set.

I believe that the assembler accepts correct input syntax, with one
exception (see below).  However, the "canonical form" and "ability to
produce any valid instruction via annotations and/or choice of two- or
three-operand format" properties, discussed in ARM's specification of
the new Thumb syntax, may not be entirely achieved.  The disassembler
most definitely does *not* produce canonical form; our disassembler
has never had production of reassemblable code as a design goal, so I
decided that unambiguous output was more important.  Also, ARM's
specification proposes that the .thumb directive should enable the new
syntax, whereas the .code16 directive should enable the old
16-bit-only Thumb syntax; however, in the interest of backward
compatibility, I left .thumb alone and defined a new .thumb32
directive to enable the new mode.

Not all of the operand restrictions documented in the ARM supplement
are diagnosed.  In particular, the assembler does not check use of r13
at all, and may not check use of r15 everywhere it should.  I think,
however, that it does diagnose all the overlap cases.  I have not
written tests for any of the diagnostics that can be issued in Thumb32
mode.

The IT instruction is recognized and assembled, but the assembler does
not attempt to validate subsequent instructions against it, nor
automatically generate it from conditional suffixes on instructions.
(Currently, in Thumb32 mode, only the branches allow conditional
suffixes.)  This is the major divergence from the specified syntax.
The assembler is also unaware that 16-bit Thumb instructions do not
set the flags inside an IT block.
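
For readers unfamiliar with IT, here is a minimal sketch of the check the assembler does not yet perform: working out which condition each instruction in an IT block must carry. It assumes the architectural convention that a "T" slot takes the IT condition and an "E" slot takes its inverse, obtained by flipping bit 0 of the 4-bit condition code. The helper name it_expected_conds and the COND_* enum are hypothetical and are not taken from tc-arm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A few of the 4-bit ARM condition codes.  Flipping bit 0 gives the
   inverse condition (EQ <-> NE, GE <-> LT, ...).  */
enum { COND_EQ = 0, COND_NE = 1, COND_GE = 10, COND_LT = 11, COND_AL = 14 };

/* Hypothetical helper, not part of gas.  PATTERN holds the letters that
   follow "IT" in the mnemonic (e.g. "TE" for ITTE).  Fills OUT with the
   condition each instruction in the block is expected to carry; the
   first instruction always uses FIRSTCOND.  Returns the block length
   (1 to 4), or 0 if PATTERN is malformed.  */
static size_t
it_expected_conds (unsigned firstcond, const char *pattern, unsigned out[4])
{
  size_t n = strlen (pattern);
  size_t i;

  assert (firstcond <= COND_AL);
  if (n > 3)
    return 0;

  out[0] = firstcond;
  for (i = 0; i < n; i++)
    {
      if (pattern[i] == 'T')
        out[i + 1] = firstcond;        /* "then": same condition */
      else if (pattern[i] == 'E')
        out[i + 1] = firstcond ^ 1u;   /* "else": inverse condition */
      else
        return 0;
    }
  return n + 1;
}
```

A validating assembler would compare each following instruction's conditional suffix against the corresponding entry of OUT and complain on a mismatch.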

Section relaxation is not yet implemented.  This means not only that
all branches are long (unless you use an .n suffix), but that all
instructions with an immediate operand are long (even if you try to
use an .n suffix).  A precise implementation of ARM's specified
semantics for the .n and .w suffixes is going to be difficult, since
that information is currently lost by the time we get to the section
relax hook.
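
To make the relaxation problem concrete, here is a rough sketch of the decision a relax pass would have to make for an unconditional branch. It assumes the .n/.w suffix has somehow been carried along to that point (which, as noted, it currently is not), and it assumes the 16-bit B reaches -2048 to +2046 bytes. The names choose_branch_encoding, width_suffix, and encoding are hypothetical; this is not how tc-arm.c is organized.

```c
#include <assert.h>

enum width_suffix { SUFFIX_NONE, SUFFIX_N, SUFFIX_W };
enum encoding { ENC_ERROR, ENC_NARROW, ENC_WIDE };

/* Assumed reach of the 16-bit unconditional B: an 11-bit signed count
   of halfwords, i.e. -2048 .. +2046 bytes.  */
#define NARROW_B_MIN (-2048L)
#define NARROW_B_MAX (2046L)

/* Hypothetical helper.  DISP is the byte displacement to the target.
   The reach of the wide encoding is not checked here; a real
   implementation would have to diagnose that as well.  */
static enum encoding
choose_branch_encoding (long disp, enum width_suffix suffix)
{
  int fits_narrow;

  assert ((disp & 1) == 0);   /* Thumb branch targets are halfword aligned */
  fits_narrow = disp >= NARROW_B_MIN && disp <= NARROW_B_MAX;

  switch (suffix)
    {
    case SUFFIX_W:            /* programmer insisted on 32 bits */
      return ENC_WIDE;
    case SUFFIX_N:            /* programmer insisted on 16 bits */
      return fits_narrow ? ENC_NARROW : ENC_ERROR;
    default:                  /* no suffix: smallest encoding that fits */
      return fits_narrow ? ENC_NARROW : ENC_WIDE;
    }
}
```

The point of the sketch is only that the suffix is an input to the decision, so it has to survive until displacements are known.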

In ARM mode, many instructions allow any immediate expression that can
be resolved to a constant at md_apply_fix3 time.  In Thumb32 mode, most
of those instructions will only accept a constant that is resolvable
at md_assemble time.  This is merely because appropriate
assembler-internal relocation numbers have not been defined yet.  Most
of these are probably never used with anything other than an integer
literal, but there is one important exception: movw and movt, which
finally give the ARM family the standard RISCy way of constructing
32-bit constants.  AAELF defines a bunch of relocations that allow
them to be used with symbols as well as constants.  Right now, though,
we can't take advantage of that.  The usual hi(), lo() notation needs
implementing, as do all those relocs.  Also, it would be good for 
'mov rd, #constant' and 'ldr rd, =constant' to know how to generate
these instructions.  I didn't do that primarily because we don't yet
have a way of knowing that it's safe for the assembler to generate V6+
instructions behind the programmer's back.
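
As an illustration of what the movw/movt pair does, here is a small model of splitting a 32-bit constant into the two 16-bit halves and rebuilding it. It assumes the architectural behaviour that movw writes a zero-extended 16-bit immediate and movt replaces only the top halfword. The function names lo16, hi16, do_movw, do_movt, and build_constant are made up for this sketch and do not correspond to anything in gas.

```c
#include <assert.h>
#include <stdint.h>

/* The two halves that a lo()/hi()-style notation would select.  */
static uint16_t lo16 (uint32_t v) { return (uint16_t) (v & 0xffffu); }
static uint16_t hi16 (uint32_t v) { return (uint16_t) (v >> 16); }

/* Model of the register effect of "movw rd, #imm": the immediate is
   zero-extended, so the top halfword is cleared.  */
static uint32_t
do_movw (uint16_t imm)
{
  return imm;
}

/* Model of "movt rd, #imm": only the top halfword is replaced.  */
static uint32_t
do_movt (uint32_t rd, uint16_t imm)
{
  return (rd & 0xffffu) | ((uint32_t) imm << 16);
}

/* Hypothetical expansion of "load VALUE into rd" as a movw/movt pair.  */
static uint32_t
build_constant (uint32_t value)
{
  uint32_t rd = do_movw (lo16 (value));
  rd = do_movt (rd, hi16 (value));
  assert (rd == value);
  return rd;
}
```

Because movt inserts its immediate rather than adding it, the high half needs no carry adjustment, unlike the add-based high/low pairs found on some other RISC architectures.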

The BL instruction has two more bits of displacement in Thumb32 mode
than it does in old Thumb mode, but we currently cannot take advantage
of this, because AAELF does not specify a new relocation for it.
(Perhaps there is some incredibly clever way the existing
R_ARM_THM_CALL relocation can be made to work, but I don't see it.)
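
To put numbers on "two more bits", here is a minimal range check, assuming the old Thumb BL carries a 22-bit signed halfword count and the Thumb32 BL a 24-bit one. The helper name bl_disp_fits is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper.  Returns nonzero if byte displacement DISP can be
   encoded by a branch whose immediate is a BITS-bit signed count of
   halfwords.  */
static int
bl_disp_fits (long disp, unsigned bits)
{
  long limit;

  assert (bits >= 2 && bits <= 30);
  if (disp & 1)
    return 0;                 /* targets must be halfword aligned */

  /* 2^(bits-1) halfwords of reach, times 2 bytes per halfword.  */
  limit = 1L << bits;
  return disp >= -limit && disp <= limit - 2;
}
```

Under those assumptions the reach grows from roughly +/-4MB (bits = 22) to roughly +/-16MB (bits = 24).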

None of the specific coprocessor instructions (which group includes
all the floating-point instructions) are recognized in Thumb32 mode,
by either the assembler or the disassembler.  There is a simple
transformation from the ARM encoding to the T32 encoding, so it will
not be hard to make them available.  This should definitely be done
for VFP, and perhaps should be done for some of the vendor
coprocessors; I don't think it's worth bothering to do FPA.

I imported the complete set of official relocation names from AAELF to
include/elf/arm.h, and renamed them in elf32-arm.c when it was
apparent that they were the same thing.  However, several relocations
appear *not* to be the same thing in the GNU ad-hoc ABI as in AAELF.
I left this strictly alone.  (Do not confuse this with the ARM_OLD_ABI
distinction.  ARM_OLD_ABI is entirely dead code at this point.  I have
not pruned it, but only because this patch series is already long
enough.)

The changeset implementing this is enormous; tc-arm.c bears little
resemblance to its former shape.  I have divided the changes into
sixty-nine incremental patches (actually, it was developed that way)
and will send each and every one of them in a separate message, with
explanation, to ease review.  However, I would like to commit the
entire thing at once, and address any problems which are found via
follow-up patches.  There are a number of false leads and bugs lurking
in the intermediate stages; it's the final state that I'm confident
in.  Attached to this message, primarily for amusement value, is a
graph of the overall length of tc-arm.c at each stage of development.

zw

PNG image

