This is the mail archive of the binutils@sourceware.cygnus.com mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Link-time call sizing and section stretching


Hi binutils hackers,

I have this one problem that our company very much needs solved and that I
think should be doable, but binutils aren't my forte...

I want to do link-time call sizing. I'm programming for an embedded platform
based on Motorola 68000 (plain 68000, not 68020) using binutils-2.9.1 for the
m68k-*-coff target. I'm using COFF only as an intermediate format and the
"executable" produced by ld isn't actually executed on anything, but is fed
into a program that rips the raw section images out of it and blasts them into
ROM. I know that folks tend to prefer ELF to COFF these days. However, ELF
stands for Executable and Linkable Format, and almost all of its fancy features
not present in COFF have to do with the executable part and UNIX ABIs and
shared libraries. These features are of no use for an intermediate format and
would be just dead weight, which is why I think COFF is a better choice.
Anyway, given binutils' BFD library, this shouldn't really matter.

Under this platform programs have to be split into sections not larger than 64
KB, and each section is ROMed independently of others at an unpredictable
address, which means that function calls and code references have to be done in
a very special way. Since all code is in ROM, we can't patch calls at run time
with absolute addresses. This means that all references within a section must
be PC-relative. Since this is a 68000 and not a 68020, we only have signed 16-
bit displacements here. I have devised wacky instruction sequences that
effectively act as bsrl and lea and pea with 32-bit PC-relative displacements,
which you only have on 68020. However, there is a high overhead associated with
them, which makes it very important to use 16-bit references wherever possible
and use 32-bit ones only where necessary, i.e., if a reference is from one end
to the other end of an almost full 64 KB section and exceeds the -32 KB to +32
KB range of 16-bit references. Even more importantly, this is only for
intrasection references. Since not only the absolute address of each section,
but even the relative location of sections is unknown, intersection references
involve special instruction sequences that extract the base address of the
target section from a global variable initialized at run time.

There are several solutions already available to the problem of intersection
calls. Some involve teaching gcc what function is in what section and how to
generate intersection call instruction sequences, others involve patching
assembly code as it comes out of gcc and is fed into as, and still others
involve adding stubs to each section that does the intersection jump and have
everyone just call stubs with normal intrasection calls. However, they all have
major drawbacks. Also there is no solution to the problem of sizing
intrasection calls so that only the ones that need to be 32-bit are 32-bit and
the others are 16-bit. I've written a script that patches assembly code coming
out of gcc before assembly to make all calls 32-bit, but that's really
expensive. And overall the lack of a true synergistic solution to all these
problems causes other inconveniences.

The only real synergistic solution is link-time call sizing. Instead of having
gcc generate the actual instruction for calls and other references and having
as assemble them, have gcc generate magic tokens saying "I want to call
function foo", have as encode them in the object file somehow, and have the
linker turn them into actual instruction sequences.

The problem with this is that it requires the linker to stretch the sections it
gets from object files, rather than just move them around, which is what all
the traditional relocations allow. This requires some hacking. It is my
understanding that none of the traditional UNIX object file formats, neither
a.out, nor COFF, nor ELF, are designed for this. In all of them the assembler
is free to assume that the section it outputs can only be moved around, but
cannot be stretched or compressed. If a single section in a single object
module contains a label and a relative reference to it, the assembler will
finalize the reference itself and won't emit a relocation, as it assumes that
it has sufficient information for this.

What I'm thinking about is creatively breaking these rules to achieve the
desired effect. First hack as (and I guess ld -r) to emit relocations for all
references, even relative ones that seem like they can be finalized. Second,
teach as some special new directives to emit special tokens in the object
indicating where unknown-size relative references would have to be made and to
what target. Here I'm thinking about creating a special section that won't
appear in the final executable and that would contain this information in
object files. Wherever a call instruction is being placed now, place a local
label there instead. Then in the special section put records each containing a
reference to this label, so that the linker knows where to patch, a reference
to the target, so that it knows what are we referencing, and a code indicating
what instruction sequence to generate (a call, a lea-type reference, or a pea-
type reference). Finally, add a new step to the linker to process this special
section and to inject call and reference instruction sequences into other
sections, stretching them as necessary. Stretch a section by correcting all
symbol table and reference table entries appropriately. This step must be added
to the linker after the link script has been processed and all sections have
been combined as directed there, so that everything is finalized as to what is
in what section, but before the relocations are stripped, so that there will
still be one relocation step to fix the relocations blown off by section
stretching, as well as the all-new relocations the generated instruction
sequences will have in them.

As far as I could tell, what I'm describing is doable. However, my binutils
hacking skills are probably a little insufficient for a hack like this.
Therefore, I would very much appreciate any suggestions and help. The
impression I've got from looking at the binutils code is that this is something
that a hacker more GNU-literate than I am would be able to do in a timeframe
from a few hours to a few days. Since this is not a hobby of mine but something
that our company needs for our projects, and we need it urgently, I'm sure we
would be able to pay someone for a couple of days of work for this (I'm Cc'ing
this posting to our management for that).

TIA 2^32 for any help!

--
Michael Sokolov				2695 VILLA CREEK DR STE 240
Software Engineer			DALLAS TX 75234-7329 USA
JP Systems, Inc.			Phone: +1-972-484-5432 x247
					    or +1-888-665-2460 x247
E-mail: msokolov@meson.jpsystems.com	Fax:   +1-972-484-4154

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]