This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
RE: MIPS JAL/JALR to BAL transformation for Linux (o32 ABI)
- From: "Fu, Chao-Ying" <fu at mips dot com>
- To: "Adam Nemet" <anemet at caviumnetworks dot com>
- Cc: "Richard Sandiford" <rdsandiford at googlemail dot com>, <binutils at sourceware dot org>, "Lau, David" <davidlau at mips dot com>, "Garbacea, Ilie" <ilie at mips dot com>
- Date: Mon, 3 Aug 2009 10:29:37 -0700
- Subject: RE: MIPS JAL/JALR to BAL transformation for Linux (o32 ABI)
Adam Nemet wrote:
> > > > +/* True if ABFD is for CPUs that are faster if jal/jalr is converted to bal.
> > > > + This should be safe for all architectures, but for now we enable it
> > > > + for RM9000, mips32, mips32r2, mips64, and mips64r2. */
> > > > +#define JAL_JALR_TO_BAL_P(abfd) \
> > > > + ( ((elf_elfheader (abfd)->e_flags & EF_MIPS_MACH) == E_MIPS_MACH_9000) \
> > > > + || ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32) \
> > > > + || ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32R2) \
> > > > + || ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_64) \
> > > > + || ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_64R2))
> > >
> > > I think this should be a negative predicate. As you say JALR->BAL
> > > should be a profitable transformation on most CPUs.
> >
> > Yes. If everyone is ok, we can just set JAL_JALR_TO_BAL_P(abfd) to 1.
> > (And, fix new test failures due to BAL mismatching.)
>
> Just to be sure, what I said applies to JALR->BAL for Octeon. JAL->BAL is not
> necessarily profitable on Octeon but I thought the relaxation code was
> performing JALR->BAL or JALR->JAL and not JAL->BAL? Am I missing something
> here?

The transformation checks for two instructions, JAL and JALR, and converts
either one to BAL.
Maybe we can split the predicate into two: JAL_TO_BAL_P and JALR_TO_BAL_P.
Then you can disable JAL_TO_BAL_P for Octeon.
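
A minimal sketch of how the split might look, written as plain functions
over the ELF header flags rather than as the real macros. Every name
prefixed with sketch_ or SKETCH_ is hypothetical, and the two constants are
assumed to mirror EF_MIPS_MACH and E_MIPS_MACH_OCTEON from
include/elf/mips.h; please check them against the tree before relying on
them.

```c
#include <stdint.h>

/* Assumed to mirror EF_MIPS_MACH and E_MIPS_MACH_OCTEON in
   include/elf/mips.h; verify against your tree.  */
#define SKETCH_EF_MIPS_MACH       0x00ff0000u
#define SKETCH_E_MIPS_MACH_OCTEON 0x008b0000u

/* Hypothetical helper: nonzero if E_FLAGS selects the Octeon machine.  */
static int
sketch_is_octeon (uint32_t e_flags)
{
  return (e_flags & SKETCH_EF_MIPS_MACH) == SKETCH_E_MIPS_MACH_OCTEON;
}

/* JALR->BAL: enabled for every target in this sketch.  */
static int
sketch_jalr_to_bal_p (uint32_t e_flags)
{
  (void) e_flags;
  return 1;
}

/* JAL->BAL: a negative predicate, disabled only for Octeon.  */
static int
sketch_jal_to_bal_p (uint32_t e_flags)
{
  return !sketch_is_octeon (e_flags);
}
```

Writing the JAL case as a negative predicate keeps the default "on", so a
new CPU gets the transformation unless it is explicitly excluded.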
For example:
  /* On the RM9000, bal is faster than jal, because bal uses branch
     prediction hardware.  If we are linking for the RM9000, and we
     see jal, and bal fits, use it instead.  Note that this
     transformation should be safe for all architectures.  */
  if (!info->relocatable
      && !require_jalx
      && ((JAL_TO_BAL_P
           && r_type == R_MIPS_26
           && (x >> 26) == 0x3)         /* jal addr */
          || (JALR_TO_BAL_P
              && r_type == R_MIPS_JALR
              && x == 0x0320f809)))     /* jalr t9 */
    {
      bfd_vma addr;
      bfd_vma dest;
      bfd_signed_vma off;

      addr = (input_section->output_section->vma
              + input_section->output_offset
              + relocation->r_offset
              + 4);
      if (r_type == R_MIPS_26)
        dest = (value << 2) | ((addr >> 28) << 28);
      else
        dest = value;
      off = dest - addr;
      if (off <= 0x1ffff && off >= -0x20000)
        x = 0x04110000 | (((bfd_vma) off >> 2) & 0xffff);   /* bal addr */
    }

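
To make the range check and the encoding easier to try outside the linker,
here is a standalone sketch of the same arithmetic. The function name
try_bal and the INSN_* macros are hypothetical, the caller is assumed to
pass an already-resolved destination (the R_MIPS_26 reconstruction of dest
is left out), and both addresses are assumed to be word aligned.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_JALR_T9 0x0320f809u  /* jalr $t9 */
#define INSN_BAL     0x04110000u  /* bal with a zero offset field */

/* Hypothetical helper: return the bal encoding if INSN is jal or
   jalr $t9 and DEST is within reach, otherwise return INSN unchanged.
   ADDR is the address of the delay slot (instruction address + 4).  */
static uint32_t
try_bal (uint32_t insn, uint32_t addr, uint32_t dest)
{
  int is_jal = (insn >> 26) == 0x3;
  int is_jalr_t9 = insn == INSN_JALR_T9;
  int64_t off;

  /* Assumption of this sketch: both addresses are 4-byte aligned.  */
  assert ((addr & 3) == 0 && (dest & 3) == 0);

  if (!is_jal && !is_jalr_t9)
    return insn;

  /* bal has a signed 16-bit word offset, i.e. an 18-bit byte range.  */
  off = (int64_t) dest - (int64_t) addr;
  if (off < -0x20000 || off > 0x1ffff)
    return insn;

  return INSN_BAL | (((uint32_t) off >> 2) & 0xffff);
}
```

For instance, try_bal (0x0320f809, 0x00400004, 0x00400104) gives 0x04110040,
while a destination 0x20000 bytes ahead of the delay slot is out of reach
and leaves the jalr untouched.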
Regards,
Chao-ying