This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] MIPS: microMIPS ASE support


This is a review of everything up to the end of elfxx-mips.c.  Not sure
when I'll be able to do the rest.

I think we're getting close, so could you post any updates as patches
relative to this one, rather than reposting the whole thing?

"Maciej W. Rozycki" <macro@codesourcery.com> writes:
> @@ -2372,7 +3002,14 @@ bfd_elf64_bfd_reloc_name_lookup (bfd *ab
>         i++)
>      if (mips16_elf64_howto_table_rela[i].name != NULL
>  	&& strcasecmp (mips16_elf64_howto_table_rela[i].name, r_name) == 0)
> -      return &mips16_elf64_howto_table_rela[i];
> +
> +  for (i = 0;
> +       i < (sizeof (micromips_elf64_howto_table_rela)
> +	    / sizeof (micromips_elf64_howto_table_rela[0]));
> +       i++)
> +    if (micromips_elf64_howto_table_rela[i].name != NULL
> +	&& strcasecmp (micromips_elf64_howto_table_rela[i].name, r_name) == 0)
> +      return &micromips_elf64_howto_table_rela[i];
>  
>    if (strcasecmp (elf_mips_gnu_vtinherit_howto.name, r_name) == 0)
>      return &elf_mips_gnu_vtinherit_howto;

This hunk looks wrong.  I doubt you meant to remove the mips16 line.
Same for elfn32-mips.c.

> @@ -5040,6 +5159,10 @@ mips_elf_calculate_relocation (bfd *abfd
>  	}
>  
>        target_is_16_bit_code_p = ELF_ST_IS_MIPS16 (h->root.other);
> +      /* If the output section is the PLT section,
> +         then the target is not microMIPS.  */
> +      target_is_micromips_code_p = (htab->splt != sec)
> +				    && ELF_ST_IS_MICROMIPS (h->root.other);
>      }
>  
>    /* If this is a reference to a 16-bit function with a stub, we need

Formatting nit:

      /* If the output section is the PLT section,
         then the target is not microMIPS.  */
      target_is_micromips_code_p = (htab->splt != sec
				    && ELF_ST_IS_MICROMIPS (h->root.other));

More importantly, the comment isn't any help.  When do we create
statically-resolved relocations against .plt?

>    /* Calls from 16-bit code to 32-bit code and vice versa require the
> -     mode change.  */
> -  *cross_mode_jump_p = !info->relocatable
> -		       && ((r_type == R_MIPS16_26 && !target_is_16_bit_code_p)
> -			   || ((r_type == R_MIPS_26 || r_type == R_MIPS_JALR)
> -			       && target_is_16_bit_code_p));
> +     mode change.  This is not required for calls to undefined weak
> +     symbols, which should never be executed at runtime.  */

But why do we need to go out of our way to check for them?  I'm sure
there's a good reason, but the comment doesn't give much clue what it is.

> @@ -9223,6 +9499,16 @@ _bfd_mips_elf_relocate_section (bfd *out
>  	case bfd_reloc_ok:
>  	  break;
>  
> +	case bfd_reloc_outofrange:
> +	  if (jal_reloc_p (howto->type))
> +	    {
> +	      msg = _("JALX to not a word-aligned address");
> +	      info->callbacks->warning
> +		(info, msg, name, input_bfd, input_section, rel->r_offset);
> +	      return FALSE;
> +	    }
> +	  /* Fall through.  */
> +
>  	default:
>  	  abort ();
>  	  break;

I suppose that should be something like "JALX to a non-word-aligned address".

> +  /* microMIPS local and global symbols have the least significant bit
> +     set.  Because all values are either multiple of 2 or multiple of
> +     4, compensating for this bit is as simple as setting it in
> +     comparisons. Just to be sure, check anyway before proceeding. */
> +  BFD_ASSERT (addr % 2 == 0);
> +  BFD_ASSERT (toaddr % 2 == 0);
> +  BFD_ASSERT (count % 2 == 0);

I'm suspicious about the toaddr assert.  You can create text sections
with odd-valued sizes:

	.section	.text.foo,"ax",@progbits
	.byte	1

and as discussed elsewhere, asserts are really to double-check
conditions that we think must already hold.

> +  addr |= 1;
> +  toaddr |= 1;
> +
> +  /* Adjust the local symbols defined in this section.  */
> +  symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
> +  isym = (Elf_Internal_Sym *) symtab_hdr->contents;
> +  for (isymend = isym + symtab_hdr->sh_info; isym < isymend; isym++)
> +    {
> +      if (isym->st_shndx == sec_shndx
> +	  && isym->st_value > addr
> +	  && isym->st_value < toaddr)
> +	isym->st_value -= count;
> +    }

Hmm, microMIPS symbols created in the proper, blessed way (by proper,
blessed ways of defining MIPS16 labels in assembly) should be even in
the input object, and IIRC, the linker keeps local symbols that way
internally.  Only symbols entered into the hash table are odd.
(c.f. mips_elf_calculate_relocation, where we have to add 1 to
local symbols.)  So I would have expected the "|= 1"s to be applied
after this block rather than before it.  Even then:

> +  /* Now adjust the global symbols defined in this section.  */
> +  symcount = (symtab_hdr->sh_size / sizeof (Elf32_External_Sym)
> +	      - symtab_hdr->sh_info);
> +  sym_hashes = start_hashes = elf_sym_hashes (abfd);
> +  end_hashes = sym_hashes + symcount;
> +
> +  for (; sym_hashes < end_hashes; sym_hashes++)
> +    {
> +      struct elf_link_hash_entry *sym_hash = *sym_hashes;
> +
> +      if ((sym_hash->root.type == bfd_link_hash_defined
> +	   || sym_hash->root.type == bfd_link_hash_defweak)
> +	  && sym_hash->root.u.def.section == sec
> +	  && sym_hash->root.u.def.value > addr
> +	  && sym_hash->root.u.def.value < toaddr)
> +	sym_hash->root.u.def.value -= count;
> +    }

...if we're willing to extend the upper bound in this way, I wonder
whether there's really any point having an upper bound at all.

Then again, why doesn't the standard range (used by most targets)
include toaddr?  If you define an end marker:

end_of_section:
	# nothing after this

then wouldn't end_of_section == toaddr, and shouldn't it be included?

I see some targets rightly adjust st_size too.  We should do
the same here.

> +static unsigned long
> +find_match (unsigned long opcode, const struct opcode_descriptor insn[])
> +{
> +    unsigned long indx;
> +
> +    /* First opcode_table[] entry is ignored.  */
> +    for (indx = 1; insn[indx].mask != 0; indx++)
> +      if (MATCH (opcode, insn[indx]))
> +	return indx;
> +
> +    return 0;
> +}

But _why_ is the first entry ignored?

> +/* Check if there *might* be a 16-bit branch or jump at OFFSET
> +   with a fixed or variable length delay slot.  */
> +
> +static int
> +relax_dslot_norel16 (bfd *abfd, bfd_byte *ptr)

There's no parameter called "offset".  How about:

/* If PTR is known to point to a 16-bit branch or jump, return the
   minimum length of its delay slot, otherwise return 0.  */

At least, that's what the code seems to do, since it returns 2 for
things like:

> +  { /* "b(g|l)(e|t)z", "s,p",	*/ 0x40000000, 0xff200000 },

which as far as I can tell from the opcodes patch allows both 16-bit
and 32-bit delay slots.  (IV-g doesn't seem to be public yet, so I can't
check the spec.)  But from the way the function is used, it looks
like 0 really means "no size constraints", so why doesn't the function
return 0 for instructions like these?

Why are these routines checking for branches anyway?  We don't support
branches without relocations when doing this kind of relaxation.
I'd have expected only indirect jumps to be handled here.

Likewise relax_dslot_norel32.

> +  /* We don't have to do anything for a relocatable link, if
> +     this section does not have relocs, or if this is not a
> +     code section.  */
> +
> +  if (link_info->relocatable
> +      || (sec->flags & SEC_RELOC) == 0
> +      || sec->reloc_count == 0
> +      || (sec->flags & SEC_CODE) == 0)
> +    return TRUE;

Should we also check whether the micromips bit is set in the ELF
header flags?

> +	      nextopc = bfd_get_16 (abfd, contents + irel->r_offset + 4);
> +	      /* B16  */
> +	      if (MATCH (nextopc, b_insn_16))
> +		break;
> +	      /* JR16  */
> +	      if (MATCH (nextopc, jr_insn_16) && reg != JR16_REG (nextopc))
> +		break;
> +	      /* BEQZ16, BNEZ16  */
> +	      if (MATCH (nextopc, bz_insn_16) && reg != BZ16_REG (nextopc))
> +		break;
> +	      /* JALR16  */
> +	      if (MATCH (nextopc, jalr_insn_16_bd32)
> +		  && reg != JR16_REG (nextopc) && reg != RA)
> +		break;
> +	      continue;

I think this should be split out into a subfunction, like the relax_* ones.

/* PTR points to a 2-byte instruction.  Return true if it is a 16-bit
   branch that does not use register REG.  */

static bfd_boolean
is_16bit_branch_p (bfd *abfd, bfd_byte *ptr, unsigned int reg)

Likewise for the 4-byte case.

> +	  /* Give up if not the same register used with both relocations.  */

"Give up unless the same register is used with both relocations."

> +	  /* Now adjust pcrval, subtracting the offset to the LO16 reloc,
> +	     adding the size of the LUI instruction deleted and rounding
> +	     up to take masking of the two LSBs into account.  */
> +	  pcrval = ((pcrval - offset + 4 + 3) | 3) ^ 3;

Maybe its just me, but since pcrval was calculated much earlier,
and hasn't been used since being set, I'd find it easier to understand
if we simply calculate the pcrval for irel[1] directly:

      /* Calculate the pc-relative distance of the target from the LO16,
	 assuming that we can delete the 4-byte LUI.  */
      pcrval = (symval
		- (sec->output_section->vma + sec->output_offset)
		- (irel[1].r_offset - 4));

      /* Round up to take the masking of the two LSBs into account.  */
      pcrval = (prcval + 3) & -4;

What happens when we can't delete the LUI, because of its being in
a branch delay slot?

> +	  int bdsize = -1;
[...]
> +	  /* See if there is a jump or a branch reloc preceding the
> +	     LUI instruction immediately.  If so, then perform jump
> +	     or branch relaxation.  */
> +	  for (ibrrel = internal_relocs; ibrrel < irelend; ibrrel++)
> +	    {
> +	      offset = irel->r_offset - ibrrel->r_offset;
> +	      if (offset != 2 && offset != 4)
> +		continue;
> +
> +	      br_r_type = ELF32_R_TYPE (ibrrel->r_info);
> +	      if (offset == 2
> +		  && br_r_type != R_MICROMIPS_PC7_S1
> +		  && br_r_type != R_MICROMIPS_PC10_S1
> +		  && br_r_type != R_MICROMIPS_JALR)
> +		continue;
> +	      if (offset == 4
> +		  && br_r_type != R_MICROMIPS_26_S1
> +		  && br_r_type != R_MICROMIPS_PC16_S1
> +		  && br_r_type != R_MICROMIPS_JALR)
> +		continue;
> +
> +	      bdsize = relax_dslot_rel (abfd,
> +					contents + ibrrel->r_offset, offset);
> +	      break;
> +	    }
> +
> +	  /* Otherwise see if the LUI instruction *might* be in a
> +	     branch delay slot.  */
> +	  if (bdsize == -1)
> +	    {
> +	      bfd_byte *ptr = contents + ibrrel->r_offset;
> +
> +	      /* Assume the 32-bit LUI instruction is in a fixed delay slot
> +	         and cannot be optimised away.  */
> +	      bdsize = 4;
> +
> +	      if (ibrrel->r_offset >= 2)
> +		bdsize16 = relax_dslot_norel16 (abfd, ptr - 2);
> +	      if (ibrrel->r_offset >= 4)
> +		bdsize32 = relax_dslot_norel32 (abfd, ptr - 4);

I think the "ibrrel"s in the last block should be "irel"s instead.
As best I can tell, "ibrrel" == "irelend" here.  Is this case
covered by the testsuite?

I'm not sure whether I like the idea of making this function quadratic
simply to decide whether the preceding instruction is a branch,
but there's probably not much we can do about that.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]