This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH/ARM] PR 16722, support VLDR s/d0, =Imm

From: Richard Earnshaw <rearnsha at arm dot com>
To: Jiong Wang <jiong dot wang at arm dot com>
Cc: Binutils <binutils at sourceware dot org>
Date: Fri, 25 Apr 2014 16:50:03 +0100
Subject: Re: [PATCH/ARM] PR 16722, support VLDR s/d0, =Imm
Authentication-results: sourceware.org; auth=none
References: <5356648F dot 9080205 at arm dot com>

On 22/04/14 13:46, Jiong Wang wrote:
> LDR support load constants from literal pool or optimize to vmov, while no such support on VLDR. This patch fixes this issue.
> 
> the logic of this patch is
> 
> for VLDR s/d, =IMM, (currently no support for FPIMM)
> 
> if (s reg)
>    always spill to literal pool.
> 
> if (d reg)
>    if (IMM acceptable by vmov)
>      {
>        if (thumb_mode)
>          rewrite vldr d, #IMM to vmov.XX d, #IMM under thumb2 encoding.
>        else
>          rewrite under arm encoding.
>      }
>    else
>      spill to literal pool.
> 
> OK for trunk?
> 
> thanks.
> 
> PR target/16722
> gas/
>    * config/tc-arm.c (add_to_lit_pool): Add one parameter 'nbytes'.
>    Support literal pool entry be 8 byte.
>    (s_ltorg): Likewise.
>    (enum lit_type): New enum type.
>    (encode_arm_cp_address): Generate literal pool entry if possible.
>    (move_or_literal_pool): Generate entry for vldr case.
>    (do_ldst): Use new enum type.
>    (do_ldstv4): Likewise.
>    (do_t_ldst): Likewise.
>    (neon_write_immbits): Support Thumb-2 mode.
>   
> gas/testsuite/
>    * gas/arm/ldconst.s: Add new cases for vldr.
>    * gas/arm/ldconst.d: Likewise.
>    * gas/arm/thumb2_pool.s: Likewise.
>    * gas/arm/thumb2_pool.d: Likewise.
> 

I see no evidence that the 8-byte entries have been placed on 8-byte
aligned locations; I think that should be done since misaligned loads
can be slower than aligned loads.  If the pool contains any 8-byte
entries, then the whole pool should be started on an 8-byte boundary;
you'll then have to insert padding before an 8-byte entry if that is
necessary to get back to alignment.  Bonus points for keeping track of
such holes and making use of them for later literal values that are
small enough to fit.  For example, the sequence:

	vldr	s0, =33
	vldr	d1, =55
	vldr	s6, =99

should generate a literal pool that contains

	.p2align 3
<pool_symbol>:
	.word	33
	.word	99	// Back-filled
	.dword	55

Also, please can you put the new tests in separate files; they are big
enough, and distinct enough to justify doing this.

Some additional comments inline.

> 
> vldr-literal-pool-v3.patch
> 
> 
> diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
> index 69299c7..1471fce 100644
> --- a/gas/config/tc-arm.c
> +++ b/gas/config/tc-arm.c
> @@ -3180,20 +3180,24 @@ find_or_make_literal_pool (void)
>     structure to the relevant literal pool.  */
>  
>  static int
> -add_to_lit_pool (void)
> +add_to_lit_pool (int nbytes)
>  {
>    literal_pool * pool;
>    unsigned int entry;
> +  unsigned int entry_for_relocate;
>  
>    pool = find_or_make_literal_pool ();
>  
>    /* Check if this literal value is already in the pool.  */
> -  for (entry = 0; entry < pool->next_free_entry; entry ++)
> +  for (entry = 0, entry_for_relocate = 0;
> +       entry < pool->next_free_entry;
> +       entry ++, entry_for_relocate++)

No space before the ++.  Rather than tracking entry_for_relocate like
this, it might be better to just sum the X_md values as you go along.

> @@ -7386,6 +7403,13 @@ encode_arm_addr_mode_3 (int i, bfd_boolean is_t)
>  static int
>  encode_arm_cp_address (int i, int wb_ok, int unind_ok, int reloc_override)
>  {
> +  if (!inst.operands[i].isreg)
> +    {
> +      gas_assert (inst.operands[0].isvec);
> +      if (move_or_literal_pool (0, /*lit_type=*/CONST_VEC, /*mode_3=*/FALSE))

Now you're using an enum, there's no need to retain the /*lit_type*/
comment.  That was there to clarify the use of true and false with the
old API.  This applies to the other calls to this function as well.


> @@ -7450,9 +7479,12 @@ encode_arm_cp_address (int i, int wb_ok, int unind_ok, int reloc_override)
>     inst.operands[i] describes the destination register.	 */
>  
>  static bfd_boolean
> -move_or_literal_pool (int i, bfd_boolean thumb_p, bfd_boolean mode_3)
> +move_or_literal_pool (int i, enum lit_type t, bfd_boolean mode_3)
>  {
>    unsigned long tbit;
> +  bfd_boolean thumb_p = t == CONST_THUMB;

The code reads easier if written as:

  bfd_boolean thumb_p = (t == CONST_THUMB);


> +  bfd_boolean arm_p = t == CONST_ARM;

Likewise.

> @@ -7503,9 +7535,40 @@ move_or_literal_pool (int i, bfd_boolean thumb_p, bfd_boolean mode_3)
>  	      return TRUE;
>  	    }
>  	}
> +      else if (vec64_p)
> +	{
> +	  int op = 0;
> +	  unsigned immbits = 0;
> +	  unsigned immlo = inst.reloc.exp.X_add_number & 0xFFFFFFFF;
> +	  unsigned immhi = (inst.reloc.exp.X_add_number >> 32) & 0xFFFFFFFF;

Hmm, I suspect X_add_number is just a 32-bit value on 32-bit hosts.
Have you tested this natively?

> +	  int cmode = neon_cmode_for_move_imm (immlo, immhi, FALSE, &immbits,
> +					       &op, 64, NT_invtype);
> +	  if (cmode == FAIL)
> +	    {
> +	      neon_invert_size (&immlo, &immhi, 64);
> +	      op = !op;
> +	      cmode = neon_cmode_for_move_imm (immlo, immhi, FALSE, &immbits,
> +					       &op, 64, NT_invtype);
> +	    }
> +	  if (cmode != FAIL)
> +	    {
> +	      inst.instruction = (inst.instruction & 0x40F000)

Please don't truncate the leading zeros on instruction bit-mask
operations.  Better yet, define a macro that makes the intent here clearer.

R.

Follow-Ups:
- Re: [PATCH/ARM] PR 16722, support VLDR s/d0, =Imm
  - From: Jiong Wang

References:
- [PATCH/ARM] PR 16722, support VLDR s/d0, =Imm
  - From: Jiong Wang

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]