This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



RFC: MIPS: Workaround the "daddi" and "daddiu" errata


Hello,

 I have been working on a workaround for the "daddi" and "daddiu" errata
and have it ready against the gcc sources as they stood before Richard
applied the recent explicit relocation support for non-PIC code.  Updating
my code to take that change into account is mostly mechanical work at this
stage, so I've decided to submit the changes as they are for discussion
and to add the missing bits in the next revision.  The same applies to a
set of testcases for gas.

 As you may know, there are errata in the R4000 and the initial revision
of the R4400 processor that lead to incorrect execution of the "daddi"
and "daddiu" instructions when a two's-complement overflow happens.  I
believe the only way to work around this problem is to assume the affected
processors don't support these instructions at all.  Fortunately, this can
be done quite easily by pretending these instructions are actually macros
and expanding them to a sequence of a "li" (an "addiu" with $0 as the
source) and a "dadd" or a "daddu" instruction with the aid of a temporary
register, like this:

daddiu $2,$3,0x1234 -> addiu $1,$0,0x1234; daddu $2,$3,$1
daddi  $2,$3,0x1234 -> addiu $1,$0,0x1234; dadd  $2,$3,$1
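
 To make the equivalence concrete, here is a small C model of the
"daddiu" case.  It is only a sketch: the function names are made up for
illustration, and the faulty behaviour is modelled purely from the text of
erratum #41 as quoted in the mips.md comment in the gcc patch below, not
measured on hardware.  It shows that the two-instruction expansion gives
the architectural result even in the overflow case where the erratum
applies.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural "daddiu rt,rs,imm": the 16-bit immediate is sign-extended
   and added to the 64-bit register, wrapping silently.  */
uint64_t daddiu_ref(uint64_t rs, int16_t imm)
{
    return rs + (uint64_t)(int64_t)imm;
}

/* The expansion: "addiu $1,$0,imm" followed by "daddu rt,rs,$1".  */
uint64_t daddiu_expanded(uint64_t rs, int16_t imm)
{
    /* addiu from $0 leaves the sign-extended immediate in $1.  */
    uint64_t at = (uint64_t)(int64_t)(int32_t)imm;
    /* daddu is a plain 64-bit register addition.  */
    return rs + at;
}

/* Faulty result as described by erratum #41: on two's-complement
   overflow bits 0-31 are correct and bit 31 is copied into bits 32-63.  */
uint64_t daddiu_erratum41(uint64_t rs, int16_t imm)
{
    uint64_t addend = (uint64_t)(int64_t)imm;
    uint64_t sum = rs + addend;
    /* Overflow: both operands share a sign that the sum does not.  */
    int overflow = (int)((~(rs ^ addend) & (rs ^ sum)) >> 63);
    uint64_t low = sum & 0xffffffffULL;

    if (!overflow)
        return sum;
    return (low & 0x80000000ULL) ? (low | 0xffffffff00000000ULL) : low;
}

/* Hypothetical self-check; call it from a test program.  */
void check_daddiu_model(void)
{
    assert(daddiu_expanded(5, 0x1234) == daddiu_ref(5, 0x1234));
    assert(daddiu_expanded(0x100000000ULL, -1) == 0xffffffffULL);
    /* Overflow case: the expansion stays correct, the model does not.  */
    assert(daddiu_expanded(0x7fffffffffffffffULL, 1) == 0x8000000000000000ULL);
    assert(daddiu_erratum41(0x7fffffffffffffffULL, 1) == 0);
}
```

 The "daddi" case differs only in that "dadd" is used for the register
addition, so the overflow exception is still raised by the expansion.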

 This works quite well when $1 is available, and for that case it could be
implemented in gas alone.  Unfortunately, "daddiu" is quite commonly used
in macro expansions of address references, where $1 is typically already
in use holding the address being calculated.  For code emitted by gcc this
can be dealt with by not emitting compound addresses (i.e. anything other
than "imm16(reg)") on memory transfer instructions and loading the address
into a scratch register in advance.

 Here are two patches, one for gcc and one for binutils, that implement
this approach.  The gcc part implements additional reloads that let
compound addresses be removed.  Where possible $1 is used as the scratch
register; it's a fixed register, so it can be used freely.  There's one
point to note.  Branch relaxation needs two scratch registers when a
branch is actually converted, but the gcc core assumes branches can be
done without clobbering any registers.  The current code deals with that
by cheating and quietly using $1.  There's only one such register
available, though, so I had to make another one fixed -- I've chosen $24.
If there's a better way to deal with that, then I'd be pleased to hear
about it.
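
 One detail of the gcc patch that may not be obvious: in the
"ffsdi2_internal_2" pattern the increment by one is done with a "not"
followed by a "dnegu", so neither an immediate add nor a scratch register
is needed.  The C sketch below (the function names are made up for
illustration) checks the underlying two's-complement identity,
-(~x) == x + 1.

```c
#include <assert.h>
#include <stdint.h>

/* Add one to a 64-bit value without an immediate operand.  */
uint64_t increment_without_immediate(uint64_t x)
{
    uint64_t inverted = ~x;         /* not   %0,%0 */
    return (uint64_t)0 - inverted;  /* dnegu %0,%0 */
}

/* Hypothetical self-check; call it from a test program.  */
void check_increment(void)
{
    assert(increment_without_immediate(0) == 1);
    assert(increment_without_immediate(41) == 42);
    /* Wraps around like a 64-bit unsigned addition would.  */
    assert(increment_without_immediate(UINT64_MAX) == 0);
}
```

 The price is one extra instruction per loop iteration, which is why that
pattern's "length" attribute is 32 rather than 28.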

 The gas part is straightforward.  If a macro would require a second
scratch register, then gas simply bails out.  The only exception is
regular general register loads, which may use the target register as a
scratch pad.  In addition to command line options, gas also implements a
new ".set" directive to control the state of the workaround.  This is an
aid for handcoded assembly that wants access to "daddi" and/or "daddiu" as
both hardware instructions and macros, for example to support run-time
selection of code variants in performance-critical areas (like a TLB
exception handler).

 I'm looking forward to comments, including but not limited to
constructive critique.  For example, there are currently patterns that
differ only in the constraints used; everything else, including the body,
is the same.  I don't like this redundancy, but I haven't found a way to
make just a selected constraint conditional.  Is there a better way to
deal with that?

 I test these patches essentially continuously by running 64-bit Linux
kernels built with the workarounds activated.  Using `grep' on a
disassembly reveals no "daddiu" instructions (except a single one used by
Linux to determine if it is running on an affected processor; gcc doesn't
use "daddi" itself) and the kernel appears stable.  Certain changes were
of course needed in Linux for handcoded assembly here and there, but plain
C code needs no change.  There's no ABI incompatibility, either -- code
built with the workaround enabled can be intermixed with standard code.
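
 For handcoded code that wants to select a variant at run time, the
mips.md comment in the gcc patch lists the PRId values of the affected
parts.  A static check against exactly those values could be sketched as
below.  This is for illustration only: the function name is made up, it
is not how Linux detects the problem (Linux executes a "daddiu", as noted
above), and whether some PRId fields should be masked before comparing is
an open assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if PRID is one of the values listed in the mips.md comment:
   0x00004220 and 0x00004300 for the R4000, 0x00004400 for the R4400.
   Exact matching is an assumption of this sketch.  */
bool prid_listed_for_daddi_errata(uint32_t prid)
{
    switch (prid) {
    case 0x00004220:
    case 0x00004300:
    case 0x00004400:
        return true;
    default:
        return false;
    }
}

/* Hypothetical self-check; call it from a test program.  */
void check_prid_list(void)
{
    assert(prid_listed_for_daddi_errata(0x00004220));
    assert(prid_listed_for_daddi_errata(0x00004400));
    assert(!prid_listed_for_daddi_errata(0x00000000));
}
```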

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

gcc-3.4-20031107-mips-nodaddi.patch
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/config/mips/mips.c gcc-3.4-20031107/gcc/config/mips/mips.c
--- gcc-3.4-20031107.macro/gcc/config/mips/mips.c	2004-03-01 11:59:01.000000000 +0000
+++ gcc-3.4-20031107/gcc/config/mips/mips.c	2004-03-01 12:37:13.000000000 +0000
@@ -1054,7 +1054,7 @@ mips_symbol_insns (enum mips_symbol_type
 
 	 The final address is then $at + %lo(symbol).  With 32-bit
 	 symbols we just need a preparatory lui.  */
-      return (ABI_HAS_64BIT_SYMBOLS ? 6 : 2);
+      return (ABI_HAS_64BIT_SYMBOLS ? (!TARGET_NO_DADDI ? 6 : 8) : 2);
 
     case SYMBOL_SMALL_DATA:
       return 1;
@@ -5033,6 +5033,12 @@ override_options (void)
   if ((target_flags_explicit & MASK_FIX_R4400) == 0
       && mips_matching_cpu_name_p (mips_arch_info->name, "r4400"))
     target_flags |= MASK_FIX_R4400;
+
+  /* Default to working around daddi/daddiu errata when either R4000
+     or R4400 errata workarounds are enabled.  */
+  if ((target_flags_explicit & MASK_NO_DADDI) == 0
+      && (TARGET_FIX_R4000 || TARGET_FIX_R4400))
+    target_flags |= MASK_NO_DADDI;
 }
 
 /* Implement CONDITIONAL_REGISTER_USAGE.  */
@@ -5089,6 +5095,9 @@ mips_conditional_register_usage (void)
       for (regno = FP_REG_FIRST + 21; regno <= FP_REG_FIRST + 31; regno+=2)
 	call_really_used_regs[regno] = call_used_regs[regno] = 1;
     }
+  /* t8 is used for unconditional jumps when -mno-daddi is active.  */
+  if (Pmode == DImode && TARGET_NO_DADDI)
+    fixed_regs[24] = 1;
 }
 
 /* Allocate a chunk of memory for per-function machine-dependent data.  */
@@ -7442,6 +7451,12 @@ mips_secondary_reload_class (enum reg_cl
 	}
     }
 
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && ((! in_p && class == GR_REGS) || class == FP_REGS
+	  || class == COP0_REGS || class == COP2_REGS || class == COP3_REGS)
+      && memory_operand (x, mode) && mips_fetch_insns (x) != 1)
+    return gr_regs;
+
   return NO_REGS;
 }
 
@@ -8840,8 +8855,7 @@ mips_adjust_insn_length (rtx insn, int l
 }
 
 
-/* Return an asm sequence to start a noat block and load the address
-   of a label into $1.  */
+/* Return an asm sequence to load the address of a label into %1.  */
 
 const char *
 mips_output_load_label (void)
@@ -8850,22 +8864,26 @@ mips_output_load_label (void)
     switch (mips_abi)
       {
       case ABI_N32:
-	return "%[lw\t%@,%%got_page(%0)(%+)\n\taddiu\t%@,%@,%%got_ofst(%0)";
+	return "lw\t%1,%%got_page(%0)(%+)\n\taddiu\t%1,%1,%%got_ofst(%0)";
 
       case ABI_64:
-	return "%[ld\t%@,%%got_page(%0)(%+)\n\tdaddiu\t%@,%@,%%got_ofst(%0)";
+	if (!TARGET_NO_DADDI)
+	  return "ld\t%1,%%got_page(%0)(%+)\n\tdaddiu\t%1,%1,%%got_ofst(%0)";
+	else
+	  return "ld\t%1,%%got_page(%0)(%+)\n"
+		 "\t%[addiu\t%@,%.,%%got_ofst(%0)\n\tdaddu\t%1,%1,%@%]";
 
       default:
 	if (ISA_HAS_LOAD_DELAY)
-	  return "%[lw\t%@,%%got(%0)(%+)%#\n\taddiu\t%@,%@,%%lo(%0)";
-	return "%[lw\t%@,%%got(%0)(%+)\n\taddiu\t%@,%@,%%lo(%0)";
+	  return "lw\t%1,%%got(%0)(%+)%#\n\taddiu\t%1,%1,%%lo(%0)";
+	return "lw\t%1,%%got(%0)(%+)\n\taddiu\t%1,%1,%%lo(%0)";
       }
   else
     {
       if (Pmode == DImode)
-	return "%[dla\t%@,%0";
+	return "dla\t%1,%0\n\t";
       else
-	return "%[la\t%@,%0";
+	return "la\t%1,%0";
     }
 }
 
@@ -9053,8 +9071,21 @@ mips_output_conditional_branch (rtx insn
 	  output_asm_insn ("j\t%0", &orig_target);
 	else
 	  {
-	    output_asm_insn (mips_output_load_label (), &orig_target);
-	    output_asm_insn ("jr\t%@%]", 0);
+	    rtx la_operands[2];
+	    la_operands[0] = orig_target;
+	    if (Pmode != DImode || !TARGET_NO_DADDI)
+	      {
+		la_operands[1] = gen_rtx_REG (SImode, 1);
+		sprintf (buffer, "%s%s", "%[", mips_output_load_label ());
+		output_asm_insn (buffer, la_operands);
+		output_asm_insn ("jr\t$1%]", la_operands);
+	      }
+	    else
+	      {
+		la_operands[1] = gen_rtx_REG (SImode, 24);
+		output_asm_insn (mips_output_load_label (), la_operands);
+		output_asm_insn ("jr\t$24", la_operands);
+	      }
 	  }
 
         if (length != 16 && length != 28 && mips_branch_likely)
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/config/mips/mips.h gcc-3.4-20031107/gcc/config/mips/mips.h
--- gcc-3.4-20031107.macro/gcc/config/mips/mips.h	2004-03-01 12:00:08.000000000 +0000
+++ gcc-3.4-20031107/gcc/config/mips/mips.h	2004-03-01 12:19:48.000000000 +0000
@@ -172,6 +172,7 @@ extern const struct mips_cpu_info *mips_
 #define MASK_FIX_R4000	   0x01000000	/* Work around R4000 errata.  */
 #define MASK_FIX_R4400	   0x02000000	/* Work around R4400 errata.  */
 #define MASK_FIX_SB1	   0x04000000	/* Work around SB-1 errata.  */
+#define MASK_NO_DADDI	   0x08000000	/* Don't use "daddi" and "daddiu".  */
 
 					/* Debug switches, not documented */
 #define MASK_DEBUG	0		/* unused */
@@ -266,6 +267,9 @@ extern const struct mips_cpu_info *mips_
 					/* Work around R4400 errata.  */
 #define TARGET_FIX_R4400		(target_flags & MASK_FIX_R4400)
 
+					/* Don't use "daddi" and "daddiu".  */
+#define TARGET_NO_DADDI		(target_flags & MASK_NO_DADDI)
+
 /* True if we should use NewABI-style relocation operators for
    symbolic addresses.  This is never true for mips16 code,
    which has its own conventions.  */
@@ -607,6 +611,10 @@ extern const struct mips_cpu_info *mips_
      N_("Work around R4400 errata")},					\
   {"no-fix-r4400",	 -MASK_FIX_R4400,				\
      N_("Don't work around R4400 errata")},				\
+  { "daddi",		 -MASK_NO_DADDI,				\
+     N_("Use ""daddi"" and ""daddiu""")},				\
+  { "no-daddi",		  MASK_NO_DADDI,				\
+     N_("Don't use ""daddi"" and ""daddiu"" (for 4000 and early 4400 errata)")}, \
   {"check-zero-division",-MASK_NO_CHECK_ZERO_DIV,			\
      N_("Trap on integer divide by zero")},				\
   {"no-check-zero-division", MASK_NO_CHECK_ZERO_DIV,			\
@@ -3409,13 +3417,33 @@ do {									\
 #define ASM_OUTPUT_REG_PUSH(STREAM,REGNO)				\
 do									\
   {									\
-    fprintf (STREAM, "\t%s\t%s,%s,8\n\t%s\t%s,0(%s)\n",			\
-	     TARGET_64BIT ? "dsubu" : "subu",				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     TARGET_64BIT ? "sd" : "sw",				\
-	     reg_names[REGNO],						\
-	     reg_names[STACK_POINTER_REGNUM]);				\
+    if (!TARGET_64BIT || !TARGET_NO_DADDI || TARGET_MIPS16)		\
+      {									\
+	fprintf (STREAM, "\t%s\t%s,%s,-8\n\t%s\t%s,0(%s)\n",		\
+		 TARGET_64BIT ? "daddiu" : "addiu",			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 TARGET_64BIT ? "sd" : "sw",				\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+      }									\
+    else								\
+      {									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tnoat\n");				\
+									\
+	fprintf (STREAM, "\taddiu\t%s,%s,-8\n\tdaddu\t%s,%s\n"		\
+			 "\tsd\t%s,0(%s)\n",				\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[GP_REG_FIRST],				\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tat\n");				\
+      }									\
   }									\
 while (0)
 
@@ -3425,13 +3453,33 @@ do									\
     if (! set_noreorder)						\
       fprintf (STREAM, "\t.set\tnoreorder\n");				\
 									\
-    fprintf (STREAM, "\t%s\t%s,0(%s)\n\t%s\t%s,%s,8\n",			\
-	     TARGET_64BIT ? "ld" : "lw",				\
-	     reg_names[REGNO],						\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     TARGET_64BIT ? "daddu" : "addu",				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     reg_names[STACK_POINTER_REGNUM]);				\
+    if (!TARGET_64BIT || !TARGET_NO_DADDI || TARGET_MIPS16)		\
+      {									\
+	fprintf (STREAM, "\t%s\t%s,0(%s)\n\t%s\t%s,%s,8\n",		\
+		 TARGET_64BIT ? "ld" : "lw",				\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 TARGET_64BIT ? "daddiu" : "addiu",			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+      }									\
+    else								\
+      {									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tnoat\n");				\
+									\
+	fprintf (STREAM, "\tld\t%s,0(%s)\n"				\
+			 "\taddiu\t%s,%s,8\n\tdaddu\t%s,%s\n",		\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[GP_REG_FIRST],				\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM]);					\
+									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tat\n");				\
+      }									\
 									\
     if (! set_noreorder)						\
       fprintf (STREAM, "\t.set\treorder\n");				\
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/config/mips/mips.md gcc-3.4-20031107/gcc/config/mips/mips.md
--- gcc-3.4-20031107.macro/gcc/config/mips/mips.md	2004-03-01 11:44:20.000000000 +0000
+++ gcc-3.4-20031107/gcc/config/mips/mips.md	2004-03-01 12:19:48.000000000 +0000
@@ -854,6 +854,42 @@
     }
 })
 
+;; The original R4000 and the initial revision of the R4400 have a cpu
+;; bug.  Under an overflow condition double-word immediate addition
+;; may give an incorrect result.  We handle the problem by using a
+;; sequence of a single-word immediate addition to load a constant to
+;; a temporary register and then a double-word register addition.  We
+;; also provide an aid for the assembler to deal with this problem by
+;; avoiding macros that require a "daddiu" instruction in their
+;; expansion.
+;;
+;; From "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0"
+;; (also valid for MIPS R4000MC processors):
+;;
+;; "23. R4000PC, R4000SC: The 64-bit instruction, daddi, fails to take
+;;	an overflow exception.
+;;	Workaround: There is no workaround for this problem."
+;;
+;; and:
+;;
+;; "41. R4000PC, R4000SC: Under the following condition, the DADDIU
+;;	instruction can produce an incorrect result.  If this
+;;	instruction generates a result value that would cause an
+;;	overflow condition to occur (even though this instruction does
+;;	not take an overflow exception) then the result value will be
+;;	correct in bits 0-31 but bit 31 will be replicated through
+;;	bits 32-63 (so it looks like a 32bit signextended value).  The
+;;	overflow condition is defined when the carries out of bits 62
+;;	and 63 differ (two's compliment overflow).
+;;	Workaround: There is no workaround for this problem."
+;;
+;; Erratum #41 is also present in "MIPS R4400PC/SC Errata, Processor
+;; Revision 1.0" (also valid for MIPS R4400MC processors) as erratum
+;; #7.
+;;
+;; These processors have PRId values of 0x00004220 and 0x00004300 for
+;; the R4000 and 0x00004400 for the R4400.
+
 (define_expand "adddi3"
   [(parallel [(set (match_operand:DI 0 "register_operand" "")
 		   (plus:DI (match_operand:DI 1 "register_operand" "")
@@ -1030,17 +1066,41 @@
 		 (match_dup 3)))]
   "")
 
-(define_insn "adddi3_internal_3"
+(define_expand "adddi3_internal_3"
   [(set (match_operand:DI 0 "register_operand" "=d,d")
 	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
 		 (match_operand:DI 2 "arith_operand" "d,Q")))]
   "TARGET_64BIT && !TARGET_MIPS16"
-  "@
-    daddu\t%0,%z1,%2
-    daddiu\t%0,%z1,%2"
+  "")
+
+(define_insn "adddi3_internal_3a"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
+		 (match_operand:DI 2 "register_operand" "d")))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "daddu\t%0,%z1,%2"
+  [(set_attr "type"	"darith")
+   (set_attr "mode"	"DI")])
+
+(define_insn "adddi3_internal_3b"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
+		 (match_operand:DI 2 "const_arith_operand" "Q")))]
+  "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
+  "daddiu\t%0,%z1,%2"
   [(set_attr "type"	"darith")
    (set_attr "mode"	"DI")])
 
+(define_insn "adddi3_internal_3c"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
+		 (match_operand:DI 2 "const_arith_operand" "Q")))]
+  "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
+  "%[addiu\\t%@,%.,%2\;daddu\t%0,%z1,%@%]"
+  [(set_attr "type"	"darith")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"8")])
+
 ;; For the mips16, we need to recognize stack pointer additions
 ;; explicitly, since we don't have a constraint for $sp.  These insns
 ;; will be generated by the save_restore_insns functions.
@@ -2808,19 +2868,27 @@ srl\t%3,%3,1\n\
    (set_attr "mode"	"SI")
    (set_attr "length"	"28")])
 
-(define_insn "ffsdi2"
+(define_expand "ffsdi2"
   [(set (match_operand:DI 0 "register_operand" "=&d")
 	(ffs:DI (match_operand:DI 1 "register_operand" "d")))
    (clobber (match_scratch:DI 2 "=&d"))
    (clobber (match_scratch:DI 3 "=&d"))]
   "TARGET_64BIT && !TARGET_MIPS16"
+  "")
+
+(define_insn "ffsdi2_internal_1"
+  [(set (match_operand:DI 0 "register_operand" "=&d")
+	(ffs:DI (match_operand:DI 1 "register_operand" "d")))
+   (clobber (match_scratch:DI 2 "=&d"))
+   (clobber (match_scratch:DI 3 "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
 {
   if (optimize && find_reg_note (insn, REG_DEAD, operands[1]))
     return "%(\
 move\t%0,%.\;\
 beq\t%1,%.,2f\n\
 %~1:\tand\t%2,%1,0x0001\;\
-daddu\t%0,%0,1\;\
+daddiu\t%0,%0,1\;\
 beq\t%2,%.,1b\;\
 dsrl\t%1,%1,1\n\
 %~2:%)";
@@ -2830,7 +2898,7 @@ move\t%0,%.\;\
 move\t%3,%1\;\
 beq\t%3,%.,2f\n\
 %~1:\tand\t%2,%3,0x0001\;\
-daddu\t%0,%0,1\;\
+daddiu\t%0,%0,1\;\
 beq\t%2,%.,1b\;\
 dsrl\t%3,%3,1\n\
 %~2:%)";
@@ -2838,6 +2906,39 @@ dsrl\t%3,%3,1\n\
   [(set_attr "type"	"multi")
    (set_attr "mode"	"DI")
    (set_attr "length"	"28")])
+
+(define_insn "ffsdi2_internal_2"
+  [(set (match_operand:DI 0 "register_operand" "=&d")
+	(ffs:DI (match_operand:DI 1 "register_operand" "d")))
+   (clobber (match_scratch:DI 2 "=&d"))
+   (clobber (match_scratch:DI 3 "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
+{
+  if (optimize && find_reg_note (insn, REG_DEAD, operands[1]))
+    return "%(\
+move\t%0,%.\;\
+beq\t%1,%.,2f\n\
+%~1:\tand\t%2,%1,0x0001\;\
+not\t%0,%0\;\
+dnegu\t%0,%0\;\
+beq\t%2,%.,1b\;\
+dsrl\t%1,%1,1\n\
+%~2:%)";
+
+  return "%(\
+move\t%0,%.\;\
+move\t%3,%1\;\
+beq\t%3,%.,2f\n\
+%~1:\tand\t%2,%3,0x0001\;\
+not\t%0,%0\;\
+dnegu\t%0,%0\;\
+beq\t%2,%.,1b\;\
+dsrl\t%3,%3,1\n\
+%~2:%)";
+}
+  [(set_attr "type"	"multi")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"32")])
 
 ;;
 ;;  ...................
@@ -3227,10 +3328,27 @@ dsrl\t%3,%3,1\n\
 ;;
 ;; Step A needs a real instruction but step B does not.
 
-(define_insn "truncdisi2"
+(define_expand "truncdisi2"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=d,m")
         (truncate:SI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
+  "")
+
+(define_insn "*truncdisi2_internal_1"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,m")
+        (truncate:SI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "@
+    sll\t%0,%1,0
+    sw\t%1,%0"
+  [(set_attr "type" "darith,store")
+   (set_attr "mode" "SI")
+   (set_attr "extended_mips16" "yes,*")])
+
+(define_insn "*truncdisi2_internal_2"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,R")
+        (truncate:SI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && Pmode == DImode && TARGET_NO_DADDI"
   "@
     sll\t%0,%1,0
     sw\t%1,%0"
@@ -3238,10 +3356,16 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode" "SI")
    (set_attr "extended_mips16" "yes,*")])
 
-(define_insn "truncdihi2"
+(define_expand "truncdihi2"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=d,m")
         (truncate:HI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
+  "")
+
+(define_insn "*truncdihi2_internal_1"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,m")
+        (truncate:HI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && (Pmode != DImode || !TARGET_NO_DADDI)"
   "@
     sll\t%0,%1,0
     sh\t%1,%0"
@@ -3249,10 +3373,38 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode" "SI")
    (set_attr "extended_mips16" "yes,*")])
 
-(define_insn "truncdiqi2"
+(define_insn "*truncdihi2_internal_2"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,R")
+        (truncate:HI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && Pmode == DImode && TARGET_NO_DADDI"
+  "@
+    sll\t%0,%1,0
+    sh\t%1,%0"
+  [(set_attr "type" "darith,store")
+   (set_attr "mode" "SI")
+   (set_attr "extended_mips16" "yes,*")])
+
+(define_expand "truncdiqi2"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=d,m")
         (truncate:QI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
+  "")
+
+(define_insn "*truncdiqi2_internal_1"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,m")
+        (truncate:QI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "@
+    sll\t%0,%1,0
+    sb\t%1,%0"
+  [(set_attr "type" "darith,store")
+   (set_attr "mode" "SI")
+   (set_attr "extended_mips16" "yes,*")])
+
+(define_insn "*truncdiqi2_internal_2"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,R")
+        (truncate:QI (match_operand:DI 1 "register_operand" "d,d")))]
+  "TARGET_64BIT && Pmode == DImode && TARGET_NO_DADDI"
   "@
     sll\t%0,%1,0
     sb\t%1,%0"
@@ -4137,90 +4289,243 @@ dsrl\t%3,%3,1\n\
 ;; We therefore use two memory operands to each instruction, one to
 ;; describe the rtl effect and one to use in the assembly output.
 
-(define_insn "mov_lwl"
+(define_expand "mov_lwl"
   [(set (match_operand:SI 0 "register_operand" "=d")
 	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
 		    (match_operand:QI 2 "memory_operand" "m")]
 		   UNSPEC_LWL))]
   "!TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_lwl_internal_1"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
+		    (match_operand:QI 2 "memory_operand" "m")]
+		   UNSPEC_LWL))]
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "lwl\t%0,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "SI")
+   (set_attr "hazard" "none")])
+
+(define_insn "*mov_lwl_internal_2"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "R")
+		    (match_operand:QI 2 "memory_operand" "R")]
+		   UNSPEC_LWL))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "lwl\t%0,%2"
   [(set_attr "type" "load")
    (set_attr "mode" "SI")
    (set_attr "hazard" "none")])
 
-(define_insn "mov_lwr"
+(define_expand "mov_lwr"
   [(set (match_operand:SI 0 "register_operand" "=d")
 	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
 		    (match_operand:QI 2 "memory_operand" "m")
 		    (match_operand:SI 3 "register_operand" "0")]
 		   UNSPEC_LWR))]
   "!TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_lwr_internal_1"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
+		    (match_operand:QI 2 "memory_operand" "m")
+		    (match_operand:SI 3 "register_operand" "0")]
+		   UNSPEC_LWR))]
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "lwr\t%0,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "SI")])
+
+(define_insn "*mov_lwr_internal_2"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "R")
+		    (match_operand:QI 2 "memory_operand" "R")
+		    (match_operand:SI 3 "register_operand" "0")]
+		   UNSPEC_LWR))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "lwr\t%0,%2"
   [(set_attr "type" "load")
    (set_attr "mode" "SI")])
 
 
-(define_insn "mov_swl"
+(define_expand "mov_swl"
   [(set (match_operand:BLK 0 "memory_operand" "=m")
 	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
 		     (match_operand:QI 2 "memory_operand" "m")]
 		    UNSPEC_SWL))]
   "!TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_swl_internal_1"
+  [(set (match_operand:BLK 0 "memory_operand" "=m")
+	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "m")]
+		    UNSPEC_SWL))]
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "swl\t%z1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "SI")])
+
+(define_insn "*mov_swl_internal_2"
+  [(set (match_operand:BLK 0 "memory_operand" "=R")
+	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "R")]
+		    UNSPEC_SWL))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "swl\t%z1,%2"
   [(set_attr "type" "store")
    (set_attr "mode" "SI")])
 
-(define_insn "mov_swr"
+(define_expand "mov_swr"
   [(set (match_operand:BLK 0 "memory_operand" "+m")
 	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
 		     (match_operand:QI 2 "memory_operand" "m")
 		     (match_dup 0)]
 		    UNSPEC_SWR))]
   "!TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_swr_internal_1"
+  [(set (match_operand:BLK 0 "memory_operand" "+m")
+	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "m")
+		     (match_dup 0)]
+		    UNSPEC_SWR))]
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "swr\t%z1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "SI")])
+
+(define_insn "*mov_swr_internal_2"
+  [(set (match_operand:BLK 0 "memory_operand" "+R")
+	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "R")
+		     (match_dup 0)]
+		    UNSPEC_SWR))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "swr\t%z1,%2"
   [(set_attr "type" "store")
    (set_attr "mode" "SI")])
 
 
-(define_insn "mov_ldl"
+(define_expand "mov_ldl"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
 		    (match_operand:QI 2 "memory_operand" "m")]
 		   UNSPEC_LDL))]
   "TARGET_64BIT && !TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_ldl_internal_1"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
+		    (match_operand:QI 2 "memory_operand" "m")]
+		   UNSPEC_LDL))]
+  "TARGET_64BIT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
   "ldl\t%0,%2"
   [(set_attr "type" "load")
    (set_attr "mode" "DI")])
 
-(define_insn "mov_ldr"
+(define_insn "*mov_ldl_internal_2"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "R")
+		    (match_operand:QI 2 "memory_operand" "R")]
+		   UNSPEC_LDL))]
+  "TARGET_64BIT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
+  "ldl\t%0,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_expand "mov_ldr"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
 		    (match_operand:QI 2 "memory_operand" "m")
 		    (match_operand:DI 3 "register_operand" "0")]
 		   UNSPEC_LDR))]
   "TARGET_64BIT && !TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_ldr_internal_1"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
+		    (match_operand:QI 2 "memory_operand" "m")
+		    (match_operand:DI 3 "register_operand" "0")]
+		   UNSPEC_LDR))]
+  "TARGET_64BIT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "ldr\t%0,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_insn "*mov_ldr_internal_2"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "R")
+		    (match_operand:QI 2 "memory_operand" "R")
+		    (match_operand:DI 3 "register_operand" "0")]
+		   UNSPEC_LDR))]
+  "TARGET_64BIT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "ldr\t%0,%2"
   [(set_attr "type" "load")
    (set_attr "mode" "DI")])
 
 
-(define_insn "mov_sdl"
+(define_expand "mov_sdl"
   [(set (match_operand:BLK 0 "memory_operand" "=m")
 	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
 		     (match_operand:QI 2 "memory_operand" "m")]
 		    UNSPEC_SDL))]
   "TARGET_64BIT && !TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_sdl_internal_1"
+  [(set (match_operand:BLK 0 "memory_operand" "=m")
+	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "m")]
+		    UNSPEC_SDL))]
+  "TARGET_64BIT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
   "sdl\t%z1,%2"
   [(set_attr "type" "store")
    (set_attr "mode" "DI")])
 
-(define_insn "mov_sdr"
+(define_insn "*mov_sdl_internal_2"
+  [(set (match_operand:BLK 0 "memory_operand" "=R")
+	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "R")]
+		    UNSPEC_SDL))]
+  "TARGET_64BIT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
+  "sdl\t%z1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "DI")])
+
+(define_expand "mov_sdr"
   [(set (match_operand:BLK 0 "memory_operand" "+m")
 	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
 		     (match_operand:QI 2 "memory_operand" "m")
 		     (match_dup 0)]
 		    UNSPEC_SDR))]
   "TARGET_64BIT && !TARGET_MIPS16"
+  "")
+
+(define_insn "*mov_sdr_internal_1"
+  [(set (match_operand:BLK 0 "memory_operand" "+m")
+	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "m")
+		     (match_dup 0)]
+		    UNSPEC_SDR))]
+  "TARGET_64BIT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)"
+  "sdr\t%z1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "DI")])
+
+(define_insn "*mov_sdr_internal_2"
+  [(set (match_operand:BLK 0 "memory_operand" "+R")
+	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
+		     (match_operand:QI 2 "memory_operand" "R")
+		     (match_dup 0)]
+		    UNSPEC_SDR))]
+  "TARGET_64BIT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI"
   "sdr\t%z1,%2"
   [(set_attr "type" "store")
    (set_attr "mode" "DI")])
@@ -4334,15 +4639,25 @@ dsrl\t%3,%3,1\n\
   [(set_attr "type"	"arith")
    (set_attr "mode"	"SI")])
 
-(define_insn "*lowdi"
+(define_insn "*lowdi_daddi"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(lo_sum:DI (match_operand:DI 1 "register_operand" "d")
 		   (match_operand:DI 2 "immediate_operand" "")))]
-  "!TARGET_MIPS16 && TARGET_64BIT"
+  "!TARGET_MIPS16 && !TARGET_NO_DADDI && TARGET_64BIT"
   "daddiu\t%0,%1,%R2"
   [(set_attr "type"	"arith")
    (set_attr "mode"	"DI")])
 
+(define_insn "*lowdi_no_daddi"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(lo_sum:DI (match_operand:DI 1 "register_operand" "d")
+		   (match_operand:DI 2 "immediate_operand" "")))]
+  "!TARGET_MIPS16 && TARGET_NO_DADDI && TARGET_64BIT"
+  "%[addiu\\t%@,%.,%R2\;daddu\t%0,%1,%@%]"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"8")])
+
 (define_insn "*lowsi_mips16"
   [(set (match_operand:SI 0 "register_operand" "=d")
 	(lo_sum:SI (match_operand:SI 1 "register_operand" "0")
@@ -4431,10 +4746,23 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"DI")
    (set_attr "length"	"8,8,8,8,12,*,*,8")])
 
-(define_insn "movdi_internal2"
+(define_insn "movdi_internal2a"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*x,*d,*x,*B*C*D,*B*C*D,*d,*m")
 	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*J,*x,*d,*d,*m,*B*C*D,*B*C*D"))]
-  "TARGET_64BIT && !TARGET_MIPS16
+  "TARGET_64BIT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], DImode)
+       || register_operand (operands[1], DImode)
+       || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0)
+       || operands[1] == CONST0_RTX (DImode))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,const,const,load,store,move,xfer,load,xfer,store,hilo,hilo,hilo,xfer,load,xfer,store")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,4,4,8,*,8,*")])
+
+(define_insn "movdi_internal2"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,R,*f,*f,*f,*d,*R,*x,*d,*x,*B*C*D,*B*C*D,*d,*R")
+	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*R,*f,*f,*J,*x,*d,*d,*R,*B*C*D,*B*C*D"))]
+  "TARGET_64BIT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode)
        || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0)
@@ -4468,6 +4796,26 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_int 4)])])
 
+(define_expand "reload_outdi"
+  [(set (match_operand:DI 0 "memory_operand" "=m")
+	(match_operand:DI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+  else if (hilo_operand (operands[1], GET_MODE (operands[1])))
+    {
+      emit_move_insn (operands[2], operands[1]);
+      operands[1] = operands[2];
+    }
+}")
+
 
 ;; On the mips16, we can split ld $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -4557,10 +4905,22 @@ dsrl\t%3,%3,1\n\
 ;; The difference between these two is whether or not ints are allowed
 ;; in FP registers (off by default, use -mdebugh to enable).
 
-(define_insn "movsi_internal"
+(define_insn "movsi_internal_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*d,*z,*x,*d,*x,*B*C*D,*B*C*D,*d,*m")
 	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*z,*d,J,*x,*d,*d,*m,*B*C*D,*B*C*D"))]
-  "!TARGET_MIPS16
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], SImode)
+       || register_operand (operands[1], SImode)
+       || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,const,const,load,store,move,xfer,load,xfer,store,xfer,xfer,hilo,hilo,hilo,xfer,load,xfer,store")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,4,4,4,4,4,*,4,*")])
+
+(define_insn "movsi_internal_2"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,R,*f,*f,*f,*d,*R,*d,*z,*x,*d,*x,*B*C*D,*B*C*D,*d,*R")
+	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*R,*f,*f,*z,*d,J,*x,*d,*d,*R,*B*C*D,*B*C*D"))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode)
        || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
@@ -4593,6 +4953,28 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_int 4)])])
 
+(define_expand "reload_outsi"
+  [(set (match_operand:SI 0 "memory_operand" "=m")
+	(match_operand:SI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+  else if (hilo_operand (operands[1], GET_MODE (operands[1])))
+    {
+      operands[2] = gen_rtx_REG (GET_MODE (operands[1]), REGNO (operands[2]));
+      emit_move_insn (operands[2], operands[1]);
+      operands[1] = operands[2];
+    }
+}")
+
+
 ;; On the mips16, we can split lw $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
 ;; load are 2 2 byte instructions.
@@ -4839,10 +5221,31 @@ dsrl\t%3,%3,1\n\
     }
 })
 
-(define_insn "movhi_internal"
+(define_insn "movhi_internal_1"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x,*d")
 	(match_operand:HI 1 "general_operand"       "d,IK,m,dJ,*f,*d,*f,*d,*x"))]
-  "!TARGET_MIPS16
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], HImode)
+       || register_operand (operands[1], HImode)
+       || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
+  "@
+    move\t%0,%1
+    li\t%0,%1
+    lhu\t%0,%1
+    sh\t%z1,%0
+    mfc1\t%0,%1
+    mtc1\t%1,%0
+    mov.s\t%0,%1
+    mt%0\t%1
+    mf%1\t%0"
+  [(set_attr "type"	"move,arith,load,store,xfer,xfer,move,hilo,hilo")
+   (set_attr "mode"	"HI")
+   (set_attr "length"	"4,4,*,*,4,4,4,4,4")])
+
+(define_insn "movhi_internal_2"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,R,*d,*f,*f,*x,*d")
+	(match_operand:HI 1 "general_operand"       "d,IK,m,dJ,*f,*d,*f,*d,*x"))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode)
        || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
@@ -4891,6 +5294,21 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_int 4)])])
 
+(define_expand "reload_outhi"
+  [(set (match_operand:HI 0 "memory_operand" "=m")
+	(match_operand:HI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
 
 ;; On the mips16, we can split lh $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -4959,10 +5377,31 @@ dsrl\t%3,%3,1\n\
     }
 })
 
-(define_insn "movqi_internal"
+(define_insn "movqi_internal_1"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x,*d")
 	(match_operand:QI 1 "general_operand"       "d,IK,m,dJ,*f,*d,*f,*d,*x"))]
-  "!TARGET_MIPS16
+  "!TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], QImode)
+       || register_operand (operands[1], QImode)
+       || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
+  "@
+    move\t%0,%1
+    li\t%0,%1
+    lbu\t%0,%1
+    sb\t%z1,%0
+    mfc1\t%0,%1
+    mtc1\t%1,%0
+    mov.s\t%0,%1
+    mt%0\t%1
+    mf%1\t%0"
+  [(set_attr "type"	"move,arith,load,store,xfer,xfer,move,hilo,hilo")
+   (set_attr "mode"	"QI")
+   (set_attr "length"	"4,4,*,*,4,4,4,4,4")])
+
+(define_insn "movqi_internal_2"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,R,*d,*f,*f,*x,*d")
+	(match_operand:QI 1 "general_operand"       "d,IK,m,dJ,*f,*d,*f,*d,*x"))]
+  "!TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], QImode)
        || register_operand (operands[1], QImode)
        || (GET_CODE (operands[1]) == CONST_INT && INTVAL (operands[1]) == 0))"
@@ -4999,6 +5438,22 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"QI")
    (set_attr "length"	"4,4,4,4,8,*,*,4")])
 
+(define_expand "reload_outqi"
+  [(set (match_operand:QI 0 "memory_operand" "=m")
+	(match_operand:QI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
+
 ;; On the mips16, we can split lb $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
 ;; load are 2 2 byte instructions.
@@ -5042,10 +5497,10 @@ dsrl\t%3,%3,1\n\
     operands[1] = force_reg (SFmode, operands[1]);
 })
 
-(define_insn "movsf_internal1"
+(define_insn "movsf_internal1a"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,*f,*d,*d,*d,*m")
 	(match_operand:SF 1 "general_operand" "f,G,m,fG,*d,*f,*G*d,*m,*d"))]
-  "TARGET_HARD_FLOAT
+  "TARGET_HARD_FLOAT && (Pmode != DImode || !TARGET_NO_DADDI)
    && (register_operand (operands[0], SFmode)
        || nonmemory_operand (operands[1], SFmode))"
   { return mips_output_move (operands[0], operands[1]); }
@@ -5053,10 +5508,32 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"SF")
    (set_attr "length"	"4,4,*,*,4,4,4,*,*")])
 
-(define_insn "movsf_internal2"
+(define_insn "movsf_internal1b"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,R,*f,*d,*d,*d,*R")
+	(match_operand:SF 1 "general_operand" "f,G,R,fG,*d,*f,*G*d,*m,*d"))]
+  "TARGET_HARD_FLOAT && Pmode == DImode && TARGET_NO_DADDI
+   && (register_operand (operands[0], SFmode)
+       || nonmemory_operand (operands[1], SFmode))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,xfer,load,store,xfer,xfer,move,load,store")
+   (set_attr "mode"	"SF")
+   (set_attr "length"	"4,4,*,*,4,4,4,*,*")])
+
+(define_insn "movsf_internal2a"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=d,d,m")
 	(match_operand:SF 1 "general_operand" "      Gd,m,d"))]
-  "TARGET_SOFT_FLOAT && !TARGET_MIPS16
+  "TARGET_SOFT_FLOAT && !TARGET_MIPS16 && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], SFmode)
+       || nonmemory_operand (operands[1], SFmode))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,load,store")
+   (set_attr "mode"	"SF")
+   (set_attr "length"	"4,*,*")])
+
+(define_insn "movsf_internal2b"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=d,d,R")
+	(match_operand:SF 1 "general_operand" "      Gd,m,d"))]
+  "TARGET_SOFT_FLOAT && !TARGET_MIPS16 && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], SFmode)
        || nonmemory_operand (operands[1], SFmode))"
   { return mips_output_move (operands[0], operands[1]); }
@@ -5075,6 +5552,36 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"SF")
    (set_attr "length"	"4,4,4,*,*")])
 
+(define_expand "reload_insf"
+  [(set (match_operand:SF 0 "" "=b")
+	(match_operand:SF 1 "memory_operand" "m"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[1]) == MEM && mips_fetch_insns (operands[1]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[1], 0));
+      operands[1] = gen_rtx_MEM (GET_MODE (operands[1]), operands[2]);
+    }
+}")
+
+(define_expand "reload_outsf"
+  [(set (match_operand:SF 0 "memory_operand" "=m")
+	(match_operand:SF 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
 
 ;; 64-bit floating point moves
 
@@ -5089,10 +5596,23 @@ dsrl\t%3,%3,1\n\
     operands[1] = force_reg (DFmode, operands[1]);
 })
 
-(define_insn "movdf_internal1a"
+(define_insn "movdf_internal1a_1"
   [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,*f,*d,*d,*d,*m")
 	(match_operand:DF 1 "general_operand" "f,G,m,fG,*d,*f,*d*G,*m,*d"))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_64BIT
+   && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], DFmode)
+       || nonmemory_operand (operands[1], DFmode))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,xfer,load,store,xfer,xfer,move,load,store")
+   (set_attr "mode"	"DF")
+   (set_attr "length"	"4,4,*,*,4,4,4,*,*")])
+
+(define_insn "movdf_internal1a_2"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,R,*f,*d,*d,*d,*R")
+	(match_operand:DF 1 "general_operand" "f,G,R,fG,*d,*f,*d*G,*m,*d"))]
+  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_64BIT
+   && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], DFmode)
        || nonmemory_operand (operands[1], DFmode))"
   { return mips_output_move (operands[0], operands[1]); }
@@ -5111,10 +5631,23 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"DF")
    (set_attr "length"	"4,8,*,*,8,8,8,*,*")])
 
-(define_insn "movdf_internal2"
+(define_insn "movdf_internal2a"
   [(set (match_operand:DF 0 "nonimmediate_operand" "=d,d,m,d,f,f")
 	(match_operand:DF 1 "general_operand" "dG,m,dG,f,d,f"))]
   "(TARGET_SOFT_FLOAT || TARGET_SINGLE_FLOAT) && !TARGET_MIPS16
+   && (Pmode != DImode || !TARGET_NO_DADDI)
+   && (register_operand (operands[0], DFmode)
+       || nonmemory_operand (operands[1], DFmode))"
+  { return mips_output_move (operands[0], operands[1]); }
+  [(set_attr "type"	"move,load,store,xfer,xfer,move")
+   (set_attr "mode"	"DF")
+   (set_attr "length"	"8,*,*,4,4,4")])
+
+(define_insn "movdf_internal2b"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=d,d,R,d,f,f")
+	(match_operand:DF 1 "general_operand" "dG,R,dG,f,d,f"))]
+  "(TARGET_SOFT_FLOAT || TARGET_SINGLE_FLOAT) && !TARGET_MIPS16
+   && Pmode == DImode && TARGET_NO_DADDI
    && (register_operand (operands[0], DFmode)
        || nonmemory_operand (operands[1], DFmode))"
   { return mips_output_move (operands[0], operands[1]); }
@@ -5133,6 +5666,37 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"DF")
    (set_attr "length"	"8,8,8,*,*")])
 
+(define_expand "reload_indf"
+  [(set (match_operand:DF 0 "" "=b")
+	(match_operand:DF 1 "memory_operand" "m"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[1]) == MEM && mips_fetch_insns (operands[1]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[1], 0));
+      operands[1] = gen_rtx_MEM (GET_MODE (operands[1]), operands[2]);
+    }
+}")
+
+(define_expand "reload_outdf"
+  [(set (match_operand:DF 0 "memory_operand" "=m")
+	(match_operand:DF 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (operands[0]) == MEM && mips_fetch_insns (operands[0]) != 1)
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
+
 (define_split
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
 	(match_operand:DI 1 "general_operand" ""))]
@@ -7987,8 +8551,20 @@ srl\t%M0,%M1,%2\n\
 	return "%*b\t%l0%/";
       else
 	{
-	  output_asm_insn (mips_output_load_label (), operands);
-	  return "%*jr\t%@%/%]";
+	  if (Pmode != DImode || !TARGET_NO_DADDI)
+	    {
+	      static char buffer[200];
+	      operands[1] = gen_rtx_REG (SImode, 1);
+	      sprintf (buffer, "%s%s", "%[", mips_output_load_label ());
+	      output_asm_insn (buffer, operands);
+	      return "%*jr\t$1%/%]";
+	    }
+	  else
+	    {
+	      operands[1] = gen_rtx_REG (SImode, 24);
+	      output_asm_insn (mips_output_load_label (), operands);
+	      return "%*jr\t$24%/";
+	    }
 	}
     }
   else
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/doc/invoke.texi gcc-3.4-20031107/gcc/doc/invoke.texi
--- gcc-3.4-20031107.macro/gcc/doc/invoke.texi	2004-03-01 12:09:15.000000000 +0000
+++ gcc-3.4-20031107/gcc/doc/invoke.texi	2004-03-01 12:20:18.000000000 +0000
@@ -491,7 +491,7 @@ in the following sections.
 -EL  -EB  -G @var{num}  -nocpp @gol
 -mabi=32  -mabi=n32  -mabi=64  -mabi=eabi  -mabi-fake-default @gol
 -mfix7000  -mfix-r4000  -mno-fix-r4000  -mfix-r4400  -mno-fix-r4400 @gol
--mfix-sb1  -mno-fix-sb1 @gol
+-mfix-sb1  -mno-fix-sb1  -mno-daddi  -mdaddi @gol
 -mno-crt0 -mflush-func=@var{func} -mno-flush-func @gol
 -mbranch-likely -mno-branch-likely}
 
@@ -8473,6 +8473,30 @@ Work around certain SB-1 CPU core errata
 (This flag currently works around the SB-1 revision 2
 ``F1'' and ``F2'' floating point errata.)
 
+@item -mno-daddi
+@itemx -mdaddi
+@opindex mno-daddi
+@opindex mdaddi
+Provide support for eliminating the @samp{daddiu} instruction from generated
+code by taking the following precautions:
+@itemize @minus
+@item
+Do not emit @samp{daddiu} instructions.
+@item
+Do not emit macros that expand to @samp{daddiu} instructions.
+@item
+Emit only such address references that can be expanded by the assembler
+without the use of @samp{daddiu} instructions.
+@end itemize
+This option requires appropriate support from the assembler to be
+effective.  Otherwise the generated code will still be correct, but
+@samp{daddiu} instructions may appear.
+
+This is needed for the R4000 processor and the initial revision of the
+R4400 processor as they have errata leading to @samp{daddi} and @samp{daddiu}
+instructions being executed incorrectly.  The @option{-mno-daddi} setting is
+implied by @option{-mfix-r4000} and @option{-mfix-r4400}.
+
 @item -no-crt0
 @opindex no-crt0
 Do not include the default crt0.
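
 For illustration only, and not part of either patch: the C sketch below
models the "daddiu" erratum as the errata text describes it (on a
two's-complement overflow the low 32 bits are right and bit 31 is copied
through bits 32-63), next to the two-instruction replacement
"addiu $1,$0,imm; daddu rd,rs,$1".  The function names are hypothetical
and the faulty model is only an assumption derived from that wording, not
something measured on real silicon.

```c
#include <stdint.h>

/* What a correct "daddiu" computes: rs plus the sign-extended 16-bit
   immediate, wrapping modulo 2^64 with no overflow trap.  */
uint64_t
daddiu_correct (uint64_t rs, int16_t imm)
{
  return rs + (uint64_t) (int64_t) imm;
}

/* Assumed model of the faulty "daddiu": when the 64-bit signed addition
   overflows, only bits 0-31 of the sum are kept and bit 31 is
   replicated through bits 32-63.  */
uint64_t
daddiu_buggy (uint64_t rs, int16_t imm)
{
  uint64_t addend = (uint64_t) (int64_t) imm;
  uint64_t sum = rs + addend;
  /* Signed overflow: both inputs share a sign that the sum lacks.  */
  int overflow = (int) ((~(rs ^ addend) & (rs ^ sum)) >> 63);
  uint64_t low;

  if (!overflow)
    return sum;

  low = sum & 0xffffffffu;
  return (low & 0x80000000u) ? (low | 0xffffffff00000000ull) : low;
}

/* The replacement sequence: "addiu $1,$0,imm" leaves the sign-extended
   immediate in $at, then "daddu" does a plain register-register add,
   which is not affected by the erratum.  */
uint64_t
daddiu_workaround (uint64_t rs, int16_t imm)
{
  uint64_t at = (uint64_t) (int64_t) imm;	/* addiu $1,$0,imm */
  return rs + at;				/* daddu rd,rs,$1 */
}
```

 With rs = 0x7fffffffffffffff and imm = 1 the correct result and the
workaround both give 0x8000000000000000, while the assumed faulty model
gives 0.  For inputs that do not overflow all three functions agree.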


binutils-2.15.90-20040301-mips-nodaddi.patch
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/config/tc-mips.c binutils-2.15.90-20040301/gas/config/tc-mips.c
--- binutils-2.15.90-20040301.macro/gas/config/tc-mips.c	2004-02-27 04:25:28.000000000 +0000
+++ binutils-2.15.90-20040301/gas/config/tc-mips.c	2004-03-02 13:35:28.000000000 +0000
@@ -158,6 +158,10 @@ struct mips_set_options
      Changed by `.set mips16' and `.set nomips16', and the -mips16 and
      -nomips16 command line options, and the default CPU.  */
   int mips16;
+  /* Non-zero if we should not emit "daddi" and "daddiu" instructions.
+     Changed by `.set daddi' and `.set nodaddi', the -mdaddi and
+     -mno-daddi command line options, and the default CPU.  */
+  int nodaddi;
   /* Non-zero if we should not reorder instructions.  Changed by `.set
      reorder' and `.set noreorder'.  */
   int noreorder;
@@ -201,7 +205,7 @@ static int file_mips_fp32 = -1;
 
 static struct mips_set_options mips_opts =
 {
-  ISA_UNKNOWN, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, CPU_UNKNOWN
+  ISA_UNKNOWN, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, CPU_UNKNOWN
 };
 
 /* These variables are filled in with the masks of registers used.
@@ -225,6 +229,10 @@ static int file_ase_mips3d;
    command line (e.g., by -march).  */
 static int file_ase_mdmx;
 
+/* True if -mno-daddi was passed or implied by arguments passed on the
+   command line (e.g., by -march).  */
+static int file_nodaddi;
+
 /* The argument of the -march= flag.  The architecture we are assembling.  */
 static int file_mips_arch = CPU_UNKNOWN;
 static const char *mips_arch_string;
@@ -325,6 +333,44 @@ static int mips_32bitmode = 0;
 /* True if CPU has a ror instruction.  */
 #define CPU_HAS_ROR(CPU)	CPU_HAS_DROR (CPU)
 
+/* Return true if the given CPU has at least one of the "daddi" bugs.
+
+   From "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0"
+   (also valid for MIPS R4000MC processors):
+
+   "23. R4000PC, R4000SC: The 64-bit instruction, daddi, fails to take
+	an overflow exception.
+
+	Workaround: There is no workaround for this problem."
+
+   and:
+
+   "41. R4000PC, R4000SC: Under the following condition, the DADDIU
+	instruction can produce an incorrect result.  If this
+	instruction generates a result value that would cause an
+	overflow condition to occur (even though this instruction does
+	not take an overflow exception) then the result value will be
+	correct in bits 0-31 but bit 31 will be replicated through
+	bits 32-63 (so it looks like a 32bit signextended value).  The
+	overflow condition is defined when the carries out of bits 62
+	and 63 differ (two's compliment overflow).
+
+	Workaround: There is no workaround for this problem."
+
+   Erratum #41 is also present in "MIPS R4400PC/SC Errata, Processor
+   Revision 1.0" (also valid for MIPS R4400MC processors) as erratum
+   #7.  The erratum is fixed in later revisions of R4400PC/SC/MC
+   processors (cf. "MIPS R4400PC/SC Errata, Processor Revision 2.0 &
+   3.0").
+
+   We default to working around the "daddi" bugs for both R4000 and
+   R4400, conservatively.  The default can be changed with the
+   -mno-daddi and -mdaddi command line options.  */
+#define CPU_HAS_DADDI_BUG(cpu)					\
+   (mips_matching_cpu_name_p (mips_arch_string, "r4000")	\
+    || mips_matching_cpu_name_p (mips_arch_string, "r4400")	\
+    )
+
 /* True if mflo and mfhi can be immediately followed by instructions
    which write to the HI and LO registers.
 
@@ -3839,30 +3885,55 @@ load_address (int reg, expressionS *ep, 
 	     It used not to be possible with the original relaxation code,
 	     but it could be done now.  */
 
-	  if (*used_at == 0 && ! mips_opts.noat)
+	  if (! mips_opts.nodaddi)
 	    {
-	      macro_build (ep, "lui", "t,u", reg, BFD_RELOC_MIPS_HIGHEST);
-	      macro_build (ep, "lui", "t,u", AT, BFD_RELOC_HI16_S);
-	      macro_build (ep, "daddiu", "t,r,j", reg, reg,
-			   BFD_RELOC_MIPS_HIGHER);
-	      macro_build (ep, "daddiu", "t,r,j", AT, AT, BFD_RELOC_LO16);
-	      macro_build (NULL, "dsll32", "d,w,<", reg, reg, 0);
-	      macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
-	      *used_at = 1;
+	      if (*used_at == 0 && ! mips_opts.noat)
+		{
+		  macro_build (ep, "lui", "t,u", reg, BFD_RELOC_MIPS_HIGHEST);
+		  macro_build (ep, "lui", "t,u", AT, BFD_RELOC_HI16_S);
+		  macro_build (ep, "daddiu", "t,r,j", reg, reg,
+			       BFD_RELOC_MIPS_HIGHER);
+		  macro_build (ep, "daddiu", "t,r,j", AT, AT, BFD_RELOC_LO16);
+		  macro_build (NULL, "dsll32", "d,w,<", reg, reg, 0);
+		  macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+		  *used_at = 1;
+		}
+	      else
+		{
+		  macro_build (ep, "lui", "t,u", reg, BFD_RELOC_MIPS_HIGHEST);
+		  macro_build (ep, "daddiu", "t,r,j", reg, reg,
+			       BFD_RELOC_MIPS_HIGHER);
+		  macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
+		  macro_build (ep, "daddiu", "t,r,j", reg, reg,
+			       BFD_RELOC_HI16_S);
+		  macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
+		  macro_build (ep, "daddiu", "t,r,j", reg, reg,
+			       BFD_RELOC_LO16);
+		}
 	    }
 	  else
 	    {
-	      macro_build (ep, "lui", "t,u", reg, BFD_RELOC_MIPS_HIGHEST);
-	      macro_build (ep, "daddiu", "t,r,j", reg, reg,
-			   BFD_RELOC_MIPS_HIGHER);
-	      macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
-	      macro_build (ep, "daddiu", "t,r,j", reg, reg, BFD_RELOC_HI16_S);
-	      macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
-	      macro_build (ep, "daddiu", "t,r,j", reg, reg, BFD_RELOC_LO16);
+	      if (*used_at == 0)
+		{
+		  macro_build (ep, "lui", "t,u", reg, BFD_RELOC_MIPS_HIGHEST);
+		  macro_build (ep, "addiu", "t,r,j", reg, reg,
+			       BFD_RELOC_MIPS_HIGHER);
+		  macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
+		  macro_build (ep, "addiu", "t,r,j", AT, 0, BFD_RELOC_HI16_S);
+		  macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+		  macro_build (NULL, "dsll", "d,w,<", reg, reg, 16);
+		  macro_build (ep, "addiu", "t,r,j", AT, 0, BFD_RELOC_LO16);
+		  macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+		  *used_at = 1;
+		}
+	      else
+		as_bad (_("Macro needs a temporary register due to "
+			  "`nodaddi' while $at is already in use"));
 	    }
 	}
       else
 	{
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  if ((valueT) ep->X_add_number <= MAX_GPREL_OFFSET
 	      && ! nopic_need_relax (ep->X_add_symbol, 1))
 	    {
@@ -3906,8 +3977,18 @@ load_address (int reg, expressionS *ep, 
 	      if (ex.X_add_number < -0x8000 || ex.X_add_number >= 0x8000)
 		as_bad (_("PIC code offset overflow (max 16 signed bits)"));
 	      ex.X_op = O_constant;
-	      macro_build (&ex, ADDRESS_ADDI_INSN, "t,r,j",
-			   reg, reg, BFD_RELOC_LO16);
+	      if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+		macro_build (&ex, ADDRESS_ADDI_INSN, "t,r,j",
+			     reg, reg, BFD_RELOC_LO16);
+	      else if (*used_at == 0)
+		{
+		  macro_build (&ex, "addiu", "t,r,j", AT, 0, BFD_RELOC_LO16);
+		  macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+		  *used_at = 1;
+		}
+	      else
+		as_bad (_("Macro needs a temporary register due to "
+			  "`nodaddi' while $at is already in use"));
 	      ep->X_add_number = ex.X_add_number;
 	      relax_switch ();
 	    }
@@ -3918,6 +3999,7 @@ load_address (int reg, expressionS *ep, 
 	}
       else
 	{
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  ex.X_add_number = ep->X_add_number;
 	  ep->X_add_number = 0;
 	  macro_build (ep, ADDRESS_LOAD_INSN, "t,o(b)", reg,
@@ -3974,20 +4056,41 @@ load_address (int reg, expressionS *ep, 
 	  else if (ex.X_add_number)
 	    {
 	      ex.X_op = O_constant;
-	      macro_build (&ex, ADDRESS_ADDI_INSN, "t,r,j", reg, reg,
-			   BFD_RELOC_LO16);
+	      if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+		macro_build (&ex, ADDRESS_ADDI_INSN, "t,r,j", reg, reg,
+			     BFD_RELOC_LO16);
+	      else if (*used_at == 0)
+		{
+		  macro_build (&ex, "addiu", "t,r,j", AT, 0, BFD_RELOC_LO16);
+		  macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+		}
+	      else
+		as_bad (_("Macro needs a temporary register due to "
+			  "`nodaddi' while $at is already in use"));
 	    }
 
 	  ep->X_add_number = ex.X_add_number;
 	  relax_switch ();
 	  macro_build (ep, ADDRESS_LOAD_INSN, "t,o(b)", reg,
 		       BFD_RELOC_MIPS_GOT_PAGE, mips_gp_register);
-	  macro_build (ep, ADDRESS_ADDI_INSN, "t,r,j", reg, reg,
-		       BFD_RELOC_MIPS_GOT_OFST);
+	  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+	    macro_build (ep, ADDRESS_ADDI_INSN, "t,r,j", reg, reg,
+			 BFD_RELOC_MIPS_GOT_OFST);
+	  else if (*used_at == 0)
+	    {
+	      macro_build (ep, "addiu", "t,r,j", AT, 0,
+			   BFD_RELOC_MIPS_GOT_OFST);
+	      macro_build (NULL, "daddu", "d,v,t", reg, reg, AT);
+	      *used_at = 1;
+	    }
+	  else
+	    as_bad (_("Macro needs a temporary register due to "
+		      "`nodaddi' while $at is already in use"));
 	  relax_end ();
 	}
       else
 	{
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  ex.X_add_number = ep->X_add_number;
 	  ep->X_add_number = 0;
 	  relax_start (ep->X_add_symbol);
@@ -4027,8 +4130,20 @@ load_address (int reg, expressionS *ep, 
       /* We always do
 	   addiu	$reg,$gp,<sym>		(BFD_RELOC_GPREL16)
        */
-      macro_build (ep, ADDRESS_ADDI_INSN, "t,r,j",
-		   reg, mips_gp_register, BFD_RELOC_GPREL16);
+      if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+	macro_build (ep, ADDRESS_ADDI_INSN, "t,r,j",
+		     reg, mips_gp_register, BFD_RELOC_GPREL16);
+      else if (*used_at == 0)
+	{
+	  macro_build (ep, "addiu", "t,r,j", AT,
+		       0, BFD_RELOC_GPREL16);
+	  macro_build (NULL, "daddu", "d,v,t", reg,
+		       mips_gp_register, AT);
+	  *used_at = 1;
+	}
+      else
+	as_bad (_("Macro needs a temporary register due to "
+		  "`nodaddi' while $at is already in use"));
     }
   else
     abort ();
@@ -4213,12 +4328,26 @@ macro (struct mips_cl_insn *ip)
       s = "daddiu";
       s2 = "daddu";
     do_addi:
-      if (imm_expr.X_op == O_constant
-	  && imm_expr.X_add_number >= -0x8000
-	  && imm_expr.X_add_number < 0x8000)
-	{
-	  macro_build (&imm_expr, s, "t,r,j", treg, sreg, BFD_RELOC_LO16);
-	  return;
+      if ((imm_expr.X_op == O_constant
+	   && imm_expr.X_add_number >= -0x8000
+	   && imm_expr.X_add_number < 0x8000)
+	  || imm_expr.X_op == O_symbol)
+	{
+	  int reloc_type = BFD_RELOC_LO16;
+
+	  if (imm_expr.X_op == O_symbol)
+	    reloc_type = *imm_reloc;
+	  if (! dbl || ! mips_opts.nodaddi)
+	    {
+	      macro_build (&imm_expr, s, "t,r,j", treg, sreg, reloc_type);
+	      return;
+	    }
+	  else
+	    {
+	      macro_build (&imm_expr, "addiu", "t,r,j", AT, 0, reloc_type);
+	      macro_build (NULL, s2, "d,v,t", treg, sreg, AT);
+	      break;
+	    }
 	}
       load_register (AT, &imm_expr, dbl);
       macro_build (NULL, s2, "d,v,t", treg, sreg, AT);
@@ -4871,10 +5000,24 @@ macro (struct mips_cl_insn *ip)
 	  && offset_expr.X_add_number >= -0x8000
 	  && offset_expr.X_add_number < 0x8000)
 	{
-	  macro_build (&offset_expr,
-		       (dbl || HAVE_64BIT_ADDRESSES) ? "daddiu" : "addiu",
-		       "t,r,j", treg, sreg, BFD_RELOC_LO16);
-	  return;
+	  if (! ((dbl || HAVE_64BIT_ADDRESSES) && mips_opts.nodaddi))
+	    {
+	      macro_build (&offset_expr,
+			   (dbl || HAVE_64BIT_ADDRESSES) ? "daddiu" : "addiu",
+			   "t,r,j", treg, sreg, BFD_RELOC_LO16);
+	      return;
+	    }
+	  else if (sreg == 0)
+	    {
+	      load_register (treg, &offset_expr, dbl);
+	      return;
+	    }
+	  else
+	    {
+	      load_register (AT, &offset_expr, dbl);
+	      macro_build (NULL, "daddu", "d,v,t", treg, sreg, AT);
+	      break;
+	    }
 	}
 
       if (treg == breg)
@@ -4924,9 +5067,25 @@ macro (struct mips_cl_insn *ip)
 			   (dbl || HAVE_64BIT_ADDRESSES) ? "daddu" : "addu",
 			   "d,v,t", tempreg, tempreg, breg);
 	    }
-	  macro_build (&offset_expr,
-		       (dbl || HAVE_64BIT_ADDRESSES) ? "daddiu" : "addiu",
-		       "t,r,j", treg, tempreg, BFD_RELOC_PCREL_LO16);
+	  if (! (dbl || HAVE_64BIT_ADDRESSES) || ! mips_opts.nodaddi)
+	    macro_build (&offset_expr,
+			 (dbl || HAVE_64BIT_ADDRESSES) ? "daddiu" : "addiu",
+			 "t,r,j", treg, tempreg, BFD_RELOC_PCREL_LO16);
+	  else
+	    {
+	      /* If we are going to add in a base register, and the
+		 target register and the base register are the same,
+		 then we are using AT as a temporary register.  Since
+		 we want to load the constant into AT, we add our
+		 current AT and the register into the register now.  */
+	      if (treg == breg)
+		macro_build (NULL, "daddu", "d,v,t", treg, treg, tempreg);
+	      macro_build (&offset_expr, "addiu", "t,r,j", AT, 0,
+			   BFD_RELOC_PCREL_LO16);
+	      macro_build (NULL, "daddu", "d,v,t", treg, treg, AT);
+	      used_at = 1;
+	    }
+	
 	  if (! used_at)
 	    return;
 	  break;
@@ -4977,32 +5136,90 @@ macro (struct mips_cl_insn *ip)
 		 these macros.  It used not to be possible with the
 		 original relaxation code, but it could be done now.  */
 
-	      if (used_at == 0 && ! mips_opts.noat)
+	      if (! mips_opts.nodaddi)
 		{
-		  macro_build (&offset_expr, "lui", "t,u",
-			       tempreg, BFD_RELOC_MIPS_HIGHEST);
-		  macro_build (&offset_expr, "lui", "t,u",
-			       AT, BFD_RELOC_HI16_S);
-		  macro_build (&offset_expr, "daddiu", "t,r,j",
-			       tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
-		  macro_build (&offset_expr, "daddiu", "t,r,j",
-			       AT, AT, BFD_RELOC_LO16);
-		  macro_build (NULL, "dsll32", "d,w,<", tempreg, tempreg, 0);
-		  macro_build (NULL, "daddu", "d,v,t", tempreg, tempreg, AT);
-		  used_at = 1;
+		  if (used_at == 0 && ! mips_opts.noat)
+		    {
+		      macro_build (&offset_expr, "lui", "t,u",
+				   tempreg, BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "lui", "t,u",
+				   AT, BFD_RELOC_HI16_S);
+		      macro_build (&offset_expr, "daddiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (&offset_expr, "daddiu", "t,r,j",
+				   AT, AT, BFD_RELOC_LO16);
+		      macro_build (NULL, "dsll32", "d,w,<",
+				   tempreg, tempreg, 0);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   tempreg, tempreg, AT);
+		      used_at = 1;
+		    }
+		  else
+		    {
+		      macro_build (&offset_expr, "lui", "t,u",
+				   tempreg, BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "daddiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      macro_build (&offset_expr, "daddiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_HI16_S);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      macro_build (&offset_expr, "daddiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_LO16);
+		    }
 		}
 	      else
 		{
-		  macro_build (&offset_expr, "lui", "t,u",
-			       tempreg, BFD_RELOC_MIPS_HIGHEST);
-		  macro_build (&offset_expr, "daddiu", "t,r,j",
-			       tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
-		  macro_build (NULL, "dsll", "d,w,<", tempreg, tempreg, 16);
-		  macro_build (&offset_expr, "daddiu", "t,r,j",
-			       tempreg, tempreg, BFD_RELOC_HI16_S);
-		  macro_build (NULL, "dsll", "d,w,<", tempreg, tempreg, 16);
-		  macro_build (&offset_expr, "daddiu", "t,r,j",
-			       tempreg, tempreg, BFD_RELOC_LO16);
+		  if (used_at == 0)
+		    {
+		      macro_build (&offset_expr, "lui", "t,u",
+				   tempreg, BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   AT, 0, BFD_RELOC_HI16_S);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   tempreg, tempreg, AT);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   AT, 0, BFD_RELOC_LO16);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   tempreg, tempreg, AT);
+		      used_at = 1;
+		    }
+		  else
+		    {
+		      /* If we are going to add in a base register, and the
+			 target register and the base register are the same,
+			 then we are using AT as a temporary register.  Since
+			 we cannot use an additional register, we add the
+			 constant consecutively into the register now, and
+			 pretend we were not using a base register.  */
+		      macro_build (&offset_expr, "lui", "t,u",
+				   tempreg, BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   tempreg, tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (NULL, "dsll32", "d,w,<",
+				   tempreg, tempreg, 0);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   treg, tempreg, breg);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   tempreg, 0, BFD_RELOC_HI16_S);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   treg, treg, tempreg);
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   tempreg, 0, BFD_RELOC_LO16);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   treg, treg, tempreg);
+		      breg = 0;
+		    }
 		}
 	    }
 	  else
@@ -5056,6 +5273,7 @@ macro (struct mips_cl_insn *ip)
 	     addiu instruction.
 	   */
 
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  if (offset_expr.X_add_number == 0)
 	    {
 	      if (breg == 0 && (call || tempreg == PIC_CALL_REG))
@@ -5154,8 +5372,31 @@ macro (struct mips_cl_insn *ip)
 	      if (expr1.X_add_number >= -0x8000
 		  && expr1.X_add_number < 0x8000)
 		{
-		  macro_build (&expr1, ADDRESS_ADDI_INSN, "t,r,j",
-			       tempreg, tempreg, BFD_RELOC_LO16);
+		  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+		    macro_build (&expr1, ADDRESS_ADDI_INSN, "t,r,j",
+				 tempreg, tempreg, BFD_RELOC_LO16);
+		  else
+		    {
+		      /* If we are going to add in a base register, and the
+			 target register and the base register are the same,
+			 then we are using AT as a temporary register.  Since
+			 we want to load the constant into AT, we add our
+			 current AT (from the global offset table) and the
+			 register into the register now, and pretend we were
+			 not using a base register.  */
+		      if (treg == breg)
+			{
+			  macro_build (NULL, ADDRESS_ADD_INSN, "d,v,t",
+				       treg, AT, breg);
+			  breg = 0;
+			  tempreg = treg;
+			}
+		      macro_build (&expr1, "addiu", "t,r,j",
+				   AT, 0, BFD_RELOC_LO16);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   tempreg, tempreg, AT);
+		      used_at = 1;
+		    }
 		}
 	      else if (IS_SEXT_32BIT_NUM (expr1.X_add_number + 0x8000))
 		{
@@ -5266,6 +5507,7 @@ macro (struct mips_cl_insn *ip)
 	       addu	$tempreg,$tempreg,$at
 	  */
 
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  expr1.X_add_number = offset_expr.X_add_number;
 	  offset_expr.X_add_number = 0;
 	  relax_start (offset_expr.X_add_symbol);
@@ -5433,8 +5675,31 @@ macro (struct mips_cl_insn *ip)
 	  else if (expr1.X_add_number >= -0x8000
 		   && expr1.X_add_number < 0x8000)
 	    {
-	      macro_build (&expr1, ADDRESS_ADDI_INSN, "t,r,j",
-			   tempreg, tempreg, BFD_RELOC_LO16);
+	      if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+		macro_build (&expr1, ADDRESS_ADDI_INSN, "t,r,j",
+			     tempreg, tempreg, BFD_RELOC_LO16);
+	      else
+		{
+		  /* If we are going to add in a base register, and the
+		     target register and the base register are the same,
+		     then we are using AT as a temporary register.  Since
+		     we want to load the constant into AT, we add our
+		     current AT (from the global offset table) and the
+		     register into the register now, and pretend we were
+		     not using a base register.  */
+		  if (treg == breg)
+		    {
+		      macro_build (NULL, ADDRESS_ADD_INSN, "d,v,t",
+				   treg, AT, breg);
+		      breg = 0;
+		      tempreg = treg;
+		    }
+		  macro_build (&expr1, "addiu", "t,r,j",
+			       AT, 0, BFD_RELOC_LO16);
+		  macro_build (NULL, "daddu", "d,v,t", tempreg, tempreg, AT);
+
+		  used_at = 1;
+		}
 	    }
 	  else if (IS_SEXT_32BIT_NUM (expr1.X_add_number + 0x8000))
 	    {
@@ -5470,14 +5735,38 @@ macro (struct mips_cl_insn *ip)
 	  offset_expr.X_add_number = expr1.X_add_number;
 	  macro_build (&offset_expr, ADDRESS_LOAD_INSN, "t,o(b)", tempreg,
 		       BFD_RELOC_MIPS_GOT_PAGE, mips_gp_register);
-	  macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", tempreg,
-		       tempreg, BFD_RELOC_MIPS_GOT_OFST);
-	  if (add_breg_early)
+	  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
 	    {
-	      macro_build (NULL, ADDRESS_ADD_INSN, "d,v,t",
-			   treg, tempreg, breg);
-	      breg = 0;
-	      tempreg = treg;
+	      macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", tempreg,
+			   tempreg, BFD_RELOC_MIPS_GOT_OFST);
+	      if (add_breg_early)
+		{
+		  macro_build (NULL, ADDRESS_ADD_INSN, "d,v,t",
+			       treg, tempreg, breg);
+		  breg = 0;
+		  tempreg = treg;
+		}
+	    }
+	  else
+	    {
+	      /* If we are going to add in a base register, and the
+		 target register and the base register are the same,
+		 then we are using AT as a temporary register.  Since
+		 we want to load the constant into AT, we add our
+		 current AT (from the global offset table) and the
+		 register into the register now, and pretend we were
+		 not using a base register.  */
+	      if (add_breg_early)
+		{
+		  macro_build (NULL, ADDRESS_ADD_INSN, "d,v,t",
+			       treg, tempreg, breg);
+		  breg = 0;
+		  tempreg = treg;
+		}
+	      macro_build (&offset_expr, "addiu", "t,r,j",
+			   AT, 0, BFD_RELOC_MIPS_GOT_OFST);
+	      macro_build (NULL, "daddu", "d,v,t", treg, treg, AT);
+	      used_at = 1;
 	    }
 	  relax_end ();
 	}
@@ -5486,8 +5775,16 @@ macro (struct mips_cl_insn *ip)
 	  /* We use
 	       addiu	$tempreg,$gp,<sym>	(BFD_RELOC_GPREL16)
 	     */
-	  macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", tempreg,
-		       mips_gp_register, BFD_RELOC_GPREL16);
+	  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+	    macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", tempreg,
+			 mips_gp_register, BFD_RELOC_GPREL16);
+	  else
+	    {
+	      macro_build (&offset_expr, "addiu", "t,r,j", tempreg,
+			   0, BFD_RELOC_GPREL16);
+	      macro_build (NULL, "daddu", "d,v,t", tempreg,
+			   tempreg, mips_gp_register);
+	    }
 	}
       else
 	abort ();
@@ -5627,9 +5924,18 @@ macro (struct mips_cl_insn *ip)
 		  macro_build (&offset_expr, ADDRESS_LOAD_INSN, "t,o(b)",
 			       PIC_CALL_REG, BFD_RELOC_MIPS_GOT_PAGE,
 			       mips_gp_register);
-		  macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j",
-			       PIC_CALL_REG, PIC_CALL_REG,
-			       BFD_RELOC_MIPS_GOT_OFST);
+		  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+		    macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j",
+				 PIC_CALL_REG, PIC_CALL_REG,
+				 BFD_RELOC_MIPS_GOT_OFST);
+		  else
+		    {
+		      macro_build (&offset_expr, "addiu", "t,r,j",
+				   AT, 0, BFD_RELOC_MIPS_GOT_OFST);
+		      macro_build (NULL, "daddu", "d,v,t",
+				   PIC_CALL_REG, PIC_CALL_REG, AT);
+		      used_at = 1;
+		    }
 		  relax_end ();
 		}
 
@@ -5637,6 +5943,7 @@ macro (struct mips_cl_insn *ip)
 	    }
 	  else
 	    {
+	      assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	      relax_start (offset_expr.X_add_symbol);
 	      if (! mips_big_got)
 		{
@@ -5708,7 +6015,10 @@ macro (struct mips_cl_insn *ip)
       else
 	abort ();
 
-      return;
+      if (! used_at)
+	return;
+
+      break;
 
     case M_LB_AB:
       s = "lb";
@@ -6041,40 +6351,76 @@ macro (struct mips_cl_insn *ip)
 		 these macros.  It used not to be possible with the
 		 original relaxation code, but it could be done now.  */
 
-	      if (used_at == 0 && ! mips_opts.noat)
+	      if (! mips_opts.nodaddi)
 		{
-		  macro_build (&offset_expr, "lui", "t,u", tempreg,
-			       BFD_RELOC_MIPS_HIGHEST);
-		  macro_build (&offset_expr, "lui", "t,u", AT,
-			       BFD_RELOC_HI16_S);
-		  macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
-			       tempreg, BFD_RELOC_MIPS_HIGHER);
-		  if (breg != 0)
-		    macro_build (NULL, "daddu", "d,v,t", AT, AT, breg);
-		  macro_build (NULL, "dsll32", "d,w,<", tempreg, tempreg, 0);
-		  macro_build (NULL, "daddu", "d,v,t", tempreg, tempreg, AT);
-		  macro_build (&offset_expr, s, fmt, treg, BFD_RELOC_LO16,
-			       tempreg);
-		  used_at = 1;
+		  if (used_at == 0 && ! mips_opts.noat)
+		    {
+		      macro_build (&offset_expr, "lui", "t,u", tempreg,
+				   BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "lui", "t,u", AT,
+				   BFD_RELOC_HI16_S);
+		      macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
+				   tempreg, BFD_RELOC_MIPS_HIGHER);
+		      if (breg != 0)
+			macro_build (NULL, "daddu", "d,v,t", AT, AT, breg);
+		      macro_build (NULL, "dsll32", "d,w,<", tempreg,
+				   tempreg, 0);
+		      macro_build (NULL, "daddu", "d,v,t", tempreg,
+				   tempreg, AT);
+		      macro_build (&offset_expr, s, fmt, treg, BFD_RELOC_LO16,
+				   tempreg);
+		      used_at = 1;
+		    }
+		  else
+		    {
+		      macro_build (&offset_expr, "lui", "t,u", tempreg,
+				   BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
+				   tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (NULL, "dsll", "d,w,<", tempreg,
+				   tempreg, 16);
+		      macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
+				   tempreg, BFD_RELOC_HI16_S);
+		      macro_build (NULL, "dsll", "d,w,<", tempreg,
+				   tempreg, 16);
+		      if (breg != 0)
+			macro_build (NULL, "daddu", "d,v,t",
+				     tempreg, tempreg, breg);
+		      macro_build (&offset_expr, s, fmt, treg,
+				   BFD_RELOC_LO16, tempreg);
+		    }
 		}
 	      else
 		{
-		  macro_build (&offset_expr, "lui", "t,u", tempreg,
-			       BFD_RELOC_MIPS_HIGHEST);
-		  macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
-			       tempreg, BFD_RELOC_MIPS_HIGHER);
-		  macro_build (NULL, "dsll", "d,w,<", tempreg, tempreg, 16);
-		  macro_build (&offset_expr, "daddiu", "t,r,j", tempreg,
-			       tempreg, BFD_RELOC_HI16_S);
-		  macro_build (NULL, "dsll", "d,w,<", tempreg, tempreg, 16);
-		  if (breg != 0)
-		    macro_build (NULL, "daddu", "d,v,t",
-				 tempreg, tempreg, breg);
-		  macro_build (&offset_expr, s, fmt, treg,
-			       BFD_RELOC_LO16, tempreg);
+		  if (used_at == 0)
+		    {
+		      macro_build (&offset_expr, "lui", "t,u", tempreg,
+				   BFD_RELOC_MIPS_HIGHEST);
+		      macro_build (&offset_expr, "addiu", "t,r,j", tempreg,
+				   tempreg, BFD_RELOC_MIPS_HIGHER);
+		      macro_build (NULL, "dsll", "d,w,<", tempreg,
+				   tempreg, 16);
+		      macro_build (&offset_expr, "addiu", "t,r,j", AT,
+				   0, BFD_RELOC_HI16_S);
+		      macro_build (NULL, "daddu", "d,v,t", tempreg,
+				   tempreg, AT);
+		      macro_build (NULL, "dsll", "d,w,<",
+				   tempreg, tempreg, 16);
+		      if (breg != 0)
+			macro_build (NULL, "daddu", "d,v,t",
+				     tempreg, tempreg, breg);
+		      macro_build (&offset_expr, s, fmt, treg,
+				   BFD_RELOC_LO16, tempreg);
+		      used_at = 1;
+		    }
+		  else
+		    as_bad (_("Macro needs a temporary register due to "
+			      "`nodaddi' while $at is already in use"));
 		}
 
-	      return;
+	      if (! used_at)
+		return;
+	      break;
 	    }
 
 	  if (offset_expr.X_op == O_constant
@@ -6159,6 +6505,7 @@ macro (struct mips_cl_insn *ip)
 
 	      break;
 	    }
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  expr1.X_add_number = offset_expr.X_add_number;
 	  offset_expr.X_add_number = 0;
 	  if (expr1.X_add_number < -0x8000
@@ -6197,6 +6544,7 @@ macro (struct mips_cl_insn *ip)
 	     16 bits, because we have no way to load the upper 16 bits
 	     (actually, we could handle them for the subset of cases
 	     in which we are not using $at).  */
+	  assert (HAVE_32BIT_ADDRESSES);		/* nodaddi safety */
 	  assert (offset_expr.X_op == O_symbol);
 	  expr1.X_add_number = offset_expr.X_add_number;
 	  offset_expr.X_add_number = 0;
@@ -6379,8 +6727,15 @@ macro (struct mips_cl_insn *ip)
 	{
 	  /* For embedded PIC we pick up the entire address off $gp in
 	     a single instruction.  */
-	  macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", AT,
-		       mips_gp_register, BFD_RELOC_GPREL16);
+	  if (HAVE_32BIT_ADDRESSES || ! mips_opts.nodaddi)
+	    macro_build (&offset_expr, ADDRESS_ADDI_INSN, "t,r,j", AT,
+			 mips_gp_register, BFD_RELOC_GPREL16);
+	  else
+	    {
+	      macro_build (&offset_expr, "addiu", "t,r,j", AT,
+			   0, BFD_RELOC_GPREL16);
+	      macro_build (NULL, "daddu", "d,v,t", AT, AT, mips_gp_register);
+	    }
 	  offset_expr.X_op = O_constant;
 	  offset_expr.X_add_number = 0;
 	}
@@ -7253,7 +7608,8 @@ macro2 (struct mips_cl_insn *ip)
 	}
       else if (imm_expr.X_op == O_constant
 	       && imm_expr.X_add_number > -0x8000
-	       && imm_expr.X_add_number < 0)
+	       && imm_expr.X_add_number < 0
+	       && ! (HAVE_64BIT_GPRS && mips_opts.nodaddi))
 	{
 	  imm_expr.X_add_number = -imm_expr.X_add_number;
 	  macro_build (&imm_expr, HAVE_32BIT_GPRS ? "addiu" : "daddiu",
@@ -7390,8 +7746,7 @@ macro2 (struct mips_cl_insn *ip)
 	{
 	  as_warn (_("Instruction %s: result is always true"),
 		   ip->insn_mo->name);
-	  macro_build (&expr1, HAVE_32BIT_GPRS ? "addiu" : "daddiu", "t,r,j",
-		       dreg, 0, BFD_RELOC_LO16);
+	  macro_build (&expr1, "addiu", "t,r,j", dreg, 0, BFD_RELOC_LO16);
 	  return;
 	}
       if (imm_expr.X_op == O_constant
@@ -7403,7 +7758,8 @@ macro2 (struct mips_cl_insn *ip)
 	}
       else if (imm_expr.X_op == O_constant
 	       && imm_expr.X_add_number > -0x8000
-	       && imm_expr.X_add_number < 0)
+	       && imm_expr.X_add_number < 0
+	       && ! (HAVE_64BIT_GPRS && mips_opts.nodaddi))
 	{
 	  imm_expr.X_add_number = -imm_expr.X_add_number;
 	  macro_build (&imm_expr, HAVE_32BIT_GPRS ? "addiu" : "daddiu",
@@ -7426,7 +7782,8 @@ macro2 (struct mips_cl_insn *ip)
     case M_SUB_I:
       if (imm_expr.X_op == O_constant
 	  && imm_expr.X_add_number > -0x8000
-	  && imm_expr.X_add_number <= 0x8000)
+	  && imm_expr.X_add_number <= 0x8000
+	  && ! (dbl && mips_opts.nodaddi))
 	{
 	  imm_expr.X_add_number = -imm_expr.X_add_number;
 	  macro_build (&imm_expr, dbl ? "daddi" : "addi", "t,r,j",
@@ -7442,7 +7799,8 @@ macro2 (struct mips_cl_insn *ip)
     case M_SUBU_I:
       if (imm_expr.X_op == O_constant
 	  && imm_expr.X_add_number > -0x8000
-	  && imm_expr.X_add_number <= 0x8000)
+	  && imm_expr.X_add_number <= 0x8000
+	  && ! (dbl && mips_opts.nodaddi))
 	{
 	  imm_expr.X_add_number = -imm_expr.X_add_number;
 	  macro_build (&imm_expr, dbl ? "daddiu" : "addiu", "t,r,j",
@@ -10258,9 +10616,13 @@ struct option md_longopts[] =
 #define OPTION_NO_FIX_VR4122 (OPTION_FIX_BASE + 3)
   {"mfix-vr4122-bugs",    no_argument, NULL, OPTION_FIX_VR4122},
   {"no-mfix-vr4122-bugs", no_argument, NULL, OPTION_NO_FIX_VR4122},
+#define OPTION_MNO_DADDI (OPTION_FIX_BASE + 4)
+#define OPTION_MDADDI (OPTION_FIX_BASE + 5)
+  {"mno-daddi", no_argument, NULL, OPTION_MNO_DADDI},
+  {"mdaddi", no_argument, NULL, OPTION_MDADDI},
 
   /* Miscellaneous options.  */
-#define OPTION_MISC_BASE (OPTION_FIX_BASE + 4)
+#define OPTION_MISC_BASE (OPTION_FIX_BASE + 6)
 #define OPTION_MEMBEDDED_PIC (OPTION_MISC_BASE + 0)
   {"membedded-pic", no_argument, NULL, OPTION_MEMBEDDED_PIC},
 #define OPTION_TRAP (OPTION_MISC_BASE + 1)
@@ -10489,6 +10851,14 @@ md_parse_option (int c, char *arg)
       mips_opts.ase_mips3d = 0;
       break;
 
+    case OPTION_MNO_DADDI:
+      mips_opts.nodaddi = 1;
+      break;
+
+    case OPTION_MDADDI:
+      mips_opts.nodaddi = 0;
+      break;
+
     case OPTION_MEMBEDDED_PIC:
       mips_pic = EMBEDDED_PIC;
       if (USE_GLOBAL_POINTER_OPT && g_switch_seen)
@@ -10800,6 +11170,9 @@ mips_after_parse_args (void)
 
   /* End of GCC-shared inference code.  */
 
+  if (mips_opts.nodaddi == -1)
+    mips_opts.nodaddi = (CPU_HAS_DADDI_BUG (file_mips_arch)) ? 1 : 0;
+
   /* This flag is set when we have a 64-bit capable CPU but use only
      32-bit wide registers.  Note that EABI does not use it.  */
   if (ISA_HAS_64BIT_REGS (mips_opts.isa)
@@ -10823,6 +11196,7 @@ mips_after_parse_args (void)
   file_ase_mips16 = mips_opts.mips16;
   file_ase_mips3d = mips_opts.ase_mips3d;
   file_ase_mdmx = mips_opts.ase_mdmx;
+  file_nodaddi = mips_opts.nodaddi;
   mips_opts.gp32 = file_mips_gp32;
   mips_opts.fp32 = file_mips_fp32;
 
@@ -11969,6 +12343,10 @@ s_mipsset (int x ATTRIBUTE_UNUSED)
     mips_opts.noautoextend = 0;
   else if (strcmp (name, "noautoextend") == 0)
     mips_opts.noautoextend = 1;
+  else if (strcmp (name, "daddi") == 0)
+    mips_opts.nodaddi = 0;
+  else if (strcmp (name, "nodaddi") == 0)
+    mips_opts.nodaddi = 1;
   else if (strcmp (name, "push") == 0)
     {
       struct mips_option_stack *s;
@@ -13244,6 +13622,11 @@ md_convert_frag (bfd *abfd ATTRIBUTE_UNU
 	{
 	  int i;
 
+	  if (HAVE_64BIT_ADDRESSES && mips_opts.nodaddi)
+	    as_bad_where (fragp->fr_file, fragp->fr_line,
+			  _("Cannot relax out-of-range branch "
+			    "due to `nodaddi'"));
+
 	  as_warn_where (fragp->fr_file, fragp->fr_line,
 			 _("relaxed out-of-range branch into a jump"));
 
@@ -13387,6 +13770,7 @@ md_convert_frag (bfd *abfd ATTRIBUTE_UNU
 		}
 
 	      /* d/addiu $at, $at, <sym>  R_MIPS_LO16 */
+	      assert (! mips_opts.nodaddi || ! HAVE_64BIT_ADDRESSES);
 	      insn = HAVE_64BIT_ADDRESSES ? 0x64210000 : 0x24210000;
 
 	      fixp = fix_new_exp (fragp, buf - (bfd_byte *)fragp->fr_literal,
@@ -14370,6 +14754,10 @@ MIPS options:\n\
   fputc ('\n', stream);
 
   fprintf (stream, _("\
+-mdaddi			generate daddi and daddiu instructions\n\
+-mno-daddi		do not generate daddi and daddiu instructions\n\
+			[default for -march=4000 and -march=4400]\n"));
+  fprintf (stream, _("\
 -mips16			generate mips16 instructions\n\
 -no-mips16		do not generate mips16 instructions\n"));
   fprintf (stream, _("\
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/doc/as.texinfo binutils-2.15.90-20040301/gas/doc/as.texinfo
--- binutils-2.15.90-20040301.macro/gas/doc/as.texinfo	2004-01-09 04:25:19.000000000 +0000
+++ binutils-2.15.90-20040301/gas/doc/as.texinfo	2004-03-02 11:20:39.000000000 +0000
@@ -359,7 +359,7 @@ gcc(1), ld(1), and the Info entries for 
    [@b{-mips64}] [@b{-mips64r2}]
    [@b{-construct-floats}] [@b{-no-construct-floats}]
    [@b{-trap}] [@b{-no-break}] [@b{-break}] [@b{-no-trap}]
-   [@b{-mfix7000}] [@b{-mno-fix7000}]
+   [@b{-mfix7000}] [@b{-mno-fix7000}] [@b{-mno-daddi}] [@b{-mdaddi}]
    [@b{-mips16}] [@b{-no-mips16}]
    [@b{-mips3d}] [@b{-no-mips3d}]
    [@b{-mdmx}] [@b{-no-mdmx}]
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/doc/c-mips.texi binutils-2.15.90-20040301/gas/doc/c-mips.texi
--- binutils-2.15.90-20040301.macro/gas/doc/c-mips.texi	2004-01-09 04:25:19.000000000 +0000
+++ binutils-2.15.90-20040301/gas/doc/c-mips.texi	2004-03-02 12:15:07.000000000 +0000
@@ -128,6 +128,17 @@ Insert @samp{nop} instructions to avoid 
 the vr4122 core.  This option is intended to be used on GCC-generated
 code: it is not designed to catch errors in hand-written assembler code.
 
+@item -mno-daddi
+@itemx -mdaddi
+Disable the @samp{daddi} and @samp{daddiu} hardware instructions and treat
+them as macros, each expanding to a @samp{li} into the @code{at} register
+followed by a @samp{dadd} or a @samp{daddu}, respectively, into the target.
+Macros that operate on addresses and would normally use @samp{daddiu} are
+expanded without it as well.  @samp{-mno-daddi} is the default for the
+@sc{r4000} and @sc{r4400} processors, where these instructions may give
+erroneous results.  The corresponding @samp{.set nodaddi} and
+@samp{.set daddi} directives are also available.
+
 @item -m4010
 @itemx -no-m4010
 Generate code for the LSI @sc{r4010} chip.  This tells the assembler to
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel-got-n64.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel-got-n64.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel-got-n64.d	2004-02-03 04:25:38.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel-got-n64.d	2004-03-02 09:53:00.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: -dr --prefix-addresses --show-raw-insn
 #name: MIPS ELF got reloc n64
-#as: -64 -KPIC
+#as: -64 -KPIC -mdaddi
 
 .*: +file format elf64-.*mips.*
 
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel-xgot-n64.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel-xgot-n64.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel-xgot-n64.d	2004-02-03 04:25:38.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel-xgot-n64.d	2004-03-02 09:53:00.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: -dr --prefix-addresses --show-raw-insn
 #name: MIPS ELF xgot reloc n64
-#as: -64 -KPIC -xgot
+#as: -64 -KPIC -xgot -mdaddi
 #source: elf-rel-got-n64.s
 
 .*: +file format elf64-.*mips.*
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel11.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel11.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/elf-rel11.d	2003-02-02 19:37:20.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/elf-rel11.d	2003-09-06 17:15:31.000000000 +0000
@@ -1,4 +1,4 @@
-#as: -march=mips3 -mabi=64
+#as: -march=mips3 -mabi=64 -mdaddi
 #readelf: --relocs
 #name: MIPS ELF reloc 11
 
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic2.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic2.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic2.d	2003-05-08 03:25:31.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic2.d	2003-11-09 17:42:01.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: --prefix-addresses -dr --show-raw-insn -mmips:4000
 #name: MIPS empic2
-#as: -mabi=o64 -membedded-pic -mips3
+#as: -mabi=o64 -membedded-pic -mips3 -mdaddi
 
 # Check assembly of and relocs for -membedded-pic la, lw, ld, sw, sd macros.
 
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_e.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_e.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_e.d	2003-05-08 03:25:31.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_e.d	2003-11-09 17:42:01.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: --prefix-addresses -dr --show-raw-insn -mmips:4000
 #name: MIPS empic3 (external)
-#as: -mabi=o64 -membedded-pic -mips3
+#as: -mabi=o64 -membedded-pic -mips3 -mdaddi
 
 # Check PC-relative HI/LO relocs relocs for -membedded-pic when HI and
 # LO are split over a 32K boundary.
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_g1.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_g1.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_g1.d	2003-05-08 03:25:31.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_g1.d	2003-11-09 17:42:01.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: --prefix-addresses -dr --show-raw-insn -mmips:4000
 #name: MIPS empic3 (global, negative)
-#as: -mabi=o64 -membedded-pic -mips3
+#as: -mabi=o64 -membedded-pic -mips3 -mdaddi
 
 # Check PC-relative HI/LO relocs relocs for -membedded-pic when HI and
 # LO are split over a 32K boundary.
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_g2.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_g2.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/empic3_g2.d	2003-05-08 03:25:31.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/empic3_g2.d	2003-11-09 17:42:01.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: --prefix-addresses -dr --show-raw-insn -mmips:4000
 #name: MIPS empic3 (global, positive)
-#as: -mabi=o64 -membedded-pic -mips3
+#as: -mabi=o64 -membedded-pic -mips3 -mdaddi
 
 # Check PC-relative HI/LO relocs relocs for -membedded-pic when HI and
 # LO are split over a 32K boundary.
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/telempic.d binutils-2.15.90-20040301/gas/testsuite/gas/mips/telempic.d
--- binutils-2.15.90-20040301.macro/gas/testsuite/gas/mips/telempic.d	2003-09-04 03:25:37.000000000 +0000
+++ binutils-2.15.90-20040301/gas/testsuite/gas/mips/telempic.d	2003-11-09 17:42:01.000000000 +0000
@@ -1,6 +1,6 @@
 #objdump: -rst -mmips:4000
 #name: MIPS empic
-#as: -mabi=o64 -membedded-pic -mips3
+#as: -mabi=o64 -membedded-pic -mips3 -mdaddi
 #source: empic.s
 #stderr: empic.l
 
diff -up --recursive --new-file binutils-2.15.90-20040301.macro/opcodes/mips-opc.c binutils-2.15.90-20040301/opcodes/mips-opc.c
--- binutils-2.15.90-20040301.macro/opcodes/mips-opc.c	2003-11-19 04:25:23.000000000 +0000
+++ binutils-2.15.90-20040301/opcodes/mips-opc.c	2004-01-29 18:28:50.000000000 +0000
@@ -471,7 +471,9 @@ const struct mips_opcode mips_builtin_op
 {"dabs",    "d,v",	0,    (int) M_DABS,	INSN_MACRO,		I3	},
 {"dadd",    "d,v,t",	0x0000002c, 0xfc0007ff, WR_d|RD_s|RD_t,		I3	},
 {"dadd",    "t,r,I",	0,    (int) M_DADD_I,	INSN_MACRO,		I3	},
+{"daddi",   "t,r,j",	0,    (int) M_DADD_I,	INSN_MACRO,		I3	},
 {"daddi",   "t,r,j",	0x60000000, 0xfc000000, WR_t|RD_s,		I3	},
+{"daddiu",  "t,r,j",	0,    (int) M_DADDU_I,	INSN_MACRO,		I3	},
 {"daddiu",  "t,r,j",	0x64000000, 0xfc000000, WR_t|RD_s,		I3	},
 {"daddu",   "d,v,t",	0x0000002d, 0xfc0007ff, WR_d|RD_s|RD_t,		I3	},
 {"daddu",   "t,r,I",	0,    (int) M_DADDU_I,	INSN_MACRO,		I3	},

