[PATCH] D16220: [mips] Fix RetCC_MipsN to promote types smaller than i64 to GPR-width.

Fri Jan 29 09:30:05 PST 2016

dsanders requested changes to this revision.
dsanders added a comment.
This revision now requires changes to proceed.

Well caught. It turns out we've been falling through to the O32 handler in some cases.

There's a few changes that don't look right to me and I've commented on these below. I'd like to explain one set of the comments in advance. Up until select-flt.ll I've pointed out a number of redundant instructions. These look like regressions but it's possible that we've always emitted some of them and have just not mentioned them in the test. Could you check whether these are regressions or not?


================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6934-6935
@@ +6933,4 @@
+
+    if ((N00VTBits - N00NSB + 1) == EVTBits)
+      return N0;
+  }
----------------
I think we can do better here. We really want to know whether the sign bit that sext_in_reg will replicate is any one of the sign-bits that ComputeNumSignBits reports so I believe the condition should be:
  (N00VTBits - N00NSB) < EVTBits

In any case I'd suggest posting this as a separate patch without the '[mips]' tag since this affects all targets.

================
Comment at: lib/Target/Mips/MipsCallingConv.td:183-185
@@ -182,2 +182,5 @@
 def RetCC_MipsN : CallingConv<[
+  // Promote i1/i8/i16/i32 arguments to i64
+  CCIfType<[i1, i8, i16, i32], CCPromoteToType<i64>>,
+
   // f128 needs to be handled similarly to f32 and f64. However, f128 is not
----------------
Putting this here prevents the big-endian CCIfInReg path on line 205 from occurring. As a result small structs won't return correctly.

I think it belongs on line 206 at which point the little-endian CCIfInReg case could be folded into it

================
Comment at: lib/Target/Mips/MipsCallingConv.td:344
@@ -340,3 +343,3 @@
   CCIfSubtarget<"isABI_N64()", CCDelegateTo<RetCC_MipsN>>,
   CCDelegateTo<RetCC_MipsO32>
 ]>;
----------------
Could you also make this conditional on isABI_O32()? We'd have caught this sooner if we hadn't fallen out into the O32 implementation.

================
Comment at: lib/Target/Mips/MipsISelLowering.cpp:825-826
@@ +824,4 @@
+
+  // fold (asext (trunc (asext x) -> (trunc (asext x)) with a smaller type for
+  // asext.
+  if (N0.getOpcode() == ISD::TRUNCATE &&
----------------
What is 'asext'? I assume it's 'AssertSExt' but it's confusing when we already have 'aext' for any extend. Also, could you clarify which 'asext' you're referring to at the end of the sentence?

================
Comment at: lib/Target/Mips/MipsISelLowering.cpp:829-830
@@ +828,4 @@
+      N0.getOperand(0).getOpcode() == ISD::AssertSext) {
+    SDValue OldAssertSext = N0.getOperand(0);
+    EVT OldVT = cast<VTSDNode>(OldAssertSext->getOperand(1))->getVT();
+
----------------
Optional: Would 'Outer' make more sense than 'Old'?

================
Comment at: lib/Target/Mips/MipsISelLowering.cpp:836
@@ +835,3 @@
+                      OldAssertSext.getOperand(0), DAG.getValueType(NewVT));
+      return DAG.getSExtOrTrunc(NewAssertSext, SDLoc(N), N->getValueType(0));
+    }
----------------
Isn't it always a trunc because NewVT has fewer bits than OldVT?

================
Comment at: test/CodeGen/Mips/atomic.ll:100
@@ -99,3 +99,3 @@
 
-define signext i8 @AtomicLoadAdd8(i8 signext %incr) nounwind {
+define i8 @AtomicLoadAdd8(i8 signext %incr) nounwind {
 entry:
----------------
Why did you remove the signext's from these tests?

================
Comment at: test/CodeGen/Mips/llvm-ir/ashr.ll:43-56
@@ -54,5 +42,16 @@
 
-  ; FIXME: The andi instruction is redundant.
-  ; ALL:        andi    $[[T0:[0-9]+]], $5, 255
-  ; ALL:        srav    $2, $4, $[[T0]]
+  ; GP32-NOT-R2-R6:        andi      $[[T0:[0-9]+]], $5, 255
+  ; GP32-NOT-R2-R6:        srav      $2, $4, $[[T0]]
+
+  ; GP32-R2-R6:       andi      $[[T0:[0-9]+]], $5, 255
+  ; GP32-R2-R6:       srav      $2, $4, $[[T0]]
+
+  ; GP64-NOT-R2-R6:        andi      $[[T0:[0-9]+]], $5, 255
+  ; GP64-NOT-R2-R6:        srav      $[[T1:[0-9]+]], $4, $[[T0]]
+  ; GP64-NOT-R2-R6:        dsll      $[[T2:[0-9]+]], $[[T1]], 56
+  ; GP64-NOT-R2-R6:        dsra      $2, $[[T2]], 56
+
+  ; GP64-R2-R6:       andi      $[[T0:[0-9]+]], $5, 255
+  ; GP64-R2-R6:       srav      $[[T1:[0-9]+]], $4, $[[T0]]
+  ; GP64-R2-R6:       seb       $2, $[[T1]]
 
----------------
indentation. Similarly below.

================
Comment at: test/CodeGen/Mips/llvm-ir/ashr.ll:43-56
@@ -54,5 +42,16 @@
 
-  ; FIXME: The andi instruction is redundant.
-  ; ALL:        andi    $[[T0:[0-9]+]], $5, 255
-  ; ALL:        srav    $2, $4, $[[T0]]
+  ; GP32-NOT-R2-R6:        andi      $[[T0:[0-9]+]], $5, 255
+  ; GP32-NOT-R2-R6:        srav      $2, $4, $[[T0]]
+
+  ; GP32-R2-R6:       andi      $[[T0:[0-9]+]], $5, 255
+  ; GP32-R2-R6:       srav      $2, $4, $[[T0]]
+
+  ; GP64-NOT-R2-R6:        andi      $[[T0:[0-9]+]], $5, 255
+  ; GP64-NOT-R2-R6:        srav      $[[T1:[0-9]+]], $4, $[[T0]]
+  ; GP64-NOT-R2-R6:        dsll      $[[T2:[0-9]+]], $[[T1]], 56
+  ; GP64-NOT-R2-R6:        dsra      $2, $[[T2]], 56
+
+  ; GP64-R2-R6:       andi      $[[T0:[0-9]+]], $5, 255
+  ; GP64-R2-R6:       srav      $[[T1:[0-9]+]], $4, $[[T0]]
+  ; GP64-R2-R6:       seb       $2, $[[T1]]
 
----------------
dsanders wrote:
> indentation. Similarly below.
Something is odd here. The two GP32 cases don't sign-extend but the two GP64 cases do. None of them need to sign-extend because there's no way for ashr to remove the sign extension that was already present in the 'i8 signext' argument. Similarly below

Also, the andi instructions are still redundant because srav only reads the lowest 5 bits. Please restore the fixme.

================
Comment at: test/CodeGen/Mips/llvm-ir/ashr.ll:116-119
@@ +115,6 @@
+  ; M2:           $[[BB0]]:
+  ; M2:           beqz      $[[T1]], $[[BB1:BB[0-9_]+]]
+  ; M2:           nop
+  ; M2:           sra       $2, $4, 31
+  ; M2:           jr        $ra
+
----------------
Why did you remove the definition of the $[[BB1]] label?

================
Comment at: test/CodeGen/Mips/llvm-ir/ashr.ll:117-132
@@ -116,18 @@
-  ; 32R1-R5:    movn      $2, $4, $[[T5]]
-
-  ; 32R6:       srav      $[[T0:[0-9]+]], $4, $7
-  ; 32R6:       andi      $[[T1:[0-9]+]], $7, 32
-  ; 32R6:       seleqz    $[[T2:[0-9]+]], $[[T0]], $[[T1]]
-  ; 32R6:       sra       $[[T3:[0-9]+]], $4, 31
-  ; 32R6:       selnez    $[[T4:[0-9]+]], $[[T3]], $[[T1]]
-  ; 32R6:       or        $[[T5:[0-9]+]], $[[T4]], $[[T2]]
-  ; 32R6:       srlv      $[[T6:[0-9]+]], $5, $7
-  ; 32R6:       not       $[[T7:[0-9]+]], $7
-  ; 32R6:       sll       $[[T8:[0-9]+]], $4, 1
-  ; 32R6:       sllv      $[[T9:[0-9]+]], $[[T8]], $[[T7]]
-  ; 32R6:       or        $[[T10:[0-9]+]], $[[T9]], $[[T6]]
-  ; 32R6:       seleqz    $[[T11:[0-9]+]], $[[T10]], $[[T1]]
-  ; 32R6:       selnez    $[[T12:[0-9]+]], $[[T0]], $[[T1]]
-  ; 32R6:       jr        $ra
-  ; 32R6:       or        $3, $[[T0]], $[[T11]]
-
----------------
Please restore the MIPS32R6 case.

================
Comment at: test/CodeGen/Mips/llvm-ir/ashr.ll:158-160
@@ +157,5 @@
+  ; M3:          $[[BB0]]:
+  ; M3:          beqz       $[[T3]], $[[BB1:BB[0-9_]+]]
+  ; M3:          nop
+  ; M3:          dsra       $2, $4, 63
+
----------------
Why did you remove the definition of the $[[BB1]] label?

================
Comment at: test/CodeGen/Mips/llvm-ir/lshr.ll:80
@@ -78,1 +79,3 @@
+  ; GP64:         srlv      $[[T0:[0-9]+]], $4, $5
+  ; GP64:         sll       $2, $[[T0]], 0
 
----------------
This sign extend is redundant. The srlv already did it

================
Comment at: test/CodeGen/Mips/llvm-ir/mul.ll:145-153
@@ -143,4 +144,11 @@
 
-  ; 64R1-R5:    mul     $2, $4, $5
-  ; 64R6:       mul     $2, $4, $5
+  ; M4:         mult    $4, $5
+  ; M4:         mflo    $[[T0:[0-9]+]]
+  ; M4:         sll     $2, $[[T0]], 0
+
+  ; 64R1-R5:    mul     $[[T0:[0-9]+]], $4, $5
+  ; 64R1-R5:    sll     $2, $[[T0]], 0
+
+  ; 64R6:       mul     $[[T0:[0-9]+]], $4, $5
+  ; 64R6:       sll     $2, $[[T0]], 0
   %r = mul i32 %a, %b
----------------
These three sign extends are redundant. For the M4 case, the mflo already did it and for the other two it was the mul

================
Comment at: test/CodeGen/Mips/llvm-ir/ret.ll:121
@@ -95,1 +120,3 @@
+; GP64:          lui $[[T0:[0-9]+]], 1
+; GP32-DAG:      daddiu $2, $[[T0]], 1
 
----------------
I doubt you meant 'GP32' here.

Also why are the GP32 and GP64 cases different now?

================
Comment at: test/CodeGen/Mips/llvm-ir/sdiv.ll:6-10
@@ -5,7 +5,7 @@
 ; RUN: llc < %s -march=mips -mcpu=mips32r2 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP32
+; RUN:    -check-prefix=GP32-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips -mcpu=mips32r3 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP32
+; RUN:    -check-prefix=GP32-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips -mcpu=mips32r5 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP32
+; RUN:    -check-prefix=GP32-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips -mcpu=mips32r6 | FileCheck %s \
----------------
The prefixes contradict each other

================
Comment at: test/CodeGen/Mips/llvm-ir/sdiv.ll:21-26
@@ -19,7 +20,8 @@
 ; RUN: llc < %s -march=mips64 -mcpu=mips64r2 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP64-NOT-R6
+; RUN:    -check-prefix=GP64-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips64 -mcpu=mips64r3 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP64-NOT-R6
+; RUN:    -check-prefix=GP64-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips64 -mcpu=mips64r5 | FileCheck %s \
-; RUN:    -check-prefix=NOT-R6 -check-prefix=R2-R5 -check-prefix=GP64-NOT-R6
+; RUN:    -check-prefix=GP64-NOT-R6 -check-prefix=R2-R6
 ; RUN: llc < %s -march=mips64 -mcpu=mips64r6 | FileCheck %s \
+; RUN:    -check-prefix=GP64-R6 -check-prefix=R6
----------------
The prefixes contradict each other

================
Comment at: test/CodeGen/Mips/llvm-ir/sdiv.ll:74-78
@@ -59,7 +73,7 @@
 
-  ; R2-R5:        div     $zero, $4, $5
-  ; R2-R5:        teq     $5, $zero, 7
-  ; R2-R5:        mflo    $[[T0:[0-9]+]]
+  ; R2-R6:            div     $zero, $4, $5
+  ; R2-R6:            teq     $5, $zero, 7
+  ; R2-R6:            mflo    $[[T0:[0-9]+]]
   ; FIXME: This instruction is redundant.
-  ; R2-R5:        seb     $2, $[[T0]]
+  ; R2-R6:            seb     $2, $[[T0]]
 
----------------
This is not correct for R6. This kind of divide instruction was removed from the ISA.

================
Comment at: test/CodeGen/Mips/llvm-ir/sdiv.ll:100-104
@@ -85,7 +99,7 @@
 
-  ; R2-R5:        div     $zero, $4, $5
-  ; R2-R5:        teq     $5, $zero, 7
-  ; R2-R5:        mflo    $[[T0:[0-9]+]]
-  ; FIXME: This is instruction is redundant since div is signed.
-  ; R2-R5:        seh     $2, $[[T0]]
+  ; R2-R6:            div     $zero, $4, $5
+  ; R2-R6:            teq     $5, $zero, 7
+  ; R2-R6:            mflo    $[[T0:[0-9]+]]
+  ; FIXME: This instruction is redundant.
+  ; R2-R6:            seh     $2, $[[T0]]
 
----------------
This is not correct for R6. This kind of divide instruction was removed from the ISA.

================
Comment at: test/CodeGen/Mips/llvm-ir/sdiv.ll:123-133
@@ -108,4 +122,13 @@
+
+  ; GP64-NOT-R6:      div     $zero, $4, $5
+  ; GP64-NOT-R6:      teq     $5, $zero, 7
+  ; GP64-NOT-R6:      mflo    $[[T0:[0-9]+]]
+  ; GP64-NOT-R6:      sll     $2, $[[T0]], 0
+
+  ; GP32-R6:          div     $2, $4, $5
+  ; GP32-R6:          teq     $5, $zero, 7
 
-  ; R6:           div     $2, $4, $5
-  ; R6:           teq     $5, $zero, 7
+  ; GP64-R6:          div     $[[T0:[0-9]+]], $4, $5
+  ; GP64-R6:          teq     $5, $zero, 7
+  ; GP64-R6:          sll     $2, $[[T0]], 0
 
----------------
The two sll's are redundant. The mflo's already did it

================
Comment at: test/CodeGen/Mips/llvm-ir/select-flt.ll:1
@@ +1,2 @@
+; RUN: llc < %s -march=mips -mcpu=mips2 | FileCheck %s \
+; RUN:    -check-prefix=ALL -check-prefix=M2 -check-prefix=M2-M3
----------------
Regarding the splitting of select.ll: This is a good thing but it makes it difficult to see what's changed. Could we split it and change it in separate commits? Feel free to commit the split part of this without further review.

================
Comment at: test/CodeGen/Mips/llvm-ir/select-flt.ll:37
@@ +36,3 @@
+
+  ; M2-M3:        andi      $[[T0:[0-9]+]], $4, 1
+  ; M2-M3:        bnez      $[[T0]], $[[BB0:BB[0-9_]+]]
----------------
This andi and the similar ones below are redundant because %s can only be 0 or -1 and the branch test is for zero/not-zero

================
Comment at: test/CodeGen/Mips/llvm-ir/srem.ll:73-77
@@ -59,6 +72,7 @@
 
-  ; R2-R5:        div     $zero, $4, $5
-  ; R2-R5:        teq     $5, $zero, 7
-  ; R2-R5:        mfhi    $[[T0:[0-9]+]]
-  ; R2-R5:        seb     $2, $[[T0]]
+  ; R2-R6:            div     $zero, $4, $5
+  ; R2-R6:            teq     $5, $zero, 7
+  ; R2-R6:            mfhi    $[[T0:[0-9]+]]
+  ; FIXME: This instruction is redundant.
+  ; R2-R6:            seb     $2, $[[T0]]
 
----------------
Like in the div.ll test, this isn't correct for R6. Similarly below.


http://reviews.llvm.org/D16220