[PATCH] D22520: [mips] zeroext and logical 'and' mask optimizations

Thu Jul 28 07:52:43 PDT 2016

dsanders accepted this revision.
dsanders added a comment.
This revision is now accepted and ready to land.

LGTM with some nits fixed


================
Comment at: lib/Target/Mips/Mips64InstrInfo.td:589
@@ +588,3 @@
+              (DINSU ZERO_64, (immLO64MaskSize $imm), (immLO64MaskPos $imm),
+               GPR64:$src)>, ISA_MIPS64R2;
+
----------------
indentation (align to the 'ZERO_64')

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:1081
@@ +1080,3 @@
+// but are also used with dext/dins.
+// A mask of all ones from bit 63..0, for dins/ins matching
+def immHIMask32 : PatLeaf<(imm), [{
----------------
I read this only being true for 0xffffffffffffffff but I know that's not what you're trying to say. It might be best to describe it in terms of leading ones and trailing zeros:
  A mask where all bits are either leading ones or trailing zeros where there are at most 32 leading ones.

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:1084-1101
@@ +1083,20 @@
+
+  bool Isi32 = N->getValueType(0) == MVT::i32;
+  if (!Isi32 && countPopulation((uint64_t)N->getZExtValue()) < 32)
+    return false;
+
+  unsigned TrailingZeros = 0;
+  unsigned LeadingOnes = 0;
+  uint64_t Value = N->getZExtValue();
+
+  TrailingZeros = countTrailingZeros(Value);
+  // As count(Leading/Trailing)(Ones/Zeros) takes a value of uint64_t,
+  // the constant has to be adjusted to count the leading ones in the
+  // mask properly if it is an i32.
+  if (Isi32)
+    Value <<= 32;
+
+  LeadingOnes = countLeadingOnes(Value);
+
+  return (LeadingOnes + TrailingZeros) == (Isi32 ? 32 : 64);;
+}]>;
----------------
I think this could be simplified to:
  bool Isi32 = N->getValueType(0) == MVT::i32;

  uint64_t Value = N->getZExtValue();
  if (Isi32)
    Value <<= 32;

  unsigned TrailingZeros = countTrailingZeros(Value);
  unsigned LeadingOnes = countLeadingOnes(Value);

  return (LeadingOnes + TrailingZeros) == 64;
or alternatively:
  if (N->getValueType(0) == MVT::i32) {
    uint32_t Value = N->getZExtValue();
    return countLeadingOnes(Value) + countTrailingZeros(Value) == 32;
  }

  uint64_t Value = N->getZExtValue();
  return countLeadingOnes(Value) + countTrailingZeros(Value) == 64 && countLeadingOnes(Value) < 32;

Similarly for the others

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:1101
@@ +1100,3 @@
+
+  return (LeadingOnes + TrailingZeros) == (Isi32 ? 32 : 64);;
+}]>;
----------------
double semi-colon

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:2622-2629
@@ +2621,10 @@
+                              Instruction INSOP> {
+// Mask from bit 0..31, greater than 16 bits (as andi can already do it)
+// Form: XXXX FFFF
+def : MipsPat<(VT (and RT:$src, immLOMask32:$imm)),
+              (EXTOP RT:$src, 0, (immLOMaskSize $imm))>;
+
+// Mask from bit 31..0. Form: XXXX XXXX
+def : MipsPat<(VT (and RT:$src, immHIMask32:$imm)),
+              (INSOP ZEROReg, 0, (immInvMaskSize $imm), RT:$src)>;
+}
----------------
This is for another patch, but we're not limited to leading or trailing ones. For example:
  0x00ffff00
can be done with:
  ext $rd, $rs, 8, 16
and:
  0xff0000ff
can be done with:
  ins $rd, $zero, 8, 16

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:2643
@@ +2642,3 @@
+              (SRLOP (SLLOP RT:$src, (immInvMaskSize $imm)),
+                     (immInvMaskSize $imm))>;
+
----------------
indentation (align to the 'RT')

================
Comment at: lib/Target/Mips/MipsInstrInfo.td:2647
@@ +2646,3 @@
+              (SLLOP (SRLOP RT:$src, (immInvMaskSize $imm)),
+                     (immInvMaskSize $imm))>;
+}
----------------
indentation (align to the 'RT')

================
Comment at: test/CodeGen/Mips/fcopysign-f32-f64.ll:16
@@ -15,4 +15,3 @@
 
-; 64-DAG: lui  $[[T0:[0-9]+]], 32767
-; 64-DAG: ori  $[[MSK0:[0-9]+]], $[[T0]], 65535
-; 64-DAG: and  $[[AND0:[0-9]+]], ${{[0-9]+}}, $[[MSK0]]
+; 64-DAG: sll  $[[T0:[0-9]+]], ${{[0-9a-z]+}}, 1
+; 64-DAG: srl  $[[T1:[0-9]+]], $[[T0]], 1
----------------
Why change the regex from {{[0-9]+}} to {{[0-9a-z]+}}? The register should always be numeric.

================
Comment at: test/CodeGen/Mips/fcopysign-f32-f64.ll:41
@@ -41,7 +40,3 @@
 ; 64:     dsll    $[[DSLL:[0-9]+]], $[[SRL]], 63
-; 64-DAG: daddiu  $[[R1:[0-9]+]], $zero, 1
-; 64-DAG: dsll    $[[R2:[0-9]+]], $[[R1]], 63
-; 64-DAG: daddiu  $[[R3:[0-9]+]], $[[R2]], -1
-; 64-DAG: dmfc1   $[[R0:[0-9]+]], ${{.*}}
-; 64:     and     $[[AND0:[0-9]+]], $[[R0]], $[[R3]]
-; 64:     or      $[[OR:[0-9]+]], $[[AND0]], $[[DSLL]]
+; 64:     dmfc1   $[[R0:[0-9]+]], ${{.*}}
+; 64:     dsll    $[[R1:[0-9]+]], $[[R0]], 1
----------------
I believe the {{.*}} can only be 'f12'.

================
Comment at: test/CodeGen/Mips/fcopysign.ll:10
@@ -9,8 +9,3 @@
 ;
-; 32: lui  $[[MSK1:[0-9]+]], 32768
-; 32: and  $[[AND1:[0-9]+]], ${{[0-9]+}}, $[[MSK1]]
-; 32: lui  $[[T0:[0-9]+]], 32767
-; 32: ori  $[[MSK0:[0-9]+]], $[[T0]], 65535
-; 32: and  $[[AND0:[0-9]+]], ${{[0-9]+}}, $[[MSK0]]
-; 32: or   $[[OR:[0-9]+]], $[[AND0]], $[[AND1]]
-; 32: mtc1 $[[OR]], $f1
+; 32: srl  $[[R0:[0-9]+]], $[[R0]], 31
+; 32: sll  $[[R1:[0-9]+]], $[[R0]], 31
----------------
This should be:
  mfc1 $[[MFC:[0-9]+]], $f13
  srl $[[R0:[0-9]+]], $[[MFC]], 32

================
Comment at: test/CodeGen/Mips/fcopysign.ll:12
@@ +11,3 @@
+; 32: sll  $[[R1:[0-9]+]], $[[R0]], 31
+; 32: mfc1 $[[R2:[0-9]+]], ${{.*}}
+; 32: sll  $[[R3:[0-9]+]], $[[R2]], 1
----------------
I believe the {{.*}} can only be 'f12'. Likewise for the similar tests below.

================
Comment at: test/CodeGen/Mips/llvm-ir/and-opts.ll:13
@@ +12,3 @@
+; ALL-LABEL: f:
+; ALL:      andi ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 65535
+  %and = and i64 %a, 65535
----------------
Could you handle the registers like fcopysign.ll does (i.e. with [[FOO:[0-9]+]] and [[FOO]])? This is important for the multi-instruction cases below.

Also, why is [a-z] permitted in the register numbers?

================
Comment at: test/CodeGen/Mips/llvm-ir/and-opts.ll:23
@@ +22,3 @@
+; MIPS:     srl  ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 12
+; MIPS32R2: ext  ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 0,
+; MIPS64:   dsll ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 44
----------------
The '20' is missing on this line

================
Comment at: test/CodeGen/Mips/llvm-ir/and-opts.ll:36
@@ +35,3 @@
+; MIPS:     srl  ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 8
+; MIPS32R2: ext  ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 0,
+; MIPS64:   dsll ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 40
----------------
The '24' is missing on this line

================
Comment at: test/CodeGen/Mips/llvm-ir/and-opts.ll:82
@@ +81,3 @@
+; MIPS64:       dsrl  ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 28
+; MIPS64R2:     dinsu ${{[0-9a-z]+}}, ${{[0-9a-z]+}}, 36, 28
+  %and = and i64 %a, 68719476735
----------------
I believe the second operand must be $zero. Likewise for the other dinsu's below


Repository:
  rL LLVM

https://reviews.llvm.org/D22520