[PATCH] D90223: [AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands

Wed Oct 28 02:10:53 PDT 2020

foad added inline comments.

================
Comment at: llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h:248-249
+  /// shift-by-constant operand with identical opcode, we may be able to convert
+  /// that into 2 independent shifts followed by the logic op. This is a
+  /// throughput improvement.
+  bool matchShiftOfShiftedLogic(MachineInstr &MI,
----------------
I know you copied this comment from SelectionDAG but I think it really improves latency, not throughput, because the new shifts are independent.

Also you don't mention the other reason for doing this, which is that one of the new shifts might constant-fold away.

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1611
+                                              ShiftOfShiftedLogic &MatchInfo) {
+  // We're trying to match the following pattern with any of G_SHL/G_ASHR/G_LSHR
+  // shift instructions in combination with any of G_AND/G_OR/G_XOR logic
----------------
Also handle G_USHLSAT and G_SSHLSAT?

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1655-1658
+    // Shift amount types do not have to match their operand type, so check that
+    // the constants are the same width.
+    if (MRI.getType(Imm).getScalarSizeInBits() != BitWidth)
+      return false;
----------------
I don't see why this check is necessary. All we care about is the shift amount, as a uint64_t.

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1666-1668
+    // The fold is not valid if the sum of the shift values exceeds bitwidth.
+    if ((ShiftVal + C1Val) > BitWidth)
+      return false;
----------------
First, it would be simpler to check this by line 1690 where you're already computing the sum. Second, I would expect the `>` to be `>=`. Third, you're comparing against BitWidth which is the width of the shift amount, not the value being shifted.

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1702
+
+  MachineIRBuilder MIB(MI);
+  LLT ShlType = MRI.getType(MI.getOperand(2).getReg());
----------------
Don't create this. CombinerHelper already has a MIRBuilder.

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1709
+  Register Shift1 =
+      MIB.buildInstr(Opcode, {ShlType}, {Shift1Base, Const}).getReg(0);
+
----------------
ShlType is wrong here and below. You want the type of the result, not the type of the shift amount.

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1719-1721
+  Register ConstC0 = MatchInfo.Shift2->getOperand(2).getReg();
+  if (MRI.hasOneNonDBGUse(ConstC0))
+    MRI.getUniqueVRegDef(ConstC0)->eraseFromParent();
----------------
It's probably not worth doing this. I think you can rely on dead code elimination.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90223/new/

https://reviews.llvm.org/D90223