[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing

Vyacheslav Klochkov via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 1 15:25:02 PDT 2015


v_klochkov added a comment.

I also did additional changes accordingly to reviewers' recommendations.


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3030
@@ +3029,3 @@
+    default:
+      break;
+  }
----------------
qcolombet wrote:
> return false;
Fixed.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3032
@@ +3031,3 @@
+  }
+  return false;
+}
----------------
qcolombet wrote:
> llvm_unreachable(“Opcode not handled by the switch")
Fixed.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3272
@@ +3271,3 @@
+      SrcOpIdx2 < 1 ||
+      (SrcOpIdx2 > RegOpsNum && SrcOpIdx2 != CommuteAnyOperandIndex))
+    return false;
----------------
qcolombet wrote:
> I would rewrite the check to make the CommuteAnyOperandIndex the first discrimination:
> if (SrcOpIdx1 != CommuteAnyOperandIndex && (SrcOpIdx1 < 1 || SrcOpIdx1 > RegOpsNum))
>   return false;
> 
> Same for SrcOpIdx2.
I agree, your version looks a bit more clear. Fixed.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3276
@@ +3275,3 @@
+  if (SrcOpIdx1 == CommuteAnyOperandIndex ||
+      SrcOpIdx2 == CommuteAnyOperandIndex) {
+    unsigned CommutableOpIdx1 = SrcOpIdx1;
----------------
qcolombet wrote:
> Add a comment saying that we look for two registers operands, those are the ones that can be commuted regardless of the FMA opcode. We will adjust the opcode later.
Ok, I added a comment.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3284
@@ +3283,3 @@
+      // Both of operands are not fixed. By default set one of commutable
+      // operands to the last operand of the instruction.
+      CommutableOpIdx2 = RegOpsNum;
----------------
qcolombet wrote:
> the last *register* operand of the instruction.
It is interesting that I added the word "register" here when made changes for your previous comment and only then
noticed this comment asking me to do exactly the same change.
Fixed.


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3311
@@ +3310,3 @@
+  }
+  return getFMA3OpcodeToCommuteOperands(MI, SrcOpIdx1, SrcOpIdx2) != 0;
+}
----------------
qcolombet wrote:
> Add a comment along the line:
> // Check if we can adjust the opcode so that the registers we change preserve the semantic.
Ok, added a comment.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3321
@@ +3320,3 @@
+  // Struct describing FMA opcodes and dependencies between them.
+  static const struct {
+    int Opc1;
----------------
qcolombet wrote:
> Please explain the structure you are using here.
> In particular, what are those dependencies and how do you represent them.
This is just an array after I removed IsScalar property.
I changed the comment to make it more clear.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3425
@@ +3424,3 @@
+  if (OpcodeAlts[i].IsScalar && (SrcOpIdx1 == 1 || SrcOpIdx2 == 1))
+    return 0;
+
----------------
qcolombet wrote:
> Get rid of this check.
> This is a hack to workaround a bug. The bug should be fixed independently of the improvement for the lowering for commutable operands.
I removed this check from this change-set and updated the OpcodeAlts struct.
BTW, I would not call this check a hack. It was rather a pessimistic correctness check.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3428
@@ +3427,3 @@
+  // Find reversed FMA opcode.
+  if ((SrcOpIdx1 == 1 && SrcOpIdx2 == 2) ||
+      (SrcOpIdx1 == 2 && SrcOpIdx2 == 1)) {
----------------
qcolombet wrote:
> Canonicalize SrcOpIdx1 and SrcOpIdx2 to avoid these duplicated checks.
Ok, Fixed. SrcOpIdx1 has the lowest index now to simplify the next checks.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.h:270
@@ +269,3 @@
+  ///
+  /// The 'SrcOpIdx1' and 'SrcOpIdx2' are INPUT and OUTPUT arguments.
+  /// The output indices of the commuted operands are returned in these
----------------
qcolombet wrote:
> \p SrcOpIdx1 and \p SrcOpIdx2
Fixed. I did not know about \p. Thank you for letting me know.

================
Comment at: llvm/test/CodeGen/X86/fma_patterns.ll:180
@@ -180,1 +179,3 @@
+; CHECK-NEXT:    vfnmadd213ss %xmm2, %xmm0, %xmm1
+; CHECK-NEXT:    vmovaps      %xmm1, %xmm0
 ; CHECK-NEXT:    retq
----------------
qcolombet wrote:
> We shouldn’t regress those.
Ok, Fixed.


http://reviews.llvm.org/D13269





More information about the llvm-commits mailing list