[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 30 14:37:11 PDT 2015


qcolombet requested changes to this revision.
qcolombet added a comment.

Hi Slava,

Thanks on working on this.

Two main things:

1. Could you explain the structure you used to describe the dependencies between the FMA opcode?

I do not want to reverse engineer it to review the patch!

2. The bug you fix here for the operand we shouldn’t commute for the intrinsic lowering is, IMO, separated from improving the lowering for the commutable operand.

I.e., please address that in a separate patch. As for direction, you could use the *_Int approach, or better, but more involved, model correctly the intrinsic lowering by adding a subregister for FP32 from the VR128 and use insert_subreg.

Cheers,
-Quentin


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3030
@@ +3029,3 @@
+    default:
+      break;
+  }
----------------
return false;

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3032
@@ +3031,3 @@
+  }
+  return false;
+}
----------------
llvm_unreachable(“Opcode not handled by the switch")

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3272
@@ +3271,3 @@
+      SrcOpIdx2 < 1 ||
+      (SrcOpIdx2 > RegOpsNum && SrcOpIdx2 != CommuteAnyOperandIndex))
+    return false;
----------------
I would rewrite the check to make the CommuteAnyOperandIndex the first discrimination:
if (SrcOpIdx1 != CommuteAnyOperandIndex && (SrcOpIdx1 < 1 || SrcOpIdx1 > RegOpsNum))
  return false;

Same for SrcOpIdx2.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3276
@@ +3275,3 @@
+  if (SrcOpIdx1 == CommuteAnyOperandIndex ||
+      SrcOpIdx2 == CommuteAnyOperandIndex) {
+    unsigned CommutableOpIdx1 = SrcOpIdx1;
----------------
Add a comment saying that we look for two registers operands, those are the ones that can be commuted regardless of the FMA opcode. We will adjust the opcode later.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3284
@@ +3283,3 @@
+      // Both of operands are not fixed. By default set one of commutable
+      // operands to the last operand of the instruction.
+      CommutableOpIdx2 = RegOpsNum;
----------------
the last *register* operand of the instruction.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3311
@@ +3310,3 @@
+  }
+  return getFMA3OpcodeToCommuteOperands(MI, SrcOpIdx1, SrcOpIdx2) != 0;
+}
----------------
Add a comment along the line:
// Check if we can adjust the opcode so that the registers we change preserve the semantic.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3321
@@ +3320,3 @@
+  // Struct describing FMA opcodes and dependencies between them.
+  static const struct {
+    int Opc1;
----------------
Please explain the structure you are using here.
In particular, what are those dependencies and how do you represent them.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3425
@@ +3424,3 @@
+  if (OpcodeAlts[i].IsScalar && (SrcOpIdx1 == 1 || SrcOpIdx2 == 1))
+    return 0;
+
----------------
Get rid of this check.
This is a hack to workaround a bug. The bug should be fixed independently of the improvement for the lowering for commutable operands.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3428
@@ +3427,3 @@
+  // Find reversed FMA opcode.
+  if ((SrcOpIdx1 == 1 && SrcOpIdx2 == 2) ||
+      (SrcOpIdx1 == 2 && SrcOpIdx2 == 1)) {
----------------
Canonicalize SrcOpIdx1 and SrcOpIdx2 to avoid these duplicated checks.

================
Comment at: llvm/lib/Target/X86/X86InstrInfo.h:270
@@ +269,3 @@
+  ///
+  /// The 'SrcOpIdx1' and 'SrcOpIdx2' are INPUT and OUTPUT arguments.
+  /// The output indices of the commuted operands are returned in these
----------------
\p SrcOpIdx1 and \p SrcOpIdx2

================
Comment at: llvm/test/CodeGen/X86/fma_patterns.ll:180
@@ -180,1 +179,3 @@
+; CHECK-NEXT:    vfnmadd213ss %xmm2, %xmm0, %xmm1
+; CHECK-NEXT:    vmovaps      %xmm1, %xmm0
 ; CHECK-NEXT:    retq
----------------
We shouldn’t regress those.


http://reviews.llvm.org/D13269





More information about the llvm-commits mailing list