[PATCH] transform fadd chains to increase parallelism
Quentin Colombet
qcolombet at apple.com
Tue Apr 28 11:16:01 PDT 2015
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:7662
@@ +7661,3 @@
+ // and 1 dependent operation:
+ // (fadd x, (fadd y, (fadd z, w))) -> (fadd (fadd x, y), (fadd z, w))
+ if (N0.getOpcode() == ISD::FADD && N0.hasOneUse() &&
----------------
I would prefer the comment to match the actual code, i.e., invert the order of the operands:
(fadd (fadd (fadd z, w), y), x) -> (fadd (fadd z, w), (fadd x, y))
You could even use named operands like this:
(fadd N0: (fadd N00: (fadd z, w), N01: y), N1: x) -> (fadd N00: (fadd z, w), (fadd N1: x, N01: y))
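To make the named-operand form concrete, here is a minimal standalone sketch of the rewrite on a toy expression tree. It is not the DAGCombiner code: Node, leaf, fadd, depth, str and reassociate are hypothetical helpers invented for illustration, and the sketch leaves out the hasOneUse() check from the patch. Note also that floating-point addition is not associative in general, so the rewrite can change rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>

// Toy expression node (hypothetical, not an SDValue): a leaf carries a name,
// an fadd node carries two children.
struct Node {
  std::string Name;                // non-empty for leaves only
  std::shared_ptr<Node> LHS, RHS;  // both set for fadd nodes
  bool isAdd() const { return LHS && RHS; }
};
using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(const std::string &Name) {
  return std::make_shared<Node>(Node{Name, nullptr, nullptr});
}

NodePtr fadd(NodePtr L, NodePtr R) {
  return std::make_shared<Node>(Node{"", std::move(L), std::move(R)});
}

// Length of the longest chain of dependent fadds below N.
int depth(const NodePtr &N) {
  return N->isAdd() ? 1 + std::max(depth(N->LHS), depth(N->RHS)) : 0;
}

// Print in the same notation as the code comment.
std::string str(const NodePtr &N) {
  return N->isAdd() ? "(fadd " + str(N->LHS) + ", " + str(N->RHS) + ")"
                    : N->Name;
}

// (fadd N0: (fadd N00: (fadd z, w), N01: y), N1: x)
//   -> (fadd N00: (fadd z, w), (fadd N1: x, N01: y))
// Returns N unchanged when the pattern does not match.
NodePtr reassociate(const NodePtr &N) {
  if (!N->isAdd())
    return N;
  const NodePtr &N0 = N->LHS;
  const NodePtr &N1 = N->RHS;
  if (!N0->isAdd())
    return N;
  const NodePtr &N00 = N0->LHS;
  if (!N00->isAdd())
    return N;
  // N01 is only needed once the pattern has matched.
  const NodePtr &N01 = N0->RHS;
  return fadd(N00, fadd(N1, N01));
}
```

In this toy model, the input (fadd (fadd (fadd z, w), y), x) is a chain of three dependent adds. After the rewrite, (fadd z, w) and (fadd x, y) no longer depend on each other, and only the final add depends on both.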
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:7666
@@ +7665,3 @@
+ SDValue N00 = N0.getOperand(0);
+ SDValue N01 = N0.getOperand(1);
+ if (N00.getOpcode() == ISD::FADD) {
----------------
You can move this assignment into the next if.
================
Comment at: test/CodeGen/X86/fp-fast.ll:124
@@ +123,3 @@
+; CHECK-NEXT: vaddss {{%xmm[0-9], %xmm[0-9]}}, [[XMM1:%xmm[0-9]]]
+; CHECK-NEXT: vaddss {{%xmm[0-9], %xmm[0-9]}}, [[XMM2:%xmm[0-9]]]
+; CHECK-NEXT: vaddss [[XMM2]], [[XMM1]],
----------------
Can’t you be more specific on the input registers?
With a pattern like this, I believe even the old inefficient sequence would match, wouldn’t it?
http://reviews.llvm.org/D9232