[PATCH] Combine fmul vector FP constants when unsafe math is allowed

Andrea Di Biagio Andrea_DiBiagio at sn.scee.net
Tue Sep 9 10:12:57 PDT 2014


Hi Sanjay,

Thanks for the patch.
I have a couple of comments (see below):

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6812-6814
@@ -6809,1 +6811,5 @@
+    // Canonicalize possible vector constant to RHS.
+    if (N0.getOpcode() == ISD::BUILD_VECTOR &&
+        N1.getOpcode() != ISD::BUILD_VECTOR)
+      return DAG.getNode(N->getOpcode(), SDLoc(N), VT, N1, N0);
   }
----------------
This is ok. However, it might be better to check that N0 is a build vector of all constants before commuting the operands of the FMUL. This is to avoid that we trigger  the canonicalization for nodes where N0 is a build vector, but not all elements in N0 are constants or undef.

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6835-6846
@@ -6828,8 +6834,14 @@
     // fold (fmul (fmul x, c1), c2) -> (fmul x, (fmul c1, c2))
-    if (N1CFP && N0.getOpcode() == ISD::FMUL &&
-        N0.getNode()->hasOneUse() && isConstOrConstSplatFP(N0.getOperand(1))) {
-      SDLoc SL(N);
-      SDValue MulConsts = DAG.getNode(ISD::FMUL, SL, VT, N0.getOperand(1), N1);
-      return DAG.getNode(ISD::FMUL, SL, VT, N0.getOperand(0), MulConsts);
+    if (N0.getOpcode() == ISD::FMUL && N0.getNode()->hasOneUse()) {
+      // Fold scalars or any vector constants (not just splats).
+      SDValue N01 = N0.getOperand(1);
+      auto *BV1 = dyn_cast<BuildVectorSDNode>(N1);
+      auto *BV01 = dyn_cast<BuildVectorSDNode>(N01);
+      if ((N1CFP && isConstOrConstSplatFP(N01)) ||
+          (BV1 && BV01 && BV1->isConstant() && BV01->isConstant())) {
+        SDLoc SL(N);
+        SDValue MulConsts = DAG.getNode(ISD::FMUL, SL, VT, N01, N1);
+        return DAG.getNode(ISD::FMUL, SL, VT, N0.getOperand(0), MulConsts);
+      }
     }
 
----------------
I think we probably don't need the 'hasOneUse()' constraint here.

in the worst case scenario, we still have two fmul instructions; that is because the new (fmul c1, c2) would be always constant folded.
Also, if you remove the 'hasOneUse()' constraint, every time the dag combiner triggers your new rule, the number of uses for the 'fmul x, c1' is decreased by one.
This would have a positive effect on the following code example:

```
define <4 x float> @foo(<4 x float> %A, <4 x float> %B) {
  %1 = fmul <4 x float> %A, <float 3.0, float 4.0, float 5.0, float 6.0>
  %2 = fmul <4 x float> %1, <float 1.0, float 2.0, float 3.0, float 4.0>
  %3 = fmul <4 x float> %1, <float 2.0, float 3.0, float 4.0, float 5.0>
  %4 = fadd <4 x float> %2, %3
  ret <4 x float> %4
}
```

Without the 'hasOneUse()' check, we only get two (v)mulps. With the 'hasOneUse()' we instead get three mulps.

http://reviews.llvm.org/D5254






More information about the llvm-commits mailing list