[PATCH] D26543: Fix for lost FastMathFlags in SLPVectorizer

Fri Nov 11 03:09:43 PST 2016

v_klochkov created this revision.
v_klochkov added reviewers: mzolotukhin, davide, majnemer.
v_klochkov added a subscriber: llvm-commits.

Hello,

Please review the fix for SLPVectorizer that lost FastMathFlags in FCmp operation.

Initially this fix was proposed together with few other similar fixes
(https://reviews.llvm.org/D26436),
but since then the fix has been re-worked and some reviewers recommendations to have 
separate code-reviews for each of the fixed components were taken into account.

Some explanations for this fix:

1. The call of the method propagateIRFlags() is performed not only for FCmp. It is called for ICmp as well. It is done the same way as it is done for binary operations (i.e. the method is called even for int ADD). I think this is right approach and the method propagateIRFlags() should decide what flags and for what operations it propagates.

2. The fix replaces the casts to <BinaryOperator> with casts to <Instruction> inside the method propagateIRFlags(). The whole work done in this method is done with help of Instruction::andIRFlags(). In my opinion, the AND'able flags used in the set of original/scalar operations should all be AND'ed. Please fix me if this is wrong or too risky.

  The alternative was to have duplicated code due to not quite clear reasons, i.e.: if (auto *VecOp = dyn_cast<BinaryOperator>(I)) { <9 lines of code here> } else if (auto *VecOp = dyn_cast<CmpInst>(I)) { <exactly the same/similar 9 lines of code here> }

The fix includes 2 new test cases inside an existing LIT test.
Also, all other LIT tests successfully passed.

Thank you,
Vyacheslav Klochkov


https://reviews.llvm.org/D26543

Files:
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SLPVectorizer/X86/propagate_ir_flags.ll


Index: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
===================================================================

--- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -211,12 +211,12 @@
 /// of each scalar operation (VL) that will be converted into a vector (I).
 /// Flag set: NSW, NUW, exact, and all of fast-math.
 static void propagateIRFlags(Value *I, ArrayRef<Value *> VL) {
-  if (auto *VecOp = dyn_cast<BinaryOperator>(I)) {
-    if (auto *Intersection = dyn_cast<BinaryOperator>(VL[0])) {
+  if (auto *VecOp = dyn_cast<Instruction>(I)) {
+    if (auto *Intersection = dyn_cast<Instruction>(VL[0])) {
       // Intersection is initialized to the 0th scalar,
       // so start counting from index '1'.
       for (int i = 1, e = VL.size(); i < e; ++i) {
-        if (auto *Scalar = dyn_cast<BinaryOperator>(VL[i]))
+        if (auto *Scalar = dyn_cast<Instruction>(VL[i]))
           Intersection->andIRFlags(Scalar);
       }
       VecOp->copyIRFlags(Intersection);
@@ -2430,6 +2430,7 @@
         V = Builder.CreateICmp(P0, L, R);
 
       E->VectorizedValue = V;
+      propagateIRFlags(E->VectorizedValue, E->Scalars);
       ++NumVectorInstructions;
       return V;
     }
Index: llvm/test/Transforms/SLPVectorizer/X86/propagate_ir_flags.ll
===================================================================
--- llvm/test/Transforms/SLPVectorizer/X86/propagate_ir_flags.ll
+++ llvm/test/Transforms/SLPVectorizer/X86/propagate_ir_flags.ll
@@ -348,3 +348,54 @@
   ret void
 }
  
+attributes #1 = { "target-features"="+avx" }
+
+; CHECK-LABEL: @fcmp_fast
+; CHECK: fcmp fast oge <2 x double>
+; CHECK: sub fast <2 x double>
+define void @fcmp_fast(double* %x) #1 {
+  %idx1 = getelementptr inbounds double, double* %x, i64 0
+  %idx2 = getelementptr inbounds double, double* %x, i64 1
+
+  %load1 = load double, double* %idx1, align 8
+  %load2 = load double, double* %idx2, align 8
+
+  %cmp1 = fcmp fast oge double %load1, 0.000000e+00
+  %cmp2 = fcmp fast oge double %load2, 0.000000e+00
+
+  %sub1 = fsub fast double -0.000000e+00, %load1
+  %sub2 = fsub fast double -0.000000e+00, %load2
+
+  %sel1 = select i1 %cmp1, double %load1, double %sub1
+  %sel2 = select i1 %cmp2, double %load2, double %sub2
+
+  store double %sel1, double* %idx1, align 8
+  store double %sel2, double* %idx2, align 8
+
+  ret void
+}
+
+; CHECK-LABEL: @fcmp_no_fast
+; CHECK: fcmp oge <2 x double>
+; CHECK: sub <2 x double>
+define void @fcmp_no_fast(double* %x) #1 {
+  %idx1 = getelementptr inbounds double, double* %x, i64 0
+  %idx2 = getelementptr inbounds double, double* %x, i64 1
+
+  %load1 = load double, double* %idx1, align 8
+  %load2 = load double, double* %idx2, align 8
+
+  %cmp1 = fcmp fast oge double %load1, 0.000000e+00
+  %cmp2 = fcmp oge double %load2, 0.000000e+00
+
+  %sub1 = fsub fast double -0.000000e+00, %load1
+  %sub2 = fsub double -0.000000e+00, %load2
+
+  %sel1 = select i1 %cmp1, double %load1, double %sub1
+  %sel2 = select i1 %cmp2, double %load2, double %sub2
+
+  store double %sel1, double* %idx1, align 8
+  store double %sel2, double* %idx2, align 8
+
+  ret void
+}


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26543.77605.patch
Type: text/x-patch
Size: 3171 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161111/27985c4e/attachment.bin>