[PATCH] D36115: [Loop Vectorize] Block Probability for Predicated Blocks

Matthew Simpson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 16 09:20:42 PDT 2017


mssimpso added inline comments.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2006-2007
+      return 2;
+    return (BFI->getBlockFreq(TheLoop->getHeader())).getFrequency() /
+           (BFI->getBlockFreq(BB)).getFrequency();
+  }
----------------
I too am curious if you saw any performance improvements with this change. Can you share any data?

If the goal is to be more precise, it probably makes sense for this function to return a float for the block probability rather than an unsigned for its reciprocal. I think nearly all the users of this function use its result as the denominator of a division. So the integer division added here truncates once, and the divisions at the call sites truncate again, which loses precision. I'm not sure this is any better than always returning 2.
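
To make the precision concern concrete, here is a minimal standalone sketch, not code from the patch. All function names are hypothetical, and the raw frequencies stand in for what `getBlockFreq(...).getFrequency()` would return. It assumes the predicated block's frequency is non-zero and no larger than the header's.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the patched function: the reciprocal of the
// block probability, truncated to an integer.
unsigned reciprocalPredBlockProb(uint64_t HeaderFreq, uint64_t BlockFreq) {
  assert(BlockFreq != 0 && BlockFreq <= HeaderFreq && "unexpected frequencies");
  return static_cast<unsigned>(HeaderFreq / BlockFreq);
}

// Hypothetical alternative: the block probability itself, as floating point.
double predBlockProb(uint64_t HeaderFreq, uint64_t BlockFreq) {
  assert(BlockFreq != 0 && BlockFreq <= HeaderFreq && "unexpected frequencies");
  return static_cast<double>(BlockFreq) / static_cast<double>(HeaderFreq);
}

// A caller that scales a cost by dividing by the integer reciprocal.
// This is the second truncating division.
unsigned scaledCostInt(unsigned Cost, uint64_t HeaderFreq, uint64_t BlockFreq) {
  return Cost / reciprocalPredBlockProb(HeaderFreq, BlockFreq);
}

// The same caller, multiplying by the floating-point probability instead.
double scaledCostFP(unsigned Cost, uint64_t HeaderFreq, uint64_t BlockFreq) {
  return Cost * predBlockProb(HeaderFreq, BlockFreq);
}
```

With a header frequency of 100 and a block frequency of 60, the reciprocal truncates to 1, so `scaledCostInt(10, 100, 60)` yields 10, as if the block always executed, while `scaledCostFP(10, 100, 60)` yields about 6. With a block frequency of 40 the reciprocal is 2, which is the same as the existing constant.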


================
Comment at: test/Transforms/LoopVectorize/AArch64/aarch64-predication.ll:25
 ; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt <2 x i64> [[WIDE_LOAD]], zeroinitializer
+; CHECK-NEXT:    [[TMP6:%.*]] = add nsw <2 x i64> [[WIDE_LOAD]], %broadcast.splat2
 ; CHECK-NEXT:    [[TMP3:%.*]] = extractelement <2 x i1> [[TMP2]], i32 0
----------------
FWIW, the current patch breaks this test. For it to be testing what it's intended to test, the `add` should be scalarized.


https://reviews.llvm.org/D36115




