[PATCH] D36115: [Loop Vectorize] Block Probability for Predicated Blocks

Wed Aug 16 09:20:42 PDT 2017

mssimpso added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2006-2007
+      return 2;
+    return (BFI->getBlockFreq(TheLoop->getHeader())).getFrequency() /
+           (BFI->getBlockFreq(BB)).getFrequency();
+  }
----------------
I too am curious whether you saw any performance improvements with this change. Can you share any data?

If the goal is to be more precise, it probably makes sense for this function to return a floating-point block probability rather than an unsigned reciprocal. I think nearly all the users of this function use the result as the denominator of a division. So the integer division added here truncates once, the divisions that follow truncate again, and we lose precision at each step. I'm not sure this is any better than always returning 2.
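
To make the precision concern concrete, here is a minimal standalone sketch. It is not LLVM code: the function names are hypothetical, and the plain integers `headerFreq` and `blockFreq` stand in for the values that `BFI->getBlockFreq(...).getFrequency()` would return. The zero-frequency fallbacks and the clamp to 1 are assumptions added so the sketch cannot divide by zero.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the patch's helper: the reciprocal of the block
// probability, computed with integer division (so it truncates).
uint64_t reciprocalPredBlockProb(uint64_t headerFreq, uint64_t blockFreq) {
  if (blockFreq == 0)
    return 2; // assumed fallback, mirroring the existing constant
  // Clamp to 1 (an assumption of this sketch) so callers never divide by 0.
  return std::max<uint64_t>(1, headerFreq / blockFreq);
}

// Hypothetical alternative: return the probability itself as a double.
double predBlockProb(uint64_t headerFreq, uint64_t blockFreq) {
  if (headerFreq == 0)
    return 0.5; // assumed fallback, equivalent to a reciprocal of 2
  return static_cast<double>(blockFreq) / static_cast<double>(headerFreq);
}

// How a typical caller would scale a cost with the integer reciprocal:
// a second truncating division on top of the first one.
uint64_t scaledCostInt(uint64_t cost, uint64_t headerFreq, uint64_t blockFreq) {
  return cost / reciprocalPredBlockProb(headerFreq, blockFreq);
}

// The same scaling using the floating-point probability.
double scaledCostFP(uint64_t cost, uint64_t headerFreq, uint64_t blockFreq) {
  return static_cast<double>(cost) * predBlockProb(headerFreq, blockFreq);
}
```

With a header frequency of 100 and a block frequency of 60 (probability 0.6), the integer reciprocal truncates to 1, so a cost of 10 stays at 10, while the floating-point version gives about 6. Any probability above 0.5 collapses to a reciprocal of 1, and anything from just above 1/3 up to 0.5 collapses to 2.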

================
Comment at: test/Transforms/LoopVectorize/AArch64/aarch64-predication.ll:25
 ; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt <2 x i64> [[WIDE_LOAD]], zeroinitializer
+; CHECK-NEXT:    [[TMP6:%.*]] = add nsw <2 x i64> [[WIDE_LOAD]], %broadcast.splat2
 ; CHECK-NEXT:    [[TMP3:%.*]] = extractelement <2 x i1> [[TMP2]], i32 0
----------------
FWIW, the current patch breaks this test. For the test to exercise what it's intended to, the `add` should be scalarized rather than emitted as a `<2 x i64>` vector `add`.

https://reviews.llvm.org/D36115