[PATCH] D154314: [LV] Prefer the tail fold according to the user hint

Mon Jul 3 01:34:20 PDT 2023

david-arm added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5180
+    if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo() &&
+        PreferPredicateOverEpilogue == PreferPredicateTy::ScalarEpilogue) {
       MaxPowerOf2RuntimeVF = std::max<unsigned>(
----------------
This looks like a regression to me. If we know the mask is going to be all-true for every vector iteration, then we should use an unpredicated loop instead, even if the target supports tail-folding. That's because creating and maintaining the loop predicate is always going to be a bit more expensive than maintaining a simple integer induction variable.
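
To illustrate the point, here is a rough scalar model of the two loop shapes. It is only a sketch and not what the vectorizer emits: `VF`, `countPartialMasks` and `sumUnpredicated` are made-up names, and a fixed width of 4 stands in for the runtime VF.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vector width, chosen only for illustration.
constexpr std::size_t VF = 4;

// Tail-folded shape: every vector iteration derives a per-lane predicate.
// Returns the number of vector iterations whose mask had an inactive lane.
std::size_t countPartialMasks(std::size_t N) {
  std::size_t Partial = 0;
  for (std::size_t I = 0; I < N; I += VF) {
    bool AllActive = true;
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      AllActive &= (I + Lane) < N; // lane predicate, recomputed each iteration
    if (!AllActive)
      ++Partial;
  }
  return Partial;
}

// Unpredicated shape: only an integer induction variable is maintained.
// As written, this covers every element only when A.size() is a multiple
// of VF; any remainder would need a scalar epilogue.
int sumUnpredicated(const std::vector<int> &A) {
  int Sum = 0;
  for (std::size_t I = 0; I + VF <= A.size(); I += VF)
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      Sum += A[I + Lane];
  return Sum;
}
```

When the trip count is a multiple of the vector width, `countPartialMasks` reports no partial masks, so the predicate bookkeeping in the first shape buys nothing over the plain induction variable in the second.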

I believe gcc's code in https://github.com/llvm/llvm-project/issues/63616 is worse than clang's, although perhaps it's worth testing this on AArch64 hardware to confirm?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154314/new/

https://reviews.llvm.org/D154314