[PATCH] D154314: [LV] Prefer the tail fold according the the user hint

Wed Jul 5 06:48:19 PDT 2023

Allen added inline comments.
Herald added a subscriber: wangpc.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5180
+    if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo() &&
+        PreferPredicateOverEpilogue == PreferPredicateTy::ScalarEpilogue) {
       MaxPowerOf2RuntimeVF = std::max<unsigned>(
----------------
david-arm wrote:
> I think this looks like a regression to me. If we know the mask is always going to be true for every vector iteration, then we should be using unpredicated loops instead even if the target supports tail-folding. That's because creating and maintaining the loop predicate is always going to be a bit more expensive than maintaining a simple integer induction variable.
> 
> I believe gcc's code in https://github.com/llvm/llvm-project/issues/63616 is worse than clang, although perhaps it's worth testing this on AArch64 hardware to confirm?
Thanks for your idea.

I tested the performance of both, and there is no significant difference on the AArch64 target (the length of the scalable vector is 256).
On the x86 target, however, clang seems to be better (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99411), so I agree with you that this is not a problem.
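
For readers following the thread, the cost argument quoted above can be illustrated with a rough scalar model. This is only a sketch and not what LoopVectorize emits: the fixed `VF`, the `LoopResult` struct, and both function names are hypothetical, and the "mask" is modelled as a count of active lanes rather than a real predicate register. It shows that the tail-folded form recomputes a lane mask on every vector iteration, while the unpredicated form only steps an integer induction variable and leaves any remainder to a scalar epilogue.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vectorization factor, chosen for illustration only.
constexpr std::size_t VF = 4;

struct LoopResult {
  long Sum;                // reduction result
  std::size_t MaskUpdates; // how many times a lane mask was computed
};

// Models an unpredicated vector body followed by a scalar epilogue.
LoopResult sumWithScalarEpilogue(const std::vector<int> &Data) {
  LoopResult R{0, 0};
  const std::size_t N = Data.size();
  const std::size_t VecEnd = N - N % VF; // last index covered by full vectors
  std::size_t I = 0;
  // "Vector" body: all VF lanes are active, so no mask is needed.
  for (; I < VecEnd; I += VF)
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      R.Sum += Data[I + Lane];
  // Scalar epilogue handles the remaining N % VF elements.
  for (; I < N; ++I)
    R.Sum += Data[I];
  return R;
}

// Models a tail-folded loop: every iteration derives its active lanes.
LoopResult sumTailFolded(const std::vector<int> &Data) {
  LoopResult R{0, 0};
  const std::size_t N = Data.size();
  for (std::size_t I = 0; I < N; I += VF) {
    // Stand-in for the loop predicate: number of lanes still in bounds.
    const std::size_t Active = std::min(VF, N - I);
    ++R.MaskUpdates;
    assert(Active >= 1 && Active <= VF);
    for (std::size_t Lane = 0; Lane < Active; ++Lane)
      R.Sum += Data[I + Lane];
  }
  return R;
}
```

When the trip count is a multiple of `VF`, both functions do the same useful work, but only the tail-folded one pays for a mask update per iteration. That is the situation the review comment describes, where the mask is known to be all-true.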

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154314/new/

https://reviews.llvm.org/D154314