[PATCH] D130618: [AArch64][LoopVectorize] Enable tail-folding of simple loops on neoverse-v1

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 3 08:03:24 PDT 2023

dmgreen added a comment.

Thanks. This still makes me a bit nervous, considering what we know about predication and the performance results I've seen.

Can you explain more about why the limit is 10 instructions? As far as I can see, the limit on interleaving in the vectorizer is a cost of 20, and with many instructions like GEPs, PHIs and branches being free, that corresponds to quite a bit more than 10 instructions. We could set the limit lower than the interleaving default if that makes more sense, but 10 seems quite low.
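
To illustrate the gap between an instruction count and a cost, here is a rough sketch. The `Kind` enum, the unit costs and the example loop body are all made up for illustration; none of them come from the patch or from the vectorizer's real cost model, which asks the target for per-instruction costs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds, for illustration only (not LLVM's opcodes).
enum class Kind { GEP, PHI, Branch, Load, Store, Arith, Compare };

// Assumption: GEPs, PHIs and branches are free and everything else costs 1.
// Real costs come from the target's cost model and vary per instruction.
constexpr unsigned costOf(Kind K) {
  return (K == Kind::GEP || K == Kind::PHI || K == Kind::Branch) ? 0 : 1;
}

struct LoopSummary {
  std::size_t NumInstructions; // raw instruction count
  unsigned Cost;               // sum of the assumed per-instruction costs
};

// Walk the loop body once, counting instructions and summing their costs.
LoopSummary summarize(const std::vector<Kind> &Body) {
  LoopSummary S{Body.size(), 0};
  for (Kind K : Body)
    S.Cost += costOf(K);
  return S;
}

// Hypothetical body for something like: a[i] = b[i] + c[i]
std::vector<Kind> exampleLoop() {
  return {Kind::PHI,                // induction variable
          Kind::GEP,  Kind::Load,   // b[i]
          Kind::GEP,  Kind::Load,   // c[i]
          Kind::Arith,              // add
          Kind::GEP,  Kind::Store,  // a[i]
          Kind::Arith,              // i + 1
          Kind::Compare, Kind::Branch};
}
```

Under these assumed costs the body has 11 instructions but a cost of only 6, so a limit of 10 instructions would reject a loop that is well below a cost threshold of 20.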


