[llvm] [LoopVectorize] Allow Early-Exit Loop Vectorization with EVL (PR #130918)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 26 10:48:12 PDT 2025


================
@@ -4038,10 +4038,12 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
   }
 
   // The only loops we can vectorize without a scalar epilogue, are loops with
-  // a bottom-test and a single exiting block. We'd have to handle the fact
-  // that not every instruction executes on the last iteration.  This will
-  // require a lane mask which varies through the vector loop body.  (TODO)
-  if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
+  // a bottom-test and a single exiting block or those with early exits. We'd
+  // have to handle the fact that not every instruction executes on the last
+  // iteration. This will require a lane mask which varies through the vector
+  // loop body. (TODO)
+  if ((TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) &&
+      !Legal->hasUncountableEarlyExit()) {
----------------
david-arm wrote:

It seems tail predication only works with some early-exit loops. Some of the tests in `simple_early_exit_predication.ll` fail to vectorise, with this debug output:

```
LV: checking if tail can be folded by masking.
LV: Cannot fold tail by masking, loop has an outside user for   %retval = phi i64 [ %index, %loop ], [ 66, %loop.inc ]
LV: Can't fold tail by masking: don't vectorize
LV: Vectorization is possible but not beneficial.
```

That explains why it's not considered beneficial for many loops. This message comes from `LoopVectorizationLegality::canFoldTailByMasking`. In all likelihood a simple search loop such as `std::find` will have an outside use of an induction variable, so I'm not sure how much value there is right now in enabling early-exit vectorisation with tail-folding. I'm not against enabling it, but which loops are you specifically interested in here?
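To illustrate the kind of loop I mean, here is a minimal sketch of a `std::find`-style search over a fixed-size buffer. The names `kLen` and `find_index` are hypothetical and not taken from the tests. The value of `i` at the early exit is returned, so the induction variable has a user outside the loop, much like the `%retval` phi in the output above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical buffer length, chosen only for illustration.
constexpr std::size_t kLen = 63;

// Returns the index of the first element equal to `needle`, or kLen if
// there is none. The early exit returns `i`, so the induction variable
// is live outside the loop.
std::size_t find_index(const std::uint8_t (&data)[kLen], std::uint8_t needle) {
  for (std::size_t i = 0; i < kLen; ++i) {
    if (data[i] == needle)
      return i; // early exit: `i` escapes the loop
  }
  return kLen; // normal exit: not found
}
```

Whether a given compiler build actually rejects this exact function depends on the flags and on how the loop is canonicalised, so treat it as a shape to look for rather than a reproducer.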

Also, some of the tests have an exact trip count of 64, where we know there will not be a tail, so we avoid using predication. It would be good to change the tests `same_exit_block_pre_inc_use1`, `same_exit_block_pre_inc_use4`, `loop_contains_safe_call`, `loop_contains_safe_div` and `loop_contains_load_after_early_exit` to have a trip count of 63 instead of 64. It would also be good to have at least one test that doesn't have any outside uses, so we can verify that tail-folding is working correctly for early-exit loops.
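As a rough source-level sketch of what such a test could look like (the actual test would be written in LLVM IR, and the names `kTripCount` and `contains_mismatch` are hypothetical), the loop below has an early exit but only returns constants, so no value defined inside the loop is used outside it. The trip count of 63 is odd, so it is not a multiple of any vector factor greater than 1 and a tail always remains.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical trip count: 63 leaves a tail for every VF > 1.
constexpr std::size_t kTripCount = 63;

// Returns true if the two buffers differ anywhere. Both exits return a
// constant, so neither the induction variable nor any loaded value is
// used after the loop.
bool contains_mismatch(const std::uint8_t (&a)[kTripCount],
                       const std::uint8_t (&b)[kTripCount]) {
  for (std::size_t i = 0; i < kTripCount; ++i) {
    if (a[i] != b[i])
      return true; // early exit: only a constant leaves the loop
  }
  return false; // normal exit after all 63 iterations
}
```

This is only an assumption about the shape of a suitable test; whether it ends up tail-folded still depends on the other legality and cost-model checks.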

https://github.com/llvm/llvm-project/pull/130918

