[llvm] [AArch64][LoopVectorize] Use either fixed-width or scalable VF when tail-folding (PR #67543)

Matthew Devereau via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 27 06:10:16 PDT 2023


================
@@ -5142,7 +5142,9 @@ ElementCount LoopVectorizationCostModel::getMaximizedVFForTarget(
     LLVM_DEBUG(dbgs() << "LV: Clamping the MaxVF to maximum power of two not "
                          "exceeding the constant trip count: "
                       << ClampedConstTripCount << "\n");
-    return ElementCount::getFixed(ClampedConstTripCount);
+    return ElementCount::get(
+        ClampedConstTripCount,
+        FoldTailByMasking ? MaxVectorElementCount.isScalable() : false);
----------------
MDevereau wrote:

Is this ternary necessary? If `ConstTripCount` is not a power of 2, then it has already been clamped down to one, and `FoldTailByMasking` must be false; otherwise the previous if condition would have kept us from reaching this point.

I think you can just get away with:
```c++
    return ElementCount::get(
        ClampedConstTripCount, MaxVectorElementCount.isScalable());
``` 

There seems to be another test missing for clamping the trip count as well, since ignoring the clamped value and doing

```c++
    return ElementCount::get(
        ConstTripCount, MaxVectorElementCount.isScalable());
``` 
also passes check-llvm. However, I don't think this is because of your changes.
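
To make the coverage gap concrete, here is a rough standalone sketch in plain C++ (not LLVM code). It assumes the clamp simply rounds the constant trip count down to a power of two, as the debug message describes. `floorPowerOfTwo` and `tripCountExercisesClamp` are hypothetical names used only for illustration. The point is that only a non-power-of-2 constant trip count can tell the clamped and unclamped returns apart, so a test would need one of those and would also need to reach this code path.

```c++
#include <cassert>

// Largest power of two not exceeding N. Stand-in for the clamp described by
// the debug message; the real code may use a different helper.
unsigned floorPowerOfTwo(unsigned N) {
  assert(N > 0 && "trip count must be non-zero");
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Hypothetical helper: true if this constant trip count would make returning
// the clamped value differ from returning the unclamped trip count.
bool tripCountExercisesClamp(unsigned ConstTripCount) {
  return floorPowerOfTwo(ConstTripCount) != ConstTripCount;
}
```

For example, a trip count of 6 clamps to 4, so it distinguishes the two returns, while a trip count of 8 stays at 8 and does not.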

https://github.com/llvm/llvm-project/pull/67543

