[llvm] [LoopVectorizer][AArch64] Move getMinTripCountTailFoldingThreshold later. (PR #132170)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 20 07:43:47 PDT 2025
================
@@ -4105,13 +4102,40 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
const SCEV *Rem = SE->getURemExpr(
SE->applyLoopGuards(ExitCount, TheLoop),
SE->getConstant(BackedgeTakenCount->getType(), MaxVFtimesIC));
- if (Rem->isZero()) {
+ return Rem->isZero();
+ };
+
+ if (MaxPowerOf2RuntimeVF && *MaxPowerOf2RuntimeVF > 0) {
+ assert((UserVF.isNonZero() || isPowerOf2_32(*MaxPowerOf2RuntimeVF)) &&
+ "MaxFixedVF must be a power of 2");
+ if (IsKnownModTripCountZero(*MaxPowerOf2RuntimeVF)) {
// Accept MaxFixedVF if we do not have a tail.
LLVM_DEBUG(dbgs() << "LV: No tail will remain for any chosen VF.\n");
return MaxFactors;
}
}
+ if (MaxTC && MaxTC <= TTI.getMinTripCountTailFoldingThreshold()) {
+ if (MaxPowerOf2RuntimeVF && *MaxPowerOf2RuntimeVF > 0) {
----------------
david-arm wrote:
I learned recently that with std::optional you can compare against a plain value directly, since the relational operators are overloaded (an empty optional compares less than any value, so `> 0` is false when there is no value), i.e.
```
if (MaxPowerOf2RuntimeVF > 0) {
```
Perhaps this would be simpler? Although I realise you didn't write the original code!
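
For reference, a minimal standalone sketch of why the two forms should agree. It assumes the variable is a `std::optional<unsigned>`. The helper names `isPositiveExplicit` and `isPositiveDirect` are hypothetical and only for illustration; they are not part of the patch.

```cpp
#include <cassert>
#include <optional>

// Explicit form, as in the patch: check that a value is present,
// then compare the contained value.
bool isPositiveExplicit(const std::optional<unsigned> &VF) {
  return VF && *VF > 0u;
}

// Direct form: uses std::optional's comparison against a plain value.
// For operator>, an empty optional yields false.
bool isPositiveDirect(const std::optional<unsigned> &VF) {
  return VF > 0u;
}

// Checks that both forms agree for empty, zero, and non-zero inputs.
void checkFormsAgree() {
  const std::optional<unsigned> Inputs[] = {std::nullopt, 0u, 1u, 8u};
  for (const auto &VF : Inputs)
    assert(isPositiveExplicit(VF) == isPositiveDirect(VF));
}
```

One caveat: this shortcut is only safe for `>` and `>=`-style checks against a lower bound. With `<` or `<=`, an empty optional compares as true, so the explicit has-value check would still be needed there.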
https://github.com/llvm/llvm-project/pull/132170