[llvm-branch-commits] [llvm] Patch 3: [LV] Add extra CM instance for EpilogueTF (PR #202820)
Igor Kirillov via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Jun 10 05:10:39 PDT 2026
================
@@ -5517,22 +5518,49 @@ void LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC,
// for later use by the cost model.
Config.computeMinimalBitwidths();
+ SmallVector<LoopVectorizationCostModel *, 2> EnabledCMs;
+ EnabledCMs.push_back(&CM);
+
+ // Make sure firstly that the epilogue of main vector loop is allowed, then
+ // check if the tail-folded epilogue feature is enabled.
+ if (CM.EpilogueLoweringStatus == CM_EpilogueAllowed &&
+ EpilogueTailFoldingCM) {
+ // To avoid redundant heavy computation, copy computed `ValuesToIgnore`
+ // and `VecValuesToIgnore` to the EpilogueTailFoldingCM as they will be
+ // same.
+ EpilogueTailFoldingCM->ValuesToIgnore.insert_range(CM.ValuesToIgnore);
+ EpilogueTailFoldingCM->VecValuesToIgnore.insert_range(CM.VecValuesToIgnore);
+
+ // After making sure that we can get valid results of computeMaxVF, make
+ // sure that tail-folding for the epilogue loop still valid.
+ if (EpilogueTailFoldingCM->computeMaxVF(UserVF, UserIC) &&
+ EpilogueTailFoldingCM->foldTailByMasking()) {
+ EnabledCMs.push_back(&*EpilogueTailFoldingCM);
+ LLVM_DEBUG(dbgs() << "LV: CM instances: " << EnabledCMs.size() << "\n");
+ }
+ }
+
// Invalidate interleave groups if all blocks of loop will be predicated.
- if (CM.blockNeedsPredicationForAnyReason(OrigLoop->getHeader()) &&
- !useMaskedInterleavedAccesses(TTI)) {
- LLVM_DEBUG(
- dbgs()
- << "LV: Invalidate all interleaved groups due to fold-tail by masking "
- "which requires masked-interleaved support.\n");
- if (CM.InterleaveInfo.invalidateGroups())
- // Invalidating interleave groups also requires invalidating all decisions
- // based on them, which includes widening decisions and uniform and scalar
- // values.
- CM.invalidateCostModelingDecisions();
+ if (!useMaskedInterleavedAccesses(TTI)) {
+ for_each(EnabledCMs, [&](auto *CurrentCM) {
----------------
igogo-x86 wrote:
Could we avoid the repeated for_each blocks here? It looks like we do the same cost model setup for the main CM and for EpilogueTailFoldingCM when it is present. Maybe this can be moved to a small helper, for example:
```
prepareCostModelForVFs(CM, VFCandidates);
if (EpilogueTailFoldingCM)
prepareCostModelForVFs(*EpilogueTailFoldingCM, VFCandidates);
```
This would make the code easier to read and show that both cost models go through the same preparation.
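To make the suggestion concrete, here is a rough, self-contained sketch of what such a helper might look like. It is not the actual LoopVectorize code: `MockCostModel`, its fields, `prepareForVF`, and the extra `MaskedInterleaveSupported` parameter are hypothetical stand-ins I made up so the example compiles on its own. Only the names `prepareCostModelForVFs`, `invalidateGroups`, and `invalidateCostModelingDecisions` come from the patch and the comment above.

```cpp
#include <vector>

// Hypothetical stand-in for LoopVectorizationCostModel. It only tracks
// enough state to show that both cost models take the same path.
struct MockCostModel {
  bool HeaderNeedsPredication = false; // assumed result of the predication check
  bool HasInterleaveGroups = true;
  bool DecisionsValid = true;
  std::vector<unsigned> PreparedVFs;

  // Drops all interleave groups; returns true if anything was dropped.
  bool invalidateGroups() {
    bool Changed = HasInterleaveGroups;
    HasInterleaveGroups = false;
    return Changed;
  }

  void invalidateCostModelingDecisions() { DecisionsValid = false; }

  // Placeholder for whatever per-VF setup the real cost model performs.
  void prepareForVF(unsigned VF) { PreparedVFs.push_back(VF); }
};

// One helper holds the shared preparation, so callers invoke it once per
// cost model instead of repeating a for_each block for every step.
void prepareCostModelForVFs(MockCostModel &CM,
                            const std::vector<unsigned> &VFCandidates,
                            bool MaskedInterleaveSupported) {
  // Interleave groups cannot survive full predication without
  // masked-interleave support; decisions based on them go too.
  if (CM.HeaderNeedsPredication && !MaskedInterleaveSupported &&
      CM.invalidateGroups())
    CM.invalidateCostModelingDecisions();

  for (unsigned VF : VFCandidates)
    CM.prepareForVF(VF);
}
```

With this shape, the call site reduces to one call for `CM` and, when it is present, one call for `*EpilogueTailFoldingCM`, without building a separate `EnabledCMs` list just to iterate over it.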
https://github.com/llvm/llvm-project/pull/202820