[llvm] [WIP][LV] Ignore some costs when loop gets fully unrolled (PR #106699)

Fri Aug 30 02:53:34 PDT 2024

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Igor Kirillov (igogo-x86)

<details>
<summary>Changes</summary>

When fixed-width VF equals the number of iterations, comparison instruction and induction operation will be DCEed later. Ignoring the costs of these instructions improves the cost model.

I have one benchmark that significantly improves when vectorised with fixed VF instead of scalable on AArch64 due to full unrolling. But I am not sure how to ignore the costs properly. This patch, as it is, makes new and legacy cost models out of sync and compiler fails with assertion error on this line:

```
assert((BestFactor.Width == LegacyVF.Width ||
 planContainsAdditionalSimplifications(getPlanFor(BestFactor.Width),
 BestFactor.Width, CostCtx,
 OrigLoop, CM)) &&
 " VPlan cost model and legacy cost model disagreed");
```

Identifying variables to ignore in `LoopVectorizationCostModel::getInstructionCost` is much more challenging.

Is the approach shown in the patch ok, and can I wait for the legacy cost model to be removed? If not, how can it be done differently?

---
Full diff: https://github.com/llvm/llvm-project/pull/106699.diff


1 Files Affected:

- (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+20) 


``````````diff

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6babfd1eee9108..7dc9a60cb0f034 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7112,6 +7112,26 @@ LoopVectorizationPlanner::precomputeCosts(VPlan &Plan, ElementCount VF,
         continue;
       IVInsts.push_back(CI);
     }
+
+    // If the given VF loop gets fully unrolled, ignore the costs of comparison
+    // and increment instruction, as they'll get simplified away
+    auto TC = CM.PSE.getSE()->getSmallConstantTripCount(OrigLoop);
+    auto *Cmp = OrigLoop->getLatchCmpInst();
+    if (Cmp && VF.isFixed() && VF.getFixedValue() == TC) {
+      CostCtx.SkipCostComputation.insert(Cmp);
+      for (Instruction *IVInst : IVInsts) {
+        bool IsSimplifiedAway = true;
+        for (auto *UIV : IVInst->users()) {
+          if (!Legal->isInductionVariable(UIV) && UIV != Cmp) {
+            IsSimplifiedAway = false;
+            break;
+          }
+        }
+        if (IsSimplifiedAway)
+          CostCtx.SkipCostComputation.insert(IVInst);
+      }
+    }
+
     for (Instruction *IVInst : IVInsts) {
       if (CostCtx.skipCostComputation(IVInst, VF.isVector()))
         continue;

``````````

</details>


https://github.com/llvm/llvm-project/pull/106699