[llvm] [LV] Ignore some costs when loop gets fully unrolled (PR #106699)

Wed Nov 6 05:40:04 PST 2024

================
@@ -5544,14 +5544,28 @@ InstructionCost LoopVectorizationCostModel::computePredInstDiscount(
 InstructionCost LoopVectorizationCostModel::expectedCost(ElementCount VF) {
   InstructionCost Cost;
 
+  // If with the given VF loop gets fully unrolled, ignore the costs of
+  // comparison and induction instructions, as they'll get simplified away
+  SmallPtrSet<const Value *, 16> ValuesToIgnoreForVF;
+  auto TC = PSE.getSE()->getSmallConstantTripCount(TheLoop);
+  auto *Cmp = TheLoop->getLatchCmpInst();
+  if (Cmp && TC == VF.getKnownMinValue()) {
----------------
david-arm wrote:

At the moment this check also applies to scalable VFs. For a scalable VF, I think whether we can remove the compare, branch and increment depends on which of these cases we are in:

1. We are not tail-folding. Have you checked that we still remove the branch and compare for scalable VFs?
2. We are tail-folding. In this case you can actually check `TC <= VF.getKnownMinValue()`. If you want to enable this optimisation for tail-folding too, it makes sense to try `<=` instead for both fixed-width and scalable VFs and see whether the branch and compare are also removed.

It might be worth pulling the code common to `expectedCost` and `precomputeCosts` out into a static function.
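
The two cases above, folded into one shared predicate, could look roughly like the sketch below. It is self-contained rather than actual LLVM code: `SketchVF` is a hypothetical stand-in for `ElementCount`, `loopLikelyFullyUnrolled` and its `FoldTail` parameter are made-up names, and refusing scalable VFs when not tail-folding is only a conservative assumption until that case has been checked. The one LLVM fact it relies on is that `getSmallConstantTripCount` returns 0 for an unknown trip count.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::ElementCount: the known minimum number of
// lanes plus a flag saying whether the real count is a vscale multiple of it.
struct SketchVF {
  unsigned KnownMin;
  bool Scalable;
};

// Hypothetical shared helper: returns true when the compare, branch and
// induction increment can plausibly be treated as free for this VF.
// TripCount == 0 means "unknown".
static bool loopLikelyFullyUnrolled(unsigned TripCount, SketchVF VF,
                                    bool FoldTail) {
  assert(VF.KnownMin > 0 && "a VF always has at least one lane");
  if (TripCount == 0)
    return false; // unknown or non-constant trip count

  // Tail-folding: a single predicated vector iteration covers the whole loop
  // as soon as the trip count fits in the known minimum number of lanes.
  if (FoldTail)
    return TripCount <= VF.KnownMin;

  // No tail-folding: require an exact match. Scalable VFs are rejected here
  // as a conservative ASSUMPTION, pending a check that the branch and
  // compare really are removed for them.
  return !VF.Scalable && TripCount == VF.KnownMin;
}
```

Both `expectedCost` and `precomputeCosts` could then ask the same question before filling their set of values to ignore, so the `==` versus `<=` decision lives in one place.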


https://github.com/llvm/llvm-project/pull/106699