[llvm] [VPlan] Don't apply predication discount to non-originally-predicated blocks (PR #160449)

Wed Sep 24 02:14:02 PDT 2025

lukel97 wrote:

I tried out accounting for tail folding using the estimated trip count:

```c++
  inline unsigned
  getPredBlockCostDivisor(TargetTransformInfo::TargetCostKind CostKind,
                          BasicBlock *BB, ElementCount VF) const {
    if (!Legal->blockNeedsPredication(BB)) {
      if (foldTailByMasking()) {
        if (auto TC = getSmallBestKnownTC(PSE, TheLoop)) {
          unsigned ETC = estimateElementCount(*TC, getVScaleForTuning());
          unsigned EVF = estimateElementCount(VF, getVScaleForTuning());
          return std::max(1U, EVF / ETC);
        }
      }
      return 1;
    }
    return CostKind == TTI::TCK_CodeSize ? 1 : 2;
  }
```

But there's no difference on any of the in-tree tests. My guess is that the discount only kicks in at a high enough VF, at which point the cost of scalarizing every lane begins to outweigh any discount from the blocks potentially not being executed.
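
As a rough illustration of why the divisor stays at 1 for modest VFs, here is a standalone sketch of just the arithmetic. `tailFoldDivisor` is a hypothetical helper, not part of LoopVectorize, the inputs are made-up numbers standing in for the estimated VF and trip count, and the zero-trip-count guard is an assumption that the snippet above does not have:

```cpp
#include <algorithm>

// Hypothetical stand-in for the `std::max(1U, EVF / ETC)` expression in the
// snippet above. EstimatedVF and EstimatedTC play the role of EVF and ETC.
constexpr unsigned tailFoldDivisor(unsigned EstimatedVF, unsigned EstimatedTC) {
  // Assumption: treat an unknown (zero) trip count as "no discount" so the
  // integer division below is always well defined.
  if (EstimatedTC == 0)
    return 1;
  // Integer division rounds down, so the result only exceeds 1 once the
  // estimated VF is at least twice the estimated trip count.
  return std::max(1U, EstimatedVF / EstimatedTC);
}
```

With an estimated trip count of 3, for example, `tailFoldDivisor(4, 3)` is 1, `tailFoldDivisor(8, 3)` is 2 and `tailFoldDivisor(16, 3)` is 5, so a discount only appears when the VF is large relative to the trip count.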

https://github.com/llvm/llvm-project/pull/160449