[llvm] [LV] Use vscale for tuning to improve branch weight estimates (PR #144733)

Tue Jul 1 04:06:41 PDT 2025

================
@@ -7326,9 +7326,11 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
                            OrigLoop->getHeader()->getContext());
   VPlanTransforms::runPass(VPlanTransforms::replicateByVF, BestVPlan, BestVF);
   VPlanTransforms::runPass(VPlanTransforms::materializeBroadcasts, BestVPlan);
-  if (hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator()))
+  if (hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
+    std::optional<unsigned> VScale = CM.getVScaleForTuning();
     VPlanTransforms::runPass(VPlanTransforms::addBranchWeightToMiddleTerminator,
-                             BestVPlan, BestVF);
+                             BestVPlan, BestVF, VScale);
----------------
paulwalker-arm wrote:

Is this the best interface? From what I can tell, we basically want to set the branch weights based on an assumed number of scalar loop iterations that are performed by a single vector loop iteration. However, we do this by passing in three bits of state: (1) a plan, to get the interleave factor, (2) the vectorisation factor (VF), and now (3) the value of vscale that should be used when tuning based on a known vector length.

Can the plan contain all of this information? Alternatively, or in addition, perhaps the interface to this (and similar) functions should be `addBranchWeightToMiddleTerminator(unsigned EstimatedScalarIterationsPerVectorLoop)`. With the latter, it might be worth having a function along the lines of `unsigned getEstimatedScalarIterationsPerVectorLoop()`.
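
To make the suggestion concrete, here is a rough standalone sketch of what such a helper might compute. It is not the patch's code: `VectorLoopShape` and its fields are hypothetical stand-ins for the VF and interleave count, and falling back to a vscale of 1 when no tuning value is known is an assumption made only for illustration.

```cpp
#include <optional>

// Hypothetical stand-in for the state currently passed separately
// (VF, interleave count). Not an LLVM type.
struct VectorLoopShape {
  unsigned KnownMinVF; // minimum number of lanes per vector
  bool IsScalable;     // true if the real lane count is KnownMinVF * vscale
  unsigned UF;         // interleave (unroll) factor
};

// Estimate how many scalar iterations one vector loop iteration covers.
// Assumption: with no tuning value available, vscale is treated as 1.
unsigned getEstimatedScalarIterationsPerVectorLoop(
    const VectorLoopShape &Shape, std::optional<unsigned> VScaleForTuning) {
  unsigned Lanes = Shape.KnownMinVF;
  if (Shape.IsScalable)
    Lanes *= VScaleForTuning.value_or(1);
  return Lanes * Shape.UF;
}
```

With something like this, the branch weight transform would only need the single estimate rather than the plan, the VF and vscale individually.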

https://github.com/llvm/llvm-project/pull/144733