[llvm-branch-commits] [llvm] [LoopUnroll] Fix freqs for unconditional latches: N<=2 (PR #179520)
Johannes Doerfert via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Apr 8 16:15:35 PDT 2026
================
@@ -439,6 +440,224 @@ static bool canHaveUnrollRemainder(const Loop *L) {
return true;
}
+// If LoopUnroll has proven OriginalLoopProb is incorrect for some iterations
+// of the original loop, adjust latch probabilities in the unrolled loop to
+// maintain the original total frequency of the original loop body.
+//
+// OriginalLoopProb is practical but imprecise
+// -------------------------------------------
+//
+// The latch branch weights that LLVM originally adds to a loop encode one latch
+// probability, OriginalLoopProb, applied uniformly across the loop's infinite
+// set of theoretically possible iterations. While this uniform latch
+// probability serves as a practical statistic summarizing the trip counts
+// observed during profiling, it is imprecise. Specifically, unless it is zero,
+// it is impossible for it to be the actual probability observed at every
+// individual iteration. To see why, consider that the only way to actually
+// observe at run time that the latch probability remains non-zero is to profile
+// at least one loop execution that has an infinite number of iterations. I do
+// not know how to profile an infinite number of loop iterations, and most loops
+// I work with are always finite.
+//
+// LoopUnroll proves OriginalLoopProb is incorrect
+// ------------------------------------------------
+//
+// LoopUnroll reorganizes the original loop so that loop iterations are no
+// longer all implemented by the same code, and then it analyzes some of those
+// loop iteration implementations independently of others. In particular, it
+// converts some of their conditional latches to unconditional. That is, by
+// examining code structure without any profile data, LoopUnroll proves that the
+// actual latch probability at the end of such an iteration is either 1 or 0.
+// When an individual iteration's actual latch probability is 1 or 0, that means
+// it always behaves the same, so it is impossible to observe it as having any
+// other probability. The original uniform latch probability is rarely 1 or 0
+// because, when applied to all possible iterations, that would yield an
+// estimated trip count of infinity or 1, respectively.
+//
+// Thus, the new probabilities of 1 or 0 are proven corrections to
+// OriginalLoopProb for individual iterations in the original loop. However,
+// LoopUnroll often is able to perform these corrections for only some
+// iterations, leaving other iterations with OriginalLoopProb, and thus
+// corrupting the aggregate effect on the total frequency of the original loop
+// body.
+//
+// Adjusting latch probabilities
+// -----------------------------
+//
+// This function ensures that the total frequency of the original loop body,
+// summed across all its occurrences in the unrolled loop after the
+// aforementioned latch conversions, is the same as in the original loop. To do
+// so, it adjusts probabilities on the remaining conditional latches. However,
+// it cannot derive the new probabilities directly from the original uniform
+// latch probability because the latter has been proven incorrect for some
+// original loop iterations.
+//
+// There are often many sets of latch probabilities that can produce the
+// original total loop body frequency. For now, this function computes uniform
+// probabilities when the number of remaining conditional latches is <= 2 and
+// does not handle other cases.
+static void fixProbContradiction(UnrollLoopOptions ULO,
+ BranchProbability OriginalLoopProb,
+ bool CompletelyUnroll,
+ std::vector<unsigned> &IterCounts,
+ const std::vector<BasicBlock *> &CondLatches,
+ std::vector<BasicBlock *> &CondLatchNexts) {
+ // Runtime unrolling is handled later in LoopUnroll, not here.
+ //
+ // There are two scenarios in which LoopUnroll sets ProbUpdateRequired to true
+ // because it needs to update probabilities that were originally
+ // OriginalLoopProb, but only in one scenario has LoopUnroll proven
+ // OriginalLoopProb incorrect for iterations within the original loop:
+ // - If ULO.Runtime, LoopUnroll adds new guards that enforce new reaching
+ // conditions for new loop iteration implementations (e.g., one unrolled
+ // loop iteration executes only if at least ULO.Count original loop
+ // iterations remain). Those reaching conditions dictate how conditional
+ // latches can be converted to unconditional (e.g., within an unrolled loop
+ // iteration, there is no need to recheck the number of remaining original
+ // loop iterations). None of this reorganization alters the set of possible
+ // original loop iteration counts or proves OriginalLoopProb incorrect for
+ // any of the original loop iterations. Thus, LoopUnroll derives
+ // probabilities for the new guards and latches directly from
+ // OriginalLoopProb based on the probabilities that their reaching
+ // conditions would occur in the original loop. Doing so maintains the
+ // total frequency of the original loop body.
+ // - If !ULO.Runtime, LoopUnroll initially adds new loop iteration
+ // implementations, which have the same latch probabilities as in the
+ // original loop because there are no new guards that change their reaching
+ // conditions. Sometimes, LoopUnroll is then done, and so does not set
+ // ProbUpdateRequired to true. Other times, LoopUnroll then proves that
+ // some latches are unconditional, directly contradicting OriginalLoopProb
+ // for the corresponding original loop iterations. That reduces the set of
+ // possible original loop iteration counts, possibly producing a finite set
+ // if it manages to eliminate the backedge. LoopUnroll has to choose a new
+ // set of latch probabilities that produce the same total loop body
+ // frequency.
+ //
+ // This function addresses the second scenario only.
+ if (ULO.Runtime)
+ return;
+
+ // If CondLatches.empty(), there are no latch branches with probabilities we
+ // can adjust. That should mean that the actual trip count is always exactly
+ // the number of remaining unrolled iterations, and so OriginalLoopProb should
+ // have yielded that trip count as the original loop body frequency. Of
+ // course, OriginalLoopProb could be based on inaccurate profile data, but
+ // there is nothing we can do about that here.
+ if (CondLatches.empty())
+ return;
+
+ // If the original latch probability is 1, the original frequency is infinity.
+ // Leaving all remaining probabilities set to 1 might or might not get us
+ // there (e.g., a completely unrolled loop cannot be infinite), but it is the
+ // closest we can come.
+ assert(!OriginalLoopProb.isUnknown() &&
+ "Expected to have loop probability to fix");
+ if (OriginalLoopProb.isOne())
+ return;
+
+ // FreqDesired is the frequency implied by the original loop probability.
+ double FreqDesired = 1 / (1 - OriginalLoopProb.toDouble());
+
+ // Set the probability at CondLatches[I] to Prob.
+ auto SetProb = [&](unsigned I, double Prob) {
+ CondBrInst *B = cast<CondBrInst>(CondLatches[I]->getTerminator());
+ bool FirstTargetIsNext = B->getSuccessor(0) == CondLatchNexts[I];
+ setBranchProbability(B, BranchProbability::getBranchProbability(Prob),
+ FirstTargetIsNext);
+ };
+
+ // Set all probabilities in CondLatches to Prob.
+ auto SetAllProbs = [&](double Prob) {
+ for (unsigned I = 0, E = CondLatches.size(); I < E; ++I)
+ SetProb(I, Prob);
+ };
+
+ // If n <= 2, we choose the simplest probability model we can think of: every
+ // remaining conditional branch instruction has the same probability, Prob,
+ // of continuing to the next iteration. This model has several helpful
+ // properties:
+ // - We have no reason to think one latch branch's probability should be
+ // higher or lower than another, and so this model makes them all the same.
----------------
jdoerfert wrote:
I am not sure about this. We have their current probabilities, no?
So if latch 1 has prob p and latch 2 has prob q, couldn't we preserve the ratio p/q for the new probabilities we pick?
For the overall trip count, it doesn't matter, but for probabilities in the loop, it does, right?
This also only affects the N >= 2 case.
The equation system we get for N == 2 would then be
```
A * p' * q' + B * q' + C = 0
p' / q' = p / q
```
or
```
A * p' * p' * q / p + B * p' * q / p + C = 0
```
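A minimal sketch of how the substituted equation above could be solved for `p'`, assuming the coefficients `A`, `B`, and `C` are already known (this comment does not define them, so the sketch treats them as plain inputs). `LatchProbs` and `solvePreservingRatio` are hypothetical names for illustration and are not part of the PR or of LLVM. The sketch needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical result type: the new probabilities for the two latches.
struct LatchProbs {
  double P; // p', new probability for latch 1
  double Q; // q', new probability for latch 2
};

// Hypothetical helper, not part of the PR. Solves
//   A * p' * q' + B * q' + C = 0   subject to   p' / q' = p / q
// by substituting q' = p' * q / p, which gives a quadratic in p':
//   (A * q / p) * p'^2 + (B * q / p) * p' + C = 0
// Returns std::nullopt if no root yields probabilities within [0, 1].
std::optional<LatchProbs> solvePreservingRatio(double A, double B, double C,
                                               double p, double q) {
  assert(p > 0.0 && "the substitution divides by p");
  const double R = q / p;  // q' = R * p'
  const double A2 = A * R; // coefficient of p'^2
  const double B1 = B * R; // coefficient of p'

  // Both p' and q' = R * p' must be valid probabilities.
  auto InRange = [R](double X) {
    return X >= 0.0 && X <= 1.0 && R * X >= 0.0 && R * X <= 1.0;
  };

  if (A2 == 0.0) {
    // Degenerate case: the equation is linear in p'.
    if (B1 == 0.0)
      return std::nullopt;
    const double X = -C / B1;
    if (InRange(X))
      return LatchProbs{X, R * X};
    return std::nullopt;
  }

  const double Disc = B1 * B1 - 4.0 * A2 * C;
  if (Disc < 0.0)
    return std::nullopt; // no real roots

  const double S = std::sqrt(Disc);
  const double Roots[2] = {(-B1 + S) / (2.0 * A2), (-B1 - S) / (2.0 * A2)};
  for (double X : Roots)
    if (InRange(X))
      return LatchProbs{X, R * X};
  return std::nullopt;
}
```

The sketch returns the first root that gives valid probabilities. If both roots fall in range, which one to prefer is a design choice this comment does not settle.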
https://github.com/llvm/llvm-project/pull/179520