[PATCH] D79485: [BPI] Improve static heuristics for "cold" paths.

Wed Dec 23 04:27:49 PST 2020

yrouban accepted this revision.
yrouban added a comment.
This revision is now accepted and ready to land.

LGTM if my latest comments are addressed.
Let's give this big work a try.

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:65
+/// UNREACHABLE_WEIGHT) are evaluated according to static estimation and
+/// overrides profile information. If no branch probabilities were calculated
+/// on this step then take the next one.
----------------
override

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:70
+/// statically known information. Roots of such information are "cold",
+/// "unreachable" and "unwind" blocks. Predefined absolute weight is set for
+/// each such block and propagated to all blocks known to have the
----------------
noreturn blocks?

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:71
+/// "unreachable" and "unwind" blocks. Predefined absolute weight is set for
+/// each such block and propagated to all blocks known to have the
+/// same execution weight. In addition, if all successors have estimated weights
----------------
//each such// sounds strange. let's rephrase:
Those blocks get their weights set to BlockExecWeight::UNREACHABLE, BlockExecWeight::NORETURN, BlockExecWeight::UNWIND and BlockExecWeight::COLD respectively. Then the weights are propagated to the other blocks up the domination line.

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:75
+/// is not ideal heuristic in theory it's simple and works reasonably well in
+/// most cases) and the process repeats. All blocks with no explicit weight set
+/// are assumed to have default weight. Weights of loops' back branches are
----------------
We should say about default weight in context of branch probability calculation.
//... the process repeats. Once the process of weights propagation converges we set branch probabilities. It is done for all blocks with at lest one successor with its weight set. For successors without weights we use the default execution weight (BlockExecWeight::DEFAULT). For loop back branches we use their weights scaled by ...//

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:84
+///
+///          BR1
+///         /   \
----------------
may be rename BR1 to BB1 and BR2 to BB2? 'R' seems redundant.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:139-140
+  COLD = 0xffff,
+  /// Weight associated with all blocks by default if no other information is
+  /// available.
+  DEFAULT = 0xfffff
----------------
I would write that the default weight is used to calculate branch probability of edges which destination blocks does not have their estimated execution weight. The default weight is not set as estimated execution weight and thus is not propagated through domination line.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:347
+        EstimatedWeight.getValue() <=
+            static_cast<uint32_t>(BlockExecWeight::UNREACHABLE))
       UnreachableIdxs.push_back(I - 1);
----------------
why do we need this static_cast? they are so many in the code. can we get rid of them?

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:690
+
+  for (const auto *DTNode = DTStartNode; DTNode != nullptr;
+       DTNode = DTNode->getIDom()) {
----------------
Please add a TODO: In addition to this propagation up the domination line consider propagating down the domination line.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:719
+    for (const auto &I : *BB)
+      if (const CallInst *CI = dyn_cast<CallInst>(&I))
+        if (CI->hasFnAttr(Attribute::NoReturn))
----------------
does it make sense to reverse iterate? noreturn calls are likely close to block ends

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:778
+  do {
+    while (!LoopWorkList.empty()) {
+      const LoopBlock LoopBB = LoopWorkList.pop_back_val();
----------------
please add comment:
  // Process loops and blocks. Order is not important.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:825
+// Note that gathered weights were not scaled for loops. Thus edges entering
+// and exiting loops requires special processing.
+bool BranchProbabilityInfo::calcEstimatedHeuristics(const BasicBlock *BB) {
----------------
... require

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:854-858
+      Weight =
+          Weight.getValueOr(static_cast<uint32_t>(BlockExecWeight::DEFAULT)) /
+          TC;
+      if (Weight.getValue() == static_cast<uint32_t>(BlockExecWeight::ZERO))
+        Weight = static_cast<uint32_t>(BlockExecWeight::LOWEST_NON_ZERO);
----------------
shorter
  Weight = std::max(BlockExecWeight::LOWEST_NON_ZERO, Weight.getValueOr(static_cast<uint32_t>(BlockExecWeight::DEFAULT) / TC);

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:865
+      // 'Unlikely' blocks have twice lower weight.
+      Weight =
+          Weight.getValueOr(static_cast<uint32_t>(BlockExecWeight::DEFAULT)) /
----------------
Weight = std::max(BlockExecWeight::LOWEST_NON_ZERO, ...);

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:881

-  if (uint32_t numUnlikelyEdges = UnlikelyEdges.size()) {
-    BranchProbability UnlikelyProb = BranchProbability(LBH_UNLIKELY_WEIGHT,
-                                                       Denom);
-    auto Prob = UnlikelyProb / numUnlikelyEdges;
-    for (unsigned SuccIdx : UnlikelyEdges)
-      EdgeProbabilities[SuccIdx] = Prob;
+  // If non of blocks have estimated weight bail out.
+  // If TotalWeight is 0 that means weight of each successor is 0 as well and
----------------
none

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:882
+  // If non of blocks have estimated weight bail out.
+  // If TotalWeight is 0 that means weight of each successor is 0 as well and
+  // equally likely. Bail out early to not deal with devision by zero.
----------------
If TotalWeight is zero then weight of ...

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:116
+/// Weight to an 'unreachable' block.
+static const uint32_t UNREACHABLE_WEIGHT = ZERO_WEIGHT;
+
----------------
ebrevnov wrote:
> yrouban wrote:
> > static_assert(UNREACHABLE_WEIGHT< VERY_LOW_WEIGHT)?
> Why? Currently, this assert holds but there is nothing preventing us from increasing UNREACHABLE_WEIGHT if needed by the cost model.
The word //unreachable// means 'cannot be reached' and thus cannot be executed. How this execution weight be anything except zero? Any non-zero weight for unreachable successors would result in the sum of probabilities of all reachable successors be less than 1.0. I do not insist though.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:758
+    if (!BBWeight)
+      for (const auto *Pred : predecessors(BB))
+        if (Pred)
----------------
ebrevnov wrote:
> yrouban wrote:
> > for the //unwind// property we do not need to iterate over all predecessors.
> > if block has one unwind predecessors then all its predecessors must be unwind (that is because the block must start with a landing pad). I would suggest to commit this and check only one of the predecessors.
> Then it will always stop on the first predecessor, right? May unwind have more than one predecessor at all?
I believe that unwind blocks can have more than one predecessor as one catch block for several callsites.

================
Comment at: llvm/test/Analysis/BranchProbabilityInfo/basic.ll:156-157
   br i1 %cond2, label %header, label %exit
-; CHECK: edge body -> header probability is 0x40000000 / 0x80000000 = 50.00%
-
+;CHECK: edge body -> header probability is 0x7fbe1203 / 0x80000000 = 99.80% [HOT edge]
+;CHECK: edge body -> exit probability is 0x0041edfd / 0x80000000 = 0.20%
 exit:
----------------
space between ; and CHECK

================
Comment at: llvm/test/Analysis/BranchProbabilityInfo/basic.ll:275
 ; The cold call heuristic should not kick in when the cold callsite is in EH path.
-; CHECK:  edge if.then -> invoke.cont probability is 0x7ffff800 / 0x80000000 = 100.00% [HOT edge]
-; CHECK:  edge if.then -> lpad probability is 0x00000800 / 0x80000000 = 0.00%
+;CHECK: edge if.then -> invoke.cont probability is 0x7ffff800 / 0x80000000 = 100.00% [HOT edge]
+;CHECK: edge if.then -> lpad probability is 0x00000800 / 0x80000000 = 0.00%
----------------
space

================
Comment at: llvm/test/Analysis/BranchProbabilityInfo/deopt-invoke.ll:43
+define void @test2(i32 %0) personality i32* ()* @"personality_function" {
+;CHECK: edge entry -> unreached probability is 0x00000000 / 0x80000000 = 0.00%
+;CHECK: edge entry -> invoke probability is 0x80000000 / 0x80000000 = 100.00% [HOT edge]
----------------
space

================
Comment at: llvm/test/Analysis/BranchProbabilityInfo/loop.ll:528
+define void @test13() {
+;CHECK: edge entry -> loop probability is 0x078780e3 / 0x80000000 = 5.88%
+;CHECK: edge entry -> exit probability is 0x78787f1d / 0x80000000 = 94.12% [HOT edge]
----------------
space

================
Comment at: llvm/test/Analysis/BranchProbabilityInfo/loop.ll:552
+define void @test14() {
+;CHECK: edge entry -> preheader probability is 0x078780e3 / 0x80000000 = 5.88%
+;CHECK: edge entry -> exit probability is 0x78787f1d / 0x80000000 = 94.12% [HOT edge]
----------------
space please

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79485/new/

https://reviews.llvm.org/D79485