[PATCH] D79485: [BPI] Improve static heuristics for "cold" paths.

Fri Dec 11 03:12:30 PST 2020

yrouban added inline comments.
Herald added a subscriber: pengfei.

================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:383
+
+  /// Iterates over all edges leading from \p SrcBB to \p Successors and
+  /// returns maximum of all estimated weights. If at least one edge has unknown
----------------
There could be no edge from SrcBB to Successor, but there must be a loop edge. Please rephrase a bit.
... edges leading from loop of SrcBB to loops of Successors' block ...

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:628
+  SmallVector<uint32_t, 4> Weights;
+  Optional<uint32_t> MaxWeight;
+  for (const BasicBlock *DstBB : Successors) {
----------------
I would start with the VERY_LOW_WEIGHT. That is if there is no loop exits then the loop is infinite, that is it can be entered at most once.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:690
+
+  for (const auto *DTNode = DTStartNode; DTNode != nullptr;
+       DTNode = DTNode->getIDom()) {
----------------
It seems we have to run a similar loop for the PostDom tree.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:702
+    // Don't propagate weight to blocks belonging to different loops.
+    if (!isLoopEnteringExitingEdge(Edge)) {
+      if (!updateEstimatedBlockWeight(DomLoopBB, BBWeight, WorkList,
----------------
1. if Edge is a loop exiting edge then we can set weight of the loop of DomBB as it lays on the same dom-postdom line with BB. Then we do not need to add the loop to the worklist but have to add its entering blocks to BlockWorkList.
2. Avoid checking for loop exiting condition twice:
```
   if (isLoopExitingEdge(Edge)) // Check before isLoopEntering() as it might be also true.
     LoopWorkList.push_back(DomLoopBB);
   if (isLoopEnteringEdge(Edge))
     ; // Nothing todo. Why?
   else if (!updateEstimatedBlockWeight(...))
     break;
```
Please comment that we do not update block weights for blocks that are not in the same loop with BB. Instead we update their loop weights.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:741-743
+    // Important note regarding the order of checks. They are ordered by weight
+    // from lowest to highest. Doing that allows to avoid "unstable" results
+    // when several conditions heuristics can be applied simultaneously.
----------------
I'm not sure if this place is right but I suggest that we have a long description of the //Estimated Execution Weight//. As its definition starts from the basic blocks weight, I think this place is ok.
.........................................
Lets introduce a notion of Estimated Execution Weight (or just weight). It is defined in several steps.
It will be used to calculate branch probabilities according to the following rule:
if all block successors have their Estimated Execution Weights defined then the branch probabilities can be calculated as if the weights are just !prof branch_weights metadata:
BranchProbability[i] = BranchWeight [i]/( BranchWeight[0] + BranchWeight[1] + ... + BranchWeight[N-1])

Note that the Estimated Execution Weights are not mixed with the profile counters or with the branch_weights metedata. So, it is important for the rule to have all successors’ Estimated Execution Weights defined. If at least one branch undefined then the rule is not applicable.

Definition of Estimated Execution Weight for basic blocks is based on their properties:
Weight is defined for 4 kinds of basic blocks:
-	Unreachable (weight=0) - if the block terminator is the unreachable instruction;
-	Noreturn (weight=1) - if the block has a noreturn call;
-	Unwind (weight=1) - if the block is an unwind branch target (i.e. a landingpad block);
-	Cold (weight=65535) - if the block has a cold call;
These properties are listed in their priority to define the block weight. E.g. if a block is Cold and Noreturn than the Noreturn property defines its weight=1.

Example:
Weights {Cold, Unwind, Unreachable} =>
Probabilities {Cold / (Cold + Unwind + Unreachable), Unwind / (Cold + Unwind + Unreachable), Unreachable /( Cold + Unwind + Unreachable)} = 
{65535/ (65535 + 1 + 0), 1/ (65535 + 1 + 0), 0/(65535 + 1 + 0)}

It is worth mention that this block weight does not depend on number of block’s predecessors. In other words, these properties (//unreachable//, //noreturn//, //unwind//, //cold//) do not become any weaker or stronger if the block has more predecessors. I.e. any edge that ends with such block has its weight defined according to one of the properties of this block. 
Example:
```
       BB1       BB2
      /   \     /   \
ColdBB3   ColdBB4   UnwindBB5

```
The block ColdBB4 has the weight Cold as the block ColdBB3.
Probability(BB1->ColdBB3) = Probability(BB1->ColdBB4) = 50%
Probability(BB2->ColdBB4) = Cold / (Cold + Unwind)
Probability(BB2->UnwindBB5) =  Unwind / (Cold + Unwind)

.........................................
... to be continued for loop weights and edge weights

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:758
+    if (!BBWeight)
+      for (const auto *Pred : predecessors(BB))
+        if (Pred)
----------------
for the //unwind// property we do not need to iterate over all predecessors.
if block has one unwind predecessors then all its predecessors must be unwind (that is because the block must start with a landing pad). I would suggest to commit this and check only one of the predecessors.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:776
+
+    if (BBWeight) {
+      // If we were able to find estimated weight for the block set it to this
----------------
nit: remove { }

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:787
+  // successor/exit having estimated weight. Try to propagate weight to such
+  // blocks/loops from successors/exits.
+  while (!BlockWorkList.empty() || !LoopWorkList.empty()) {
----------------
.. BlockWorkList ...
LoopWorkList has a priority over BlockWorklist. Why?

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:790
+
+    if (!LoopWorkList.empty()) {
+      const LoopBlock LoopBB = LoopWorkList.pop_back_val();
----------------
'while' would be better and the last 'continue' would not be needed.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:793
+
+      if (EstimatedLoopWeight.count(LoopBB.getLoopData()))
+        continue;
----------------
why should we avoid calculating the loop weight? what if we find another weight for this loop?

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:799
+      auto MaxWeight = getMaxEstimatedEdgeWeight(
+          LoopBB, make_range(Exits.begin(), Exits.end()));
+
----------------
I would rename it to //LoopWeight// which is calculated as max of weights of loop exits but at least VERY_LOW_WEIGHT. This is the definition of the //loop weight//. Why/how is it relevant to block weights?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79485/new/

https://reviews.llvm.org/D79485