[PATCH] D79485: [BPI] Improve static heuristics for "cold" paths.

Mon Aug 10 21:50:09 PDT 2020

davidxl added a comment.

A lot of test changes can probably be extracted out as NFC (to make the string test more robust) -- this will reduce the size of the patch.

Do you have performance numbers related to the change?

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:126
+/// with attribute 'cold'.
+static const uint32_t COLD_WEIGHT = 0xffff;
+
----------------
ebrevnov wrote:
> davidxl wrote:
> > why is it set to 0xffff?
> Short answer is to preserve same branch probability as before.
> Old implementation assigned CC_NONTAKEN_WEIGHT weight to "normal" path and  CC_TAKEN_WEIGHT to "cold" path. Absolute values are irrelevant and ratio CC_NONTAKEN_WEIGHT/CC_TAKEN_WEIGHT = 16 defines relative probability of "normal" and "cold" branches. New implementation uses  pre-selected weight for all "nornal" paths DEFAULT_WEIGHT. Thus relative ratio is calculated as DEFAULT_WEIGHT/COLD_WEIGHT = 0xfffff/0xffff = 16.
Why not define a value for DEFAULT_WEIGHT, and define COLD_WEIGHT to be 1/16 of the DEFAULT weight? It is more readable that way.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:643
+
+bool BranchProbabilityInfo::updateEstimatedBlockWeight(
+    LoopBlock &LoopBB, uint32_t BBWeight,
----------------
document the method and params.

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:651
+      EstimatedBlockWeight.insert({BB, BBWeight});
+  if (IsInserted)
+    for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
----------------
explain here. Why the early estimated weight takes precedence?

================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:731
+  // By doing RPO we make sure that all predecessors already have weights
+  // calculated before visiting theirs successors.
+  ReversePostOrderTraversal<const Function *> RPOT(&F);
----------------
why doing RPO walk? the algorithm does not seem to depend on this order.

================
Comment at: llvm/test/Analysis/BlockFrequencyInfo/redundant_edges.ll:12

-; CHECK-NEXT: loop: float = 32.0
+; CHECK-NEXT: loop: float = 16.5
 loop:
----------------
why is the new estimation better?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79485/new/

https://reviews.llvm.org/D79485