[PATCH] D79485: [BPI] Improve static heuristics for "cold" paths.

Yevgeny Rouban via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 12 04:05:21 PDT 2020


yrouban added a comment.

I'm still reading. The patch is long, but I'd like to avoid splitting it into pieces, which might result in massive test changes (back and forth). As it stands, the test changes seem compact.
It would be great if someone could join the reviewing efforts.



================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:379
+
+  /// Return estimated weight for \p Edge. Returns None if estimated weight is
+  /// unknown.
----------------
Returns


================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:391
+
+  /// If \p LoopBB has no estimated weight then set it to \p BBWeight and
+  /// return true. Otherwise \p BB's weight remains unchanged and false is
----------------
... then sets ... and returns true.


================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:393
+  /// return true. Otherwise \p BB's weight remains unchanged and false is
+  /// returned. In addition all blocks/loops potentially affected by \p BB's
+  /// weight change are put into BlockWorkList/LoopWorkList.
----------------
change: ... all blocks/loops potentially affected by ...
to: ... all blocks/loops that might need their weight re-estimated after ...
or something similar in meaning.


================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:399
+
+  /// Starting from \p LoopBB (including \p LoopBB itself) propagate \p BBWeight
+  /// up the domination tree.
----------------
... propagates ...


================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:410
+
+  /// Based on computed weights by \p computeEstimatedBlockWeight set
+  /// probabilities on branches.
----------------
... sets ...


================
Comment at: llvm/include/llvm/Analysis/BranchProbabilityInfo.h:208
+  uint32_t getSccBlockInfo(const BasicBlock *BB, int SccNum) const;
+  void setSccBlockInfo(const BasicBlock *BB, int SccNum);
+  bool isSCCHeader(const BasicBlock *BB, int SccNum) const {
----------------
john.brawn wrote:
> Probably calculateSccBlockInfo would be better here.
Agreed; a simple in-place definition would simplify reading.


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:113
+/// This is minimal possible non zero weight.
+static const uint32_t VERY_LOW_WEIGHT = 0x1;
+
----------------
Maybe rename it to LOWEST_NON_ZERO_WEIGHT or ULP_WEIGHT to explicitly denote its meaning?


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:116
+/// Weight to an 'unreachable' block.
+static const uint32_t UNREACHABLE_WEIGHT = ZERO_WEIGHT;
+
----------------
static_assert(UNREACHABLE_WEIGHT < VERY_LOW_WEIGHT)?


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:131
+/// COLD_WEIGHT have meaning relatively to each other and DEFAULT_WEIGHT only.
+static const uint32_t DEFAULT_WEIGHT = 0xfffff;
 
----------------
I suggest introducing UNKNOWN_WEIGHT for most of the current DEFAULT_WEIGHT uses, to better reflect its meaning and the way it is treated.
DEFAULT_WEIGHT should then be substituted for UNKNOWN_WEIGHT only at the very last step of the BPI calculation (after propagation).
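A sketch of what I mean (illustrative only; UNKNOWN_WEIGHT and the exact lookup here are my invention, not patch code):

  // Blocks keep UNKNOWN_WEIGHT while weights are being propagated ...
  uint32_t Weight = getEstimatedBlockWeight(BB).getValueOr(UNKNOWN_WEIGHT);
  // ... and DEFAULT_WEIGHT is substituted only when the final branch
  // probabilities are set.
  if (Weight == UNKNOWN_WEIGHT)
    Weight = DEFAULT_WEIGHT;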



================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:600
+Optional<uint32_t>
+BranchProbabilityInfo::getEstimatedBlockWeight(const BasicBlock *BB) const {
+  auto WeightIt = EstimatedBlockWeight.find(BB);
----------------
could be defined along with its declaration for ease of reading
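For example, something like this directly in the header (a sketch only; the body is my guess at the same map lookup, not copied from the patch):

  Optional<uint32_t> getEstimatedBlockWeight(const BasicBlock *BB) const {
    auto WeightIt = EstimatedBlockWeight.find(BB);
    if (WeightIt == EstimatedBlockWeight.end())
      return None;
    return WeightIt->second;
  }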


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:608
+Optional<uint32_t>
+BranchProbabilityInfo::getEstimatedLoopWeight(const LoopData &L) const {
+  auto WeightIt = EstimatedLoopWeight.find(L);
----------------
could be defined along with its declaration for ease of reading


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:616
+Optional<uint32_t>
+BranchProbabilityInfo::getEstimatedEdgeWeight(const LoopEdge &Edge) const {
+  // For edges entering a loop take weight of a loop rather than an individual
----------------
could be defined along with its declaration for ease of reading


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:654
+  BasicBlock *BB = LoopBB.getBlock();
+  std::tie(std::ignore, IsInserted) =
+      EstimatedBlockWeight.insert({BB, BBWeight});
----------------
early return would be easier to read and would not need std::tie and std::ignore:

  if (!EstimatedBlockWeight.insert({BB, BBWeight}).second)
    return false;
  ...
  return true;


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:657-658
+  if (IsInserted)
+    for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
+      auto *PredBlock = *PI;
+      LoopBlock PredLoop = getLoopBlock(PredBlock);
----------------
the idiom:
  for (BasicBlock *P : predecessors(BB))


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:674
+// Importantly, we skip loops here for two reasons. First weights of blocks in
+// a loop should be scaled by TC (yet possibly unknown). Second there is no
+// any value in doing that because that doesn't give any additional information
----------------
change TC to trip count


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:684
+    const LoopBlock &LoopBB, DominatorTree *DT, PostDominatorTree *PDT,
+    uint32_t BBWeight, SmallVectorImpl<BasicBlock *> &WorkList,
+    SmallVectorImpl<LoopBlock> &LoopWorkList) {
----------------
Rename WorkList to BlockWorkList, as it is named in computeEstimatedBlockWeight().


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:703
+    if (!isLoopEnteringExitingEdge(Edge)) {
+      if (!updateEstimatedBlockWeight(DomLoopBB, BBWeight, WorkList,
+                                      LoopWorkList))
----------------
Could this result in deep recursion? Maybe a worklist would be better?
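A rough sketch of the worklist alternative (purely illustrative; the names and the exact dominator-tree step are my assumptions, not the patch code):

  // Propagate the weight up the dominator tree iteratively instead of
  // recursing.
  SmallVector<BasicBlock *, 8> Pending;
  Pending.push_back(LoopBB.getBlock());
  while (!Pending.empty()) {
    BasicBlock *Cur = Pending.pop_back_val();
    LoopBlock CurLoopBB = getLoopBlock(Cur);
    if (!updateEstimatedBlockWeight(CurLoopBB, BBWeight, WorkList, LoopWorkList))
      continue; // Weight already known; nothing more to propagate from here.
    if (auto *IDom = DT->getNode(Cur)->getIDom())
      Pending.push_back(IDom->getBlock());
  }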


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:705
+                                      LoopWorkList))
+        // If DomBB has weight set then all it's predecessor are already
+        // processed (since we propagate weight up to the top of IR each time).
----------------
predecessors


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:716
+// 'unreachable', 'noreturn', 'cold', 'unwind' blocks. In addition it does its
+// best to propogate the weight to up/down the IR.
+void BranchProbabilityInfo::computeEstimatedBlockWeight(
----------------
propagate


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:717
+// best to propogate the weight to up/down the IR.
+void BranchProbabilityInfo::computeEstimatedBlockWeight(
+    const Function &F, DominatorTree *DT, PostDominatorTree *PDT) {
----------------
could be renamed to //estimateBlockWeights()// (plural)


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:723
+  // Returns true if \p BB has call marked with "NoReturn" attribute.
+  auto IsNeverReturns = [&](const BasicBlock *BB) {
+    if (BB->getTerminator()->getNumSuccessors() > 0)
----------------
'is never returns' sounds strange. Not sure, but maybe IsNeverReturn? IsNeverReturning? IsDeadEnd?
The function is used only once and only with blocks of two kinds:
1. blocks whose terminator instruction is preceded by a deoptimize call, so the terminator must be the ret instruction;
2. blocks with an //unreachable// terminator instruction.
Both cases imply the block has no successors. So it is better to have
  assert(BB->getTerminator()->getNumSuccessors() == 0)
Otherwise (for a generic function) there could be a question: why does this condition have a lower priority than a call with Attribute::NoReturn?
  call @foo() noreturn
  br label %next



================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:738
+  ReversePostOrderTraversal<const Function *> RPOT(&F);
+  for (const auto *BB : RPOT) {
+    Optional<uint32_t> BBWeight;
----------------
If we extracted a lambda //estimateBlockWeight// (which is worth making a separate member function with its own description), then the structure would be concise:

  auto estimateBlockWeight = [&](const BasicBlock *BB) -> Optional<uint32_t> {
    if (isa<UnreachableInst>(BB->getTerminator()) || BB->getTerminatingDeoptimizeCall())
      return IsNeverReturns(BB) ? NORETURN_WEIGHT : UNREACHABLE_WEIGHT;

    for (const auto *Pred : predecessors(BB))
      if (const auto *II = dyn_cast<InvokeInst>(Pred->getTerminator()))
        if (II->getUnwindDest() == BB)
          return UNWIND_WEIGHT;

    for (const auto &I : *BB)
      if (const CallInst *CI = dyn_cast<CallInst>(&I))
        if (CI->hasFnAttr(Attribute::Cold))
          return COLD_WEIGHT;

    return None;
  };

  ReversePostOrderTraversal<const Function *> RPOT(&F);
  for (const auto *BB : RPOT)
    if (auto BBWeight = estimateBlockWeight(BB))
      propagateEstimatedBlockWeight(getLoopBlock(BB), DT, PDT,
                                    BBWeight.getValue(), BlockWorkList,
                                    LoopWorkList);



================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:750-752
+      BBWeight = UNREACHABLE_WEIGHT;
+      if (IsNeverReturns(BB)) {
+        BBWeight = NORETURN_WEIGHT;
----------------
  BBWeight = IsNeverReturns(BB) ? NORETURN_WEIGHT : UNREACHABLE_WEIGHT;



================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:759
+      for (const auto *Pred : predecessors(BB))
+        if (Pred)
+          if (const auto *II = dyn_cast<InvokeInst>(Pred->getTerminator()))
----------------
The //if (Pred)// check is not needed — predecessors(BB) never yields null.


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:844
+  uint32_t TC = LBH_TAKEN_WEIGHT / LBH_NONTAKEN_WEIGHT;
+  if (LoopBB.getLoop()) {
+    computeUnlikelySuccessors(BB, LoopBB.getLoop(), UnlikelyBlocks);
----------------
remove braces { .. }


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:643
+
+bool BranchProbabilityInfo::updateEstimatedBlockWeight(
+    LoopBlock &LoopBB, uint32_t BBWeight,
----------------
ebrevnov wrote:
> ebrevnov wrote:
> > davidxl wrote:
> > > document the method and params.
> > sure
> 
> In fact, there is documentation in the header file. In the source file I can put more details on the behavior in case of a rewrite attempt, as you requested below.
I would remove this doc from the definition (to not deviate from the doc at the declaration) and put the note inside the body.


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:651
+      EstimatedBlockWeight.insert({BB, BBWeight});
+  if (IsInserted)
+    for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
----------------
ebrevnov wrote:
> davidxl wrote:
> > explain here. Why the early estimated weight takes precedence?
> will add a comment.
Please explain what happens if the previous weight is not equal to the new one; maybe assert(PrevWeight <= BBWeight)?


================
Comment at: llvm/lib/Analysis/BranchProbabilityInfo.cpp:126
+/// with attribute 'cold'.
+static const uint32_t COLD_WEIGHT = 0xffff;
+
----------------
ebrevnov wrote:
> davidxl wrote:
> > ebrevnov wrote:
> > > davidxl wrote:
> > > > why is it set to 0xffff?
> > > Short answer: to preserve the same branch probability as before.
> > > The old implementation assigned CC_NONTAKEN_WEIGHT to the "normal" path and CC_TAKEN_WEIGHT to the "cold" path. Absolute values are irrelevant; the ratio CC_NONTAKEN_WEIGHT/CC_TAKEN_WEIGHT = 16 defines the relative probability of the "normal" and "cold" branches. The new implementation uses the pre-selected weight DEFAULT_WEIGHT for all "normal" paths. Thus the relative ratio is calculated as DEFAULT_WEIGHT/COLD_WEIGHT = 0xfffff/0xffff = 16.
> > Why not define a value for DEFAULT_WEIGHT, and define COLD_WEIGHT to be 1/16 of the DEFAULT weight? It is more readable that way.
> This is just the way weights were defined before (using direct numbers instead of ratios). Please note that all the other cases (like ZH_TAKEN_WEIGHT, ZH_NONTAKEN_WEIGHT) don't use ratios either. Another theoretical reason could be the ability to represent ratios with a non-zero remainder.
1. Group the new related *_WEIGHT constants into one enum (to distinguish them from the other *_WEIGHT constants).
2. Explicitly describe their values (e.g. why COLD_WEIGHT is X times as high as LOWEST_NONZERO_WEIGHT).
3. Avoid hex notation unless it makes sense (e.g. it is important for bitwise operations).
Something like the following:
  enum {
    ZERO_WEIGHT = 0,
    UNREACHABLE_WEIGHT = ZERO_WEIGHT,
    LOWEST_NONZERO_WEIGHT = 1,
    NORETURN_WEIGHT = LOWEST_NONZERO_WEIGHT,
    UNWIND_WEIGHT = LOWEST_NONZERO_WEIGHT,
    COLD_WEIGHT = 65535 * LOWEST_NONZERO_WEIGHT,
    DEFAULT_WEIGHT = 16 * COLD_WEIGHT
  };


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79485/new/

https://reviews.llvm.org/D79485


