[PATCH] D10825: Improvement on computing edge probabilities when choosing the best successor in machine block placement.

Fri Jul 24 19:59:10 PDT 2015

chandlerc added inline comments.

================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:365-371
@@ -364,4 +364,9 @@
   uint32_t WeightScale = 0;
   uint32_t SumWeight = MBPI->getSumForBlock(BB, WeightScale);
-  DEBUG(dbgs() << "Attempting merge from: " << getBlockName(BB) << "\n");
+
+  // Adjust sum of weights by excluding weights on edges pointing to blocks that
+  // cannot be BB's successor in the chain.
+  uint32_t AdjustedSumWeight = SumWeight;
+  SmallVector<MachineBasicBlock *, 4> Successors;
   for (MachineBasicBlock *Succ : BB->successors()) {
+    bool SkipSucc = false;
----------------
This is made incredibly annoying because we don't rescale the branch probabilities in MBPI the way we do in BPI, and so we don't know for a fact that they fit in 32-bits.

I think we need to fix MBPI to automatically scale all the weights so that the sum of weights is known to fit in 32-bits. Once that's done, WeightScale and all the code in getSumForBlock to rescale things goes away.

Then, here, you should change this to just compute the sum directly rather than getting the sum from MBPI and then adjusting it after-the-fact.

Also, I would just embed the predicate into a lambda, and write two loops over BB->successors() rather than building up a vector of successors.

================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:381-383
@@ +380,5 @@
+        SkipSucc = true;
+      } else if (Succ != *SuccChain->begin()) {
+        DEBUG(dbgs() << "    " << getBlockName(Succ) << " -> Mid chain!\n");
+        SkipSucc = true;
+      }
----------------
This is a bigger change than the rest, and I think it should go in separately.

The rest of this change is a clear bug-fix for how this code works, and will make the "hot" detection actually reasonable below. This change, however, is shifting a core thing to prefer to outline all optional loop bodies which don't have fallthrough into them. That, to me, is a very significant change, and I'd rather look at it (and test cases and benchmarks showing it is clearly the right call) independently of fixing how we compute the probability of 1 of N successors.


http://reviews.llvm.org/D10825