[PATCH] D10717: Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.

Chandler Carruth chandlerc at gmail.com
Fri Jul 24 11:21:35 PDT 2015

chandlerc requested changes to this revision.
chandlerc added a comment.
This revision now requires changes to proceed.

Sorry for delays getting to this patch.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:84-86
@@ -83,1 +83,5 @@
+static cl::opt<bool> UseProfileBasedCostAnalysis(
+    "use-profile-based-cost-analysis",
+    cl::desc("Use profile data to guide machine block placement."),
+    cl::init(false), cl::Hidden);
This seems confusing to me. We use profile data to guide many parts of block placement. This flag doesn't guard most of them. Could you select a more accurate name?

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:89-92
@@ +88,6 @@
+static cl::opt<unsigned> MisfetchCost(
+    "misfetch-cost",
+    cl::desc("The cost of misfetch due to conditional/unconditional jumps."),
+    cl::init(1), cl::Hidden);
It might be useful to clarify that this cost models the probabalistic risk of an *instruction* misfetch due to a jump relative to falling through?

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:94-96
@@ +93,5 @@
+static cl::opt<unsigned>
+    JumpInstCost("jump-inst-cost", cl::desc("The cost of jump instructions."),
+                 cl::init(1), cl::Hidden);
We have lots of existing measures of the cost of an instruction already. It would probably help to clarify this is a weight specifically designed to fit with the scale of the misfetch cost above? Not really sure how important it is to be able to tune these two costs independently.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:269-271
@@ -252,2 +268,5 @@
   void buildCFGChains(MachineFunction &F);
+  bool useProfileData(MachineFunction &F) {
+    return UseProfileBasedCostAnalysis && F.getFunction()->getEntryCount();
+  }
My assumption is that the flag is expected to go away at some point. As such, I wouldn't embed a predicate here. I would sink this into the code that actually needs to test these properties to make decisions.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:815-826
@@ +814,14 @@
+/// With profile data, we could determine the benefit in terms of fall through
+/// opportunities when rotating a loop chain and select the best rotation.
+/// Basically, there are three kinds of benefit costs to consider for each
+/// rotation:
+///    1. The fall through edge (if it exists) from BB out of the loop to the
+///    loop header.
+///    2. The fall through edge (if it exists) from the loop exit to BB out of
+///    the loop.
+///    3. The fall through edge (if it exists) from the last BB to the first BB
+///    in the loop chain.
+///  Therefore, the benefit for a given rotation is 1 + 2 - 3. We select the
+///  best rotation with the largest benefit.
+void MachineBlockPlacement::rotateLoopWithProfile(
"we could determine" -> "we can determine"

Also, I find the mixture of "benefit" and "cost" confusing in this comment. I would try to use more clear terms here.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:831
@@ +830,3 @@
+  auto HeaderIter = std::find(LoopChain.begin(), LoopChain.end(), HeaderBB);
+  double BestRotationBenefit = -std::numeric_limits<double>::max();
+  auto RotationPos = LoopChain.end();
Please don't use floating point to compute these kinds of things.

We would like LLVM to not depend no the vagaries of the FPU and would like to avoid any possible numerical instability manifesting as non-deterministic compiles.

Instead, you should compute things using the BlockFrequency and BranchProbability abstract units which do the fixed-point math for you.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:846-849
@@ +845,6 @@
+      double FallThruBenefit = EdgeFreq * MisfetchCost;
+      // If the predecessor has only an unconditional jump to the header, we
+      // need to consider the cost of this jump.
+      if (Pred->succ_size() == 1)
+        FallThruBenefit += EdgeFreq * JumpInstCost;
+      HeaderFallThruBenefit = std::max(HeaderFallThruBenefit, FallThruBenefit);
Why is a conditional jump's benefit not considered?

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:854-856
@@ +853,5 @@
+  for (auto Iter = LoopChain.begin(), TailIter = std::prev(LoopChain.end()),
+            EndIter = LoopChain.end();
+       Iter != EndIter; Iter++, TailIter++) {
+    if (TailIter == LoopChain.end())
There is no comment explaining the actual algorithm at this point. Please add one. Especially with a loop structure this complex (three iterators??) it would help to give the reader some context for how this loop is supposed to operate.

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:868
@@ +867,3 @@
+    if (Iter == HeaderIter)
+      Benefit += HeaderFallThruBenefit;
This seems to be a *cost* rather than a benefit?

Comment at: lib/CodeGen/MachineBlockPlacement.cpp:877-879
@@ +876,5 @@
+          (!SuccChain || Succ == *SuccChain->begin())) {
+        auto EdgeFreq =
+            MBFI->getBlockFreq(TailBB) * MBPI->getEdgeProbability(TailBB, Succ);
+        ExitFallThruFreq = std::max(ExitFallThruFreq, EdgeFreq);
+      }
You should compute the max edge probability, and factor in the frequency outside of the loop.


More information about the llvm-commits mailing list