[PATCH] D73387: [MBP] Partial tail duplication into hot predecessors

Rong Xu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 28 14:53:28 PST 2020


xur added a comment.

Thanks for working on this. Current Duplication in block-placement does need improvement. We notice that disable it often improved performance.



================
Comment at: llvm/include/llvm/CodeGen/TailDuplicator.h:94
+      function_ref<void(MachineBasicBlock *)> *RemovalCallback = nullptr,
+      SmallVectorImpl<MachineBasicBlock *> *CandidatePtr = nullptr);
 
----------------
Need to update the comments for this function.
Nit: maybe move CandidatePtr before DuplicatePreds?


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3034
+      return false;
+    if (CandidatePreds.size() < BB->pred_size())
+      CandidatePtr = &CandidatePreds;
----------------
It seems to me that findDuplicateCandidates() will not let CandidatePreds to be full Preds (it will put the first one to fallthrough -- which makes this condition always true).


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3071
+
+// Returns true if BB is Pred's most possible successor.
+bool MachineBlockPlacement::isBestSuccessor(MachineBasicBlock *BB,
----------------
what does "most possible" mean here?


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3083
+
+  // Find the successor with largest probability.
+  BranchProbability BestProb = BranchProbability::getZero();
----------------
add "excluding BB".


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3103
+  BlockFrequency Gain = PredFreq * (BBProb - BestProb);
+  return Gain > DupThreshold.getFrequency() * countMBBInstruction(BB);
+}
----------------
Add some comments to explain the modeling. 
The same value is computed in findDuplicationCandidates() i.e. BBDupThreshold-- should we refactor the code to share the same heuristic?


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3106
+
+void MachineBlockPlacement::findDuplicateCandidates(
+    SmallVectorImpl<MachineBasicBlock *> &Candidates,
----------------
davidxl wrote:
> Add some description (and examples) showing the algorithm/cost model used here.
As David said, need to have cost model description here.

I give my first impression here:
(1) should you check if BB can be placed after a Pred? Pred could be in the middle of the chain.
(2) It seems that you assume the Succ BBs will be chosen as fall-through by different Dupped BBs.
 Should you check if SuccIt has already be placed? Also, if SuccIt can be duplicated, the assumption does not to hold.
(3) Pred fall-through can be handled better. this algorithm does not actually try to find the optimal fall through of BB. It passively get the fall-through when we cannot duplicate. Maybe we can refactor code better so the post fixup at line 3171 is not needed.


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3159
+
+    if (DupCost >= OrigCost)
+      continue;
----------------
I don't think this will ever be true.


================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3191
+
+  // FIXME: we may use profile count instead of frequency,
+  // and need more fine tuning.
----------------
I think using profile count will get you the same thing here. Profile count is derived from frequency. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73387/new/

https://reviews.llvm.org/D73387





More information about the llvm-commits mailing list