[PATCH] D73387: [MBP] Partial tail duplication into hot predecessors
Rong Xu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 28 14:53:28 PST 2020
xur added a comment.
Thanks for working on this. Current Duplication in block-placement does need improvement. We notice that disable it often improved performance.
================
Comment at: llvm/include/llvm/CodeGen/TailDuplicator.h:94
+ function_ref<void(MachineBasicBlock *)> *RemovalCallback = nullptr,
+ SmallVectorImpl<MachineBasicBlock *> *CandidatePtr = nullptr);
----------------
Need to update the comments for this function.
Nit: maybe move CandidatePtr before DuplicatePreds?
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3034
+ return false;
+ if (CandidatePreds.size() < BB->pred_size())
+ CandidatePtr = &CandidatePreds;
----------------
It seems to me that findDuplicateCandidates() will not let CandidatePreds to be full Preds (it will put the first one to fallthrough -- which makes this condition always true).
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3071
+
+// Returns true if BB is Pred's most possible successor.
+bool MachineBlockPlacement::isBestSuccessor(MachineBasicBlock *BB,
----------------
what does "most possible" mean here?
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3083
+
+ // Find the successor with largest probability.
+ BranchProbability BestProb = BranchProbability::getZero();
----------------
add "excluding BB".
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3103
+ BlockFrequency Gain = PredFreq * (BBProb - BestProb);
+ return Gain > DupThreshold.getFrequency() * countMBBInstruction(BB);
+}
----------------
Add some comments to explain the modeling.
The same value is computed in findDuplicationCandidates() i.e. BBDupThreshold-- should we refactor the code to share the same heuristic?
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3106
+
+void MachineBlockPlacement::findDuplicateCandidates(
+ SmallVectorImpl<MachineBasicBlock *> &Candidates,
----------------
davidxl wrote:
> Add some description (and examples) showing the algorithm/cost model used here.
As David said, need to have cost model description here.
I give my first impression here:
(1) should you check if BB can be placed after a Pred? Pred could be in the middle of the chain.
(2) It seems that you assume the Succ BBs will be chosen as fall-through by different Dupped BBs.
Should you check if SuccIt has already be placed? Also, if SuccIt can be duplicated, the assumption does not to hold.
(3) Pred fall-through can be handled better. this algorithm does not actually try to find the optimal fall through of BB. It passively get the fall-through when we cannot duplicate. Maybe we can refactor code better so the post fixup at line 3171 is not needed.
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3159
+
+ if (DupCost >= OrigCost)
+ continue;
----------------
I don't think this will ever be true.
================
Comment at: llvm/lib/CodeGen/MachineBlockPlacement.cpp:3191
+
+ // FIXME: we may use profile count instead of frequency,
+ // and need more fine tuning.
----------------
I think using profile count will get you the same thing here. Profile count is derived from frequency.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73387/new/
https://reviews.llvm.org/D73387
More information about the llvm-commits
mailing list