[PATCH] D147812: [InstrProf] Use BalancedPartitioning to order temporal profiling trace data
Ellis Hoag via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 2 14:24:44 PDT 2023
ellis added inline comments.
================
Comment at: llvm/lib/Support/BalancedPartitioning.cpp:200
+ unsigned NumNodes = std::distance(Nodes.begin(), Nodes.end());
+ DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeDegree;
+ for (auto &N : Nodes)
----------------
spupyrev wrote:
> How much overhead does `DenseMap` have in comparison with `std::vector`? I assume one could first find the maximum utility index and then use a vector instead of a map. Of course, that should only be done if there is visible speedup.
>From looking at `perf record` it seems like very little time is spent with `UtilityNodeDegree`. And since I'm renumbering the utility nodes below, we can't index by utilities at this point, so I think it's best to leave it as it is.
================
Comment at: llvm/lib/Support/BalancedPartitioning.cpp:212
+ // Renumber utility nodes so they can be used to index into Signatures
+ DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeIndex;
+ for (auto &N : Nodes)
----------------
spupyrev wrote:
> maybe add `UtilityNodeIndex.reserve(NumUtilities)`
> (and use the variable for signature initialization too)
We don't actually know how many utilities there are at this point because there could be duplicates.
================
Comment at: llvm/lib/Support/BalancedPartitioning.cpp:362
+ N.UtilityNodes.begin(), N.UtilityNodes.end(), 0.f,
+ [&](float Gain, auto &UN) { return Signatures[UN].CachedCostLR; });
+ } else {
----------------
spupyrev wrote:
> should it be `Gain + Signatures[UN].CachedCostLR` instead?
>
> Assuming I'm correct and this is a bug, can we add a test to catch the problem?
Thanks for the catch! I've added a simple unittest which would have caught this. I guess this explains why I saw a perf improvement. We are still spending the most time (~30%) in `moveGain()`, but I've switched back to a simpler implementation because I'm not seeing any gains from using `std::accumulate()`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147812/new/
https://reviews.llvm.org/D147812
More information about the llvm-commits
mailing list