[PATCH] D147812: [InstrProf] Use BalancedPartitioning to order temporal profiling trace data

Mon Apr 17 15:34:52 PDT 2023

ellis added a comment.

I've ran some performance analysis by running the `BalancedPartitioningTest.Large` test with different sizes.

  GTEST_FILTER="BalancedPartitioningTest.Large" perf record --call-graph lbr -- build/unittests/Support/SupportTests

I found that most of the time is spent in `moveGain()`, so I made several changes.

1. Switch from `double` to `float`.
2. Use `std::accumulate()` instead of a for loop to compute gain.
3. Pull out constant `FromLeftToRight` from loop.
4. Use a vector instead of a map for `Signatures`.
5. Separate `CachedCost` into two floats and a bool since `moveGain()` assumes the cache is valid.

================
Comment at: llvm/lib/Support/BalancedPartitioning.cpp:214
+  // is 4x larger than the number of FunctionNodes.
+  SignatureMapT Signatures(/*InitialReserve=*/NumNodes * 4);
+  for (auto &N : Nodes) {
----------------
spupyrev wrote:
> Do we want to get rid of the map? (and the weird 4x initialization)
I've removed the map because I found that it is more efficient to renumber these utility nodes so these signatures can be an array indexed by the utility node.

================
Comment at: llvm/lib/Support/BalancedPartitioning.cpp:366
+double
+BalancedPartitioning::computeGoal(const SignatureMapT &Signatures) const {
+  double Sum = 0;
----------------
spupyrev wrote:
> I think we've got rid of this method
I've deleted `computeGoal()`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147812/new/

https://reviews.llvm.org/D147812