[PATCH] D141077: [MemProf] Context disambiguation cloning pass [patch 3/4]

Wed Apr 26 12:59:21 PDT 2023

snehasish added a comment.

Looks good overall, still thinking if we can make assignFunctions less monolithic and easier to follow.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1347
 static std::string getMemProfFuncName(Twine Base, unsigned CloneNo) {
   if (!CloneNo)
     return Base.str();
----------------
Add a comment why? I believe we use CloneNo == 0 (as convention) to identify the original version thus returning the base name instead of adding the suffix.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2212
+      .emit(OptimizationRemark(DEBUG_TYPE, "MemprofAttribute", Call.call())
+            << NV("AllocationCall", Call.call()) << " in clone "
+            << NV("Caller", Call.call()->getFunction())
----------------
I had trouble finding what NV is. I think it would be clearer if we used ore::NV instead of a `using namespace ore;` directive above. Many uses within LLVM seem to retain the ore qualifier too [1].

[1] https://github.com/search?q=repo%3Allvm%2Fllvm-project%20ore%3A%3ANV&type=code

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2242
+  // Caller cannot be an allocation.
+  assert(CI);
+  assert(CI->Clones.size() > CallerCall.cloneNo());
----------------
Maybe inline the comment into the assert and add the reason it can't be an allocation? E.g. `assert(CI && "Caller cannot be an allocation because ... ");`

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2309
+template <typename DerivedCCG, typename FuncTy, typename CallTy>
+bool CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::assignFunctions() {
+  bool Changed = false;
----------------
Thanks for the detailed inline comments. Can we summarize the heuristic as pseudo-code in a function comment above the implementation too? 

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2314
+  // call.
+  std::map<ContextNode *, FuncInfo> CallsiteToCalleeFuncCloneMap;
+
----------------
Since we don't need the ordering properties here, can we switch to llvm::DenseMap? It would probably be faster and have lower memory overhead than std::map.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2326
+  // cloning for each of its calls.
+  for (auto &FuncEntry : FuncToCallsWithMetadata) {
+    FuncInfo OrigFunc(FuncEntry.first);
----------------
`for (auto &[Func, CallsWithMetadata] : FuncToCallsWithMetadata)` to make it clearer in the usage below.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2340
+      // Not having a call should have prevented cloning.
+      assert(Node->hasCall());
+
----------------
nit: inline the comment into the assert with &&

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2364
+      // Keep track of the clones of callsite Node that need to be assigned to
+      // function clones. This list may be expanded in the loop body below if we
+      // find additional cloning is required.
----------------
[optional] This sounds like we are implementing a worklist. The loop below uses indices and conditions based on them, I wonder if it would be clearer to restructure as a traditional worklist e.g. `while(!ClonesWorklist.empty()) { ...}`. I think some of the conditions inside can be rewritten and if the resulting code would be easier to follow. Not a major concern but wanted to get your opinion.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2385
+          // If this is the first callsite copy, assign to original function.
+          if (I == 0) {
+            // Since FuncClonesToCallMap is empty in this case, no clones have
----------------
Can we hoist the contents of this if condition outside the loop? Then we can start the iteration at index 1 and remove this check.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2432
+          // map.
+          assert(CloneNo > 0);
+          FuncInfo NewFuncClone = cloneFunctionForCallsite(
----------------
Inline the comment into the assert with &&?

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2492
+                continue;
+              if (!Callee->hasCall())
+                continue;
----------------
I think this condition is simple enough to be merged with the condition above with `if (Callee == Clone || !Callee->hasCall()) continue`.

================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2671
+      if (VerifyCCG) {
+        checkNode<DerivedCCG, FuncTy, CallTy>(Node, /*CheckEdges=*/true);
+        for (const auto &PE : Node->CalleeEdges)
----------------
Should we flip this default? It seems like the callsites with true outnumber the ones without.

================
Comment at: llvm/test/ThinLTO/X86/memprof-basic.ll:109

+attributes #0 = { noinline optnone }
+
----------------
I don't the addition of these attributes is necessary for this patch. Is this left over from the split or just added to make the test more resilient? Perhaps split them out and commit as NFC for the latter.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141077/new/

https://reviews.llvm.org/D141077