[PATCH] D140949: [MemProf] Context disambiguation cloning pass [patch 2/3]

Snehasish Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 10 14:49:47 PDT 2023


snehasish added inline comments.


================
Comment at: llvm/lib/Passes/PassBuilder.cpp:124
 #include "llvm/Transforms/IPO/OpenMPOpt.h"
+#include "llvm/Transforms/IPO/MemProfContextDisambiguation.h"
 #include "llvm/Transforms/IPO/PartialInlining.h"
----------------
I don't see anything used from this header in this file?


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:211
+        Clones.push_back(Clone);
         Clone->CloneOf = this;
       }
----------------
Should we assert that Clone->CloneOf is unset?
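If so, something like the following (just a sketch, reusing the names from the snippet above):

  assert(!Clone->CloneOf && "newly created clone already has a CloneOf");
  Clones.push_back(Clone);
  Clone->CloneOf = this;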


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1181
 
+      ContextNode *LastNode = CurNode;
+
----------------
I think this variable shadows the LastNode declared outside the loop nest. Is this intentional? If not, should we rename it?


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1942
+  // The recursive call to identifyClones may delete the current edge from the
+  // CallerEdges vector. Make a copy and iterate that, simpler than passing in
+  // an iterator and having recursive call erase from it. Other edges may also
----------------
nit: "iterate on that"?


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1946
+  // pointers (and are deleted later), so we skip those below.
+  auto CallerEdges = Node->CallerEdges;
+  for (auto &Edge : CallerEdges) {
----------------
Should we bound the scope of CallerEdges just to make it clear to the reader and avoid accidental use?

A C++20 init-statement would be nice, but we don't allow C++20 in LLVM yet.
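As a sketch of what I mean, an extra block scope would already do it:

  {
    // Copy, since the recursive calls below may erase from Node->CallerEdges.
    auto CallerEdges = Node->CallerEdges;
    for (auto &Edge : CallerEdges) {
      // ... existing loop body ...
    }
  }
  // The CallerEdges copy is no longer accessible past this point.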


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1949
+    // Skip any that have been removed by an earlier recursive call.
+    if (Edge->Callee == nullptr && Edge->Caller == nullptr)
+      continue;
----------------
Can we also drop this Edge from the Node->CallerEdges set if both Caller and Callee are set to null?
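E.g. something like this, assuming such edges can in fact still be present in the live Node->CallerEdges vector at this point (otherwise the `continue` alone is enough):

    // Skip (and prune) any that were removed by an earlier recursive call.
    if (Edge->Callee == nullptr && Edge->Caller == nullptr) {
      llvm::erase_if(Node->CallerEdges,
                     [&](const auto &E) { return E == Edge; });
      continue;
    }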


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1957
+
+  // Check if we reached an unambiguous call or have no more callers.
+  if (hasSingleAllocType(Node->AllocTypes) || Node->CallerEdges.size() <= 1)
----------------
The code below allows 1 caller? 


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1966
+  // vector.
+  auto AllocTypesMatch =
+      [](const std::vector<uint8_t> &InAllocTypes,
----------------
Can we make this functor a static function defined after `allocTypeToUse`? AllocTypesMatch doesn't depend on any function-local state, and it contains another functor; reducing the nesting would make it easier to read. It is also defined a bit far from where it's used.
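I.e. roughly this shape (the second parameter type is a guess from the lambda; if ContextEdge is a nested type of the templated graph class, this may need to be a template or a static member rather than a plain file-level static):

  static bool allocTypesMatch(const std::vector<uint8_t> &InAllocTypes,
                              const std::vector<std::shared_ptr<ContextEdge>> &Edges);

with the current lambda body (inner functor flattened into a plain loop) moved into the definition after `allocTypeToUse`.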


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1994
+  // that is arbitrary, and we assert in that case below.
+  constexpr unsigned AllocTypeCloningPriority[] = {/*None*/ 3, /*NotCold*/ 4,
+                                                   /*Cold*/ 1,
----------------
I think EnumeratedArray [1] is a good fit for this use case. It would also provide some protection against out-of-bounds access if the AllocType enum class changed.

[1] llvm-project/llvm/include/llvm/ADT/EnumeratedArray.h
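Roughly like the sketch below. The enumerators are placeholders, since the real AllocationType combinations may not map one-to-one onto enum values, and the remaining priority entries are elided:

  #include "llvm/ADT/EnumeratedArray.h"

  // Placeholder enum standing in for the alloc-type combinations being indexed.
  enum class AllocTypeKind : uint8_t { None, NotCold, Cold, Last = Cold };

  // Indexed directly by the enum; the LargestEnum template argument keeps the
  // array size in sync if enumerators are added later.
  EnumeratedArray<unsigned, AllocTypeKind, AllocTypeKind::Last>
      AllocTypeCloningPriority(0);
  AllocTypeCloningPriority[AllocTypeKind::None] = 3;
  AllocTypeCloningPriority[AllocTypeKind::NotCold] = 4;
  AllocTypeCloningPriority[AllocTypeKind::Cold] = 1;
  // ... remaining combination(s) as in the patch ...

One tradeoff: the table probably can't stay constexpr this way.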


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1998
+  std::stable_sort(Node->CallerEdges.begin(), Node->CallerEdges.end(),
+                   [&](const std::shared_ptr<ContextEdge> &A,
+                       const std::shared_ptr<ContextEdge> &B) {
----------------
We only need to capture AllocTypeCloningPriority here. Do you know if a more permissive capture has any overhead?
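For reference, the explicit form would just be (guessing at the comparison body; the point is only the capture list):

  std::stable_sort(Node->CallerEdges.begin(), Node->CallerEdges.end(),
                   [&AllocTypeCloningPriority](const std::shared_ptr<ContextEdge> &A,
                                               const std::shared_ptr<ContextEdge> &B) {
                     return AllocTypeCloningPriority[A->AllocTypes] <
                            AllocTypeCloningPriority[B->AllocTypes];
                   });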


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2013
+  // Iterate until we find no more opportunities for disambiguating the alloc
+  // types via cloning. In most case this loop will terminate once the Node
+  // has a single allocation type, in which case no more cloning is needed.
----------------
nit: cases


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2018
+  for (auto EI = Node->CallerEdges.begin(); EI != Node->CallerEdges.end();) {
+    auto Edge = *EI;
+
----------------
`auto CallerEdge` would make the code below a little easier to read.


================
Comment at: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:2027
+    // for this caller edge.
+    std::vector<uint8_t> CalleeEdgeAllocTypesForCallerEdge;
+    for (auto &CalleeEdge : Node->CalleeEdges)
----------------
nit: `CalleeEdgeAllocTypesForCallerEdge.reserve(Node->CalleeEdges.size());` to avoid reallocations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140949/new/

https://reviews.llvm.org/D140949


