[PATCH] D121179: [memprof] Store callsite metadata with memprof records.

Snehasish Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 9 09:38:50 PST 2022


snehasish marked 5 inline comments as done.
snehasish added a comment.

Thanks for the detailed review! PTAL.



================
Comment at: llvm/lib/ProfileData/RawMemProfReader.cpp:301
+  // keep track of related contexts so that we can fill these in later.
+  llvm::DenseMap<Function::GUID, llvm::SetVector<uint64_t>>
+      PerFunctionCallSites;
----------------
tejohnson wrote:
> Document what the uint64_t values are.
The comment here should really have been down below before the loop. Updated the comment since we now use a pointer to avoid another lookup in the map.


================
Comment at: llvm/lib/ProfileData/RawMemProfReader.cpp:324
+
+      // The leaf function is the first PC in the callstack so check if unset.
+      if (LeafFunction == 0) {
----------------
tejohnson wrote:
> Why can't the leaf function be an inline frame? I.e. what if we have:
> 
> foo() { malloc(); }
> bar() { foo(); }
> 
> and we inline foo into bar.
> 
> Won't the leaf be foo which will be an inline frame?
My usage of the term "leaf" is the cause of confusion, here I identify "last non-inlined" function in the callstack as leaf. Since we will annotate the profile post-inlining, this is the function where the records will be placed. In the example above, bar will have the memprof record entry. Let me know how you want me to reword this to make it clearer.

Also a note about ordering of entries in the callstack which represented by a list of MemProfRecord::Frames. For example

qux() { malloc(); }
inline foo() { qux(); }
bar() { foo(); }

is represented as:

bar (isInline: false); foo (isInline: true); qux (isInline:false)

as opposed to the expectation:

foo (isInline: true); bar (isInline: false); qux (isInline:false)

Based on your comment, this may be surprising since foo is the "leaf" function. I chose this representation to have deterministic access to the function id where the memprof record is to be applied, i.e. it is always at index 0. However, now that we have a map of function ids to memprof records in the reader (previously we just had a list of memprof records) we can change the order of to be more intuitive (i.e. on L339 we will append the frames in reverse). Wdyt?





================
Comment at: llvm/test/tools/llvm-profdata/memprof-basic.test:50
+CHECK-NEXT:     FunctionGUID: {{[0-9]+}} 
+CHECK-NEXT:     AllocSites:
 CHECK-NEXT:     -
----------------
tejohnson wrote:
> Can this test now also look for the CallSites?
The input for this test case only has main which calls malloc directly so there is no callsite data to add in this profile. I'll follow up with another llvm-profdata test which has some inlined allocation calls (also see discussion in leaf function comment).


================
Comment at: llvm/unittests/ProfileData/InstrProfTest.cpp:283
 
-TEST_F(InstrProfTest, test_memprof_invalid_add_record) {
-  llvm::memprof::MemProfRecord MR;
-  // At least one of the frames should be a non-inline frame.
-  MR.CallStack.push_back({0x123, 1, 2, true});
-  MR.CallStack.push_back({0x345, 3, 4, true});
-
-  auto CheckErr = [](Error &&E) {
-    EXPECT_TRUE(ErrorEquals(instrprof_error::invalid_prof, std::move(E)));
-  };
-  Writer.addRecord(MR, CheckErr);
-}
+// TEST_F(InstrProfTest, test_memprof_invalid_add_record) {
+//   llvm::memprof::MemProfRecord MR;
----------------
tejohnson wrote:
> Why is this test commented out now?
Meant to remove it altogether since the failure path is no longer exercised by this test. The error is now guarded by an assert in mapRawProfileToRecords L331.


================
Comment at: llvm/unittests/ProfileData/MemProfTest.cpp:189
+  EXPECT_THAT(Records[0].AllocSites[0].CallStack[1],
+              FrameContains("bar", 51U, 20U, true));
+
----------------
tejohnson wrote:
> Should this callstack now have 4 entries and can all be checked?
> 
> Additionally, should the record for foo now have CallSites since it shows up in the call stack for the allocation in CSM[0x2]?
Sure, added the check below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121179/new/

https://reviews.llvm.org/D121179



More information about the llvm-commits mailing list