[PATCH] D116784: [ProfileData] Read and symbolize raw memprof profiles.

Snehasish Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 13 16:15:31 PST 2022


snehasish added a comment.

PTAL, thanks!



================
Comment at: llvm/include/llvm/ProfileData/MemProf.h:27
+  std::vector<Frame> CallStack;
+  // TODO: Replace this with the entry format described in the RFC so
+  // that the InstrProfRecord reader and writer do not have to be concerned
----------------
tejohnson wrote:
> Can you elaborate on this and what will change?
I shared D117256 which has the implementation referred to in this TODO. I've added a wrapper around the MemInfoBlock. This wrapper makes it easy to read a prior version of a MemInfoBlock from binary data when the schema is provided. It allows us to specify on a per field basis what to write to (or read from) an indexed profile. PTAL, thanks!


================
Comment at: llvm/include/llvm/ProfileData/MemProf.h:42
+      OS << "    -\n"
+         << "      function: " << Frame.Function << "\n"
+         << "      line_offset: " << Frame.LineOffset << "\n"
----------------
davidxl wrote:
> To handle deep stacks, perhaps make one line per frame (with indentation to show level)?
That's a neat idea, I found out that YAML supports flow style (like JSON) so I can put Function, LineOffset, Column and Inline on the same line. I've put in a TODO for now since the YAML format is only used in tests (max stack depth = 3) and I don't have a unittest to verify the validity of the YAML we generate.


================
Comment at: llvm/include/llvm/ProfileData/MemProf.h:50
+
+    // TODO: Replace this once the format is updated to be version agnostic.
+#define PRINT(Field) OS << "      " #Field ": " << Info.Field << "\n"
----------------
davidxl wrote:
> This TODO needs more explanation.
In D117256 I replaced this print with a call to the printYAML method of the MemInfoBlock wrapper (PortableMemInfoBlock). There I used a macro based generation of code for what fields need to be printed out. Let me know if you want me to clarify further or any other changes in this patch.


================
Comment at: llvm/include/llvm/ProfileData/RawMemProfReader.h:80
+  Error initialize();
+  Error setupSymbolizer();
+  Error readRawProfile();
----------------
tejohnson wrote:
> Where is this defined/called?
Removed, leftover from code I deleted.


================
Comment at: llvm/include/llvm/ProfileData/RawMemProfReader.h:100
+  // Iterator to read from the ProfileData MapVector.
+  llvm::MapVector<uint64_t, MemInfoBlock>::iterator Next = ProfileData.end();
 };
----------------
tejohnson wrote:
> Nit: there are a bunch of locals also named "Next" in various methods. For clarity suggest renaming this to something more unique.
Good idea, renamed to Iter (and renamed a local which was Iter to I).


================
Comment at: llvm/lib/ProfileData/RawMemProfReader.cpp:183
+bool RawMemProfReader::hasFormat(const StringRef Path) {
+  auto BufferOr = MemoryBuffer::getFileOrSTDIN(Path, /*IsText=*/true);
+  if (!BufferOr)
----------------
tejohnson wrote:
> Why IsText = true for the raw file? Not sure of the implications.
Good catch, this param controls whether line endings are detected when the file is read [1]. Since we are working with binary files it doesn't seem to have any impact though. Updated to remove the param to default to false.

[1] https://github.com/llvm/llvm-project/blob/345223a7be3c3c93a10f8453d96287a3a03b6754/llvm/lib/Support/MemoryBuffer.cpp#L254


================
Comment at: llvm/lib/ProfileData/RawMemProfReader.cpp:326
+  return object::SectionedAddress{VirtualAddress,
+                                  object::SectionedAddress::UndefSection};
+}
----------------
tejohnson wrote:
> What is the implication of "UndefSection"?
It instructs the lookup to use absolute addresses. I believe the line table lookup happens here [1]. I also noticed that the default constructor for SectionedAddress sets it to UndefSection by default so I removed the second param here.


[1] https://github.com/llvm/llvm-project/blob/ee02cf0797712bb3a4b0686f185794fcb0fd3d9e/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp#L1248


================
Comment at: llvm/test/tools/llvm-profdata/memprof-basic.test:34
 
-RUN: llvm-profdata show --memory %p/Inputs/basic.memprofraw -o - | FileCheck %s
+RUN: llvm-profdata show --memory %p/Inputs/basic.memprofraw --profiled-binary %p/Inputs/basic.memprofexe -o - | FileCheck %s
 
----------------
tejohnson wrote:
> I assume basic.memprofexe == rawprofile.out from above?
Also updated the multi test.


================
Comment at: llvm/test/tools/llvm-profdata/memprof-basic.test:52
+CHECK-NEXT:       function: {{[0-9]+}}
+CHECK-NEXT:       line_offset: 73
+CHECK-NEXT:       column: 3
----------------
davidxl wrote:
> Is this line correct? where is the source file to validate?
The callstack contains frames from the user code (provided at the top of this file), the compiler runtime interceptors and/or shared libraries. I can confirm that the line numbers are correct for all locations which are symbolized from the main binary i.e user program and compiler runtime linked in statically. Some locations below have line and column set to 0 which are from shared libs. Let me know if there is anything I can change to make this clearer.


================
Comment at: llvm/test/tools/llvm-profdata/memprof-basic.test:70
+CHECK-NEXT:       dealloc_timestamp: 987
+CHECK-NEXT:       total_lifetime: 987
+CHECK-NEXT:       min_lifetime: 987
----------------
davidxl wrote:
> unit of the time?
The timestamp should be ms. Let me look into this a bit more.


================
Comment at: llvm/test/tools/llvm-profdata/memprof-basic.test:73
+CHECK-NEXT:       max_lifetime: 987
+CHECK-NEXT:       alloc_cpu_id: 4294967295
+CHECK-NEXT:       dealloc_cpu_id: 56
----------------
davidxl wrote:
> is cpuid this big?
It shouldn't be, this is 2^32-1 so let me check whats going on here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116784/new/

https://reviews.llvm.org/D116784



More information about the llvm-commits mailing list