[PATCH] D89723: [CSSPGO][llvm-profgen]Context-sensitive profile data generation

Mon Nov 2 09:32:15 PST 2020

wlei added inline comments.

================
Comment at: llvm/test/tools/llvm-profgen/inline-cs-noprobe.test:1
+; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-noprobe.perfscript --binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t
+; RUN: FileCheck %s --input-file %t
----------------
wmi wrote:
> Is it possible to use a small manually crafted perfscript as input? It is easier to know whether the number in the output makes sense or not when the perfscript is small. It will also be easier if something in the test needs to be adjusted in the future.
Thanks for your feedbacks. a small perfscript with only one or two sample is replaced. also add unwinder's test

================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:106
+  // unwind the sample stack as we walk through LBR entries.
+  while (State.LBRIndex < State.LBRStack.size()) {
+    State.checkStateConsistency();
----------------
wenlei wrote:
> Perhaps add a wrapper `State.hasMoreLBRs()` for this?
change to `State.hasNextLBR()`

================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:184
+
+bool PerfReader::extractLBRStack(line_iterator &LBRLine, UnwinderTrace &Trace) {
+  SmallVector<StringRef, 32> Records;
----------------
wmi wrote:
> Can you give an example of LBRStack so it is easy to understand what the code is parsing here?
example is added

================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:126-127
+using BinaryAddressMap = std::map<uint64_t, ProfiledBinary *>;
+// A map of sample counter with context, this is used to aggregate and
+// accumulate sample count with the same context.
+using ContextToSampleMap =
----------------
wmi wrote:
> What do the three uint64_t fields in the map represent?
fixed according wenlei's suggestion below by using `ContextBranchCounter` and `ContextRangeCounter `, also give more explanation

================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:234-238
+  void parseMMap2Event(line_iterator &Line);
+  void parseTrace(StringRef Filename);
+  void parseEvent(line_iterator &Index);
+  // Parse the hybrid trace including the call and LBR line
+  void parseHybridTrace(line_iterator &Line);
----------------
wenlei wrote:
> I suggest let's establish a consistent naming convention here wrt what is a trace and what is an event: 
> 
> - Trace is a series of perf events.
> - Each perf event can be an mmap event or a sample.
> - hybrid trace is a series of perf event: mmap events and hybrid samples.
> - hybrid sample is lbr sample plus stack sample.
> 
> With that, we can rename the following:
> 
> ```
> void parseHybridTrace(line_iterator &Line);
> ->
> void parseHybridSample(line_iterator &Line);
> 
> UnwinderTrace
> ->
> HybridSample
> 
> TraceAggregation
> ->
> SampleAggregation
> ```
> 
Thanks for suggesting for a consistent naming convention, did the refactor.

================
Comment at: llvm/tools/llvm-profgen/llvm-profgen.cpp:37-42
   PerfReader Reader;
   Reader.run();

+  std::unique_ptr<ProfileGenerator> Generator =
+      ProfileGenerator::create(&Reader);
+  Generator->generateProfile();
----------------
wenlei wrote:
> If we let ProfileGenerator be the driver, I think we should also let ProfileGenerator initiate the perf loading (line 38); otherwise if we intend to decouple them, and let PerfReader read profile outside of ProfileGenerator, then it's better only pass the loaded profile to ProfileGenerator for cleaner separation. 
Good suggestion, change to not include PerfReader in ProfileGenerator, then I also decoupled the unwinder from the reader. for the unwinder, the input is the aggregated hybrid sample, the output is the sample counters which is later forwarded to the generator. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89723/new/

https://reviews.llvm.org/D89723