[PATCH] D111750: [llvm-profgen] Allow unsymbolized profile as perf input

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 22 12:38:59 PDT 2021


wenlei added inline comments.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:357-360
+  PerfInputFile RetFile;
+  RetFile.InputFile = PerfTraceFile;
+  RetFile.Format |= PerfFormat::PerfScript;
+  return RetFile;
----------------
nit: we can do `return {PerfTraceFile, PerfFormat::PerfScript, PerfContent::Unknown};`


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:797
+    }
+    Key->genHashCode();
+    auto Ret =
----------------
wlei wrote:
> wenlei wrote:
> > Would it better if hash code is lazily generated instead of requiring an explicit call? 
> > 
> > ```
> > getHashCode() {
> >   if (HashCode == 0) 
> >      genHashCode()
> >   ...
> > }
> > ```
> Here to explicitly call `genHashCode` is intentional, the reason is we want to avoid making `genHashCode` a virtual function, i, e, avoid call `genHashCode `after casting to base class. so we separate like:
> ```
> derived class:  HashCode = genHashCode();
> 
> base class :  getHashCode{return HashCode;}
> ```
> 
> 
Not sure if I understand. Why do we want to avoid call genHashCode after casting to base class?


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:70
+// The type of perfscript content.
+enum PerfContent {
+  LBR = 0x01,        // Only LBR sample.
----------------
We still want to reserve 0 as unknown or invalid?


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:78-79
+  std::string InputFile;
+  uint32_t Format = 0;
+  uint32_t Content = 0;
 };
----------------
Can we use enum types instead of integer? 


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:640
+/*
+   Format of unsymbolized profile:
+
----------------
Also comment to explain the difference between CS and non-CS profile?


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:63-68
+  PERFSCRIPT_UNKNOWN = 0,
+  PERFSCRIPT_INVALID = 1,
+  PERFSCRIPT_LBR = 2,       // Only LBR sample
+  PERFSCRIPT_LBR_STACK = 3, // Hybrid sample including call stack and LBR stack.
+  PEFFDATA = 4,             // Raw linux perf.data.
+  RAWPROFILE = 5,           // Unsymbolized profile generated by llvm-profgen.
----------------
wlei wrote:
> wlei wrote:
> > wenlei wrote:
> > > wenlei wrote:
> > > > nit: all caps is usually only used for macro. For enum, it's better to follow normal variable naming, which is also the convention used in llvm 
> > > These are not mutually exclusive types. e.g. PERFSCRIPT_UNKNOWN can be either PERFSCRIPT_LBR or PERFSCRIPT_LBR_STACK. 
> > > 
> > > Logically there're two things being represented, the format and the content. To avoid exponential combination and to have a clear separation, perhaps use format, content masks instead, or even separate them out?
> > > 
> > > We can have format = RawProfile|PerfScript|PerfData, and content = LBR|LBRStack|Aggregated.
> > Fixed, thanks!
> Makes sense.Changed to separate them out.
Now that format and content are separated, perhaps they no longer need to use mask within each? Sorry if I wasn't clear, mask or separate field are two ways to have separation for format and content. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111750/new/

https://reviews.llvm.org/D111750



More information about the llvm-commits mailing list