[PATCH] D94110: [CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation

Lei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 19 21:59:58 PST 2021


wlei added inline comments.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:106
+                                         SmallVectorImpl<uint64_t> &CallStack) {
+  if (Cur->Address != 0) {
+    if (Binary->usePseudoProbes()) {
----------------
hoy wrote:
> Is the trie root the only case this can be zero? Can we make an assert for that?
Good point! adding assertion here may need a dummy root parameter, how about changing this condition to `isDummyRoot` and add assertion inside `getOrCreateChildFrame` so that we can also guarantee only the root can be zero address.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:112
+      // Cutting off the context from here since the inline will
+      // not know how to consume a context with unknown callsites.
+      if (!CallProbe) {
----------------
hoy wrote:
> Typo: inliner
fixed


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:175
+  if (Binary->usePseudoProbes()) {
+    State.StackLeaf->Parent->recordBranchCount(Branch.Source, Branch.Target,
+                                               Repeat);
----------------
hoy wrote:
> Would be nice to a comment about why counts are reported on the parent frame for pseudo prob .
Good catch, comments added


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:213
+    Frame(uint64_t Addr) : Address(Addr) {}
+    Frame *getOrCreateFrame(uint64_t Address) {
+      auto Ret = Children.emplace(Address, std::make_unique<Frame>(Address));
----------------
hoy wrote:
> Nit: how about name it `getOrCreateChildFrame`?
Sounds good!


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:226
+
+  Frame FrameTrieRoot;
+  Frame *StackLeaf;
----------------
hoy wrote:
> `FrameTrieRoot` doesn't seem to be initialized anywhere in the constructor of `UnwindState`.
I guess it will be initialized implicitly since it's not a pointer but a value. It will have a zero value address and be taken as a dummy node. I changed the name to `DummyTrieRoot` to make it clearer.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:227
+  Frame FrameTrieRoot;
+  Frame *StackLeaf;
   // Used to fall through the LBR stack
----------------
hoy wrote:
> Nit: name it `FrameCurrentLeaf`?
Good point


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94110/new/

https://reviews.llvm.org/D94110



More information about the llvm-commits mailing list