[PATCH] D121643: [llvm-profgen] On-demand pseudo probe decoding

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 22 18:42:39 PDT 2022


hoy added inline comments.


================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.cpp:374
 
+void ProfileGeneratorBase::collectProfiledFunctions() {
+  std::unordered_set<const BinaryFunction *> ProfiledFunctions;
----------------
wlei wrote:
> wlei wrote:
> > For CS profile,  the input contains CallStack + LBRStack, but here it only considers the addresses from LBRStack(Range + Branch) to compute profiledFunctions. Does it miss some functions whose address(frame) is only from CallStack?
> > 
> > For example, if we only have one following entry:
> > [main @ foo] : {rangeFrom : rangeTo}
> >  
> > Supposing rangeFrom and rangeTo only belong to `foo`(foo is not inlined to main), then the result of ProfiledFunctions only contains `foo` and misses `main`, right?
> > 
> > From the term "ProfiledFunctions", it should be `foo`, but for CS inlining, I guess we still need the frame `main`?
> > 
> Sorry, I missed your notes on the summary. 
> > "Note that in on-demand mode, a calling context may be truncated at a caller which does not have sample. This is a discrepancy from the complete mode. However, with the CS preinliner such context will always be truncated from the compiler inlining point of view." 
> 
> So in on-demand mode, it will implicitly trim some contexts, even before CS-preinliner or cold-context-trimming. 
Yes, and those contexts, if not truncated by the CS preinliner, will also be truncated by the compiler since the sample loader wouldn't consider any inlining in a non-profiled function.


================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.cpp:953
+  for (auto Addr : reverse(Addresses)) {
+    const MCDecodedPseudoProbe *CallProbe = Binary->getCallProbeForAddr(Addr);
+    // We may not find a probe for a merged or external callsite.
----------------
wenlei wrote:
> hoy wrote:
> > wenlei wrote:
> > > CallProbe can also be null if the caller isn't profiled? Does the comment need to be updated here? 
> > Comment updated.
> nit: We may not find a probe for functions that are not sampled.  -> We may not find a probe for functions that are not sampled when --decode-probe-for-profiled-functions-only is on. 
> 
> also clearer if you list the three scenario with bullets as it now grow larger. 
Sounds good.


================
Comment at: llvm/tools/llvm-profgen/ProfiledBinary.cpp:362
+  std::unordered_set<uint64_t> ProfiledGuids;
+  for (auto *F : ProfiledFunctions) {
+    ProfiledGuids.insert(Function::getGUID(F->FuncName));
----------------
wenlei wrote:
> wenlei wrote:
> > can we assert `!ProfiledFunctions.empty()` to make sure profile functions are collected already? 
> missed this one? 
assert added.


================
Comment at: llvm/tools/llvm-profgen/ProfiledBinary.cpp:382
               reinterpret_cast<const uint8_t *>(Contents.data()),
-              Contents.size()))
+              Contents.size(), ProfiledGuids))
         exitWithError("Pseudo Probe decoder fail in .pseudo_probe section");
----------------
wenlei wrote:
> wenlei wrote:
> > How about completely unify the workflow of on-demand vs full decoding? 
> > 
> > The only difference would be - for full decoding, we just need to provide a full Guid set; and for on-demand, we provide ProfiledGuids. 
> > 
> > Correct me if I'm wrong, but I think this was we keep the same behavior for full decoding as of today (no context truncation before preinliner), and we can also remove ProbeStack and replace it with AddressStack everywhere. 
> > The only difference would be - for full decoding, we just need to provide a full Guid set; and for on-demand, we provide ProfiledGuids.
> 
> missed this one? By only difference, I meant the only place that need to check `DecodeProbeForProfiledFunctionsOnly`. 
Right now for full decoding we provide an empty `ProfiledGuids` to the decoder instead of computing Guids for all disassembled functions which could be expansive.  The only place that requires full decoding is when --show-assembly-only is on. Otherwise it is kind of unified?

I'm adding an assert like this

```
  assert((!ProfiledFunctions.empty() || !DecodeProbeForProfiledFunctionsOnly ||
          ShowDisassemblyOnly) &&
         "Profiled functions should not be empty in on-demand probe decoding "
         "mode");
```

We can also use a 0 as a guid to represent all functions. WDYT?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121643/new/

https://reviews.llvm.org/D121643



More information about the llvm-commits mailing list