[PATCH] D90125: [CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 2 10:02:05 PST 2020


wenlei marked 2 inline comments as done.
wenlei added a comment.

Thanks for quick review! I've update the patch, also see replies inline.



================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:347
+// State of context associated with FunctionSamples
+enum ContextState {
+  UnknownContext = 0x0,   // Profile without context
----------------
davidxl wrote:
> Since these states are not mutually exclusive, perhaps name it ContextStateMask?
Renamed.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:364
+
+  void promoteOnPath(StringRef ContextStrToRemove) {
+    assert(FullContext.startswith(ContextStrToRemove));
----------------
davidxl wrote:
> document this method.
Done.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:377
+
+  static void DecodeContextString(StringRef ContextStr, StringRef &FName,
+                                  LineLocation &LineLoc) {
----------------
davidxl wrote:
> Document the context string format here.
Added header comment to `SampleContext`.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:409
+private:
+  void setContext(StringRef ContextStr, ContextState CState) {
+    assert(!ContextStr.empty());
----------------
davidxl wrote:
> what is the input format? document it here.
Added comment.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:504-506
+      // For CSFDO, in order to conserve profile size, we no longer write out
+      // locations profile for those not hit during training, so we need to
+      // treat them as zero instead of error here.
----------------
wmi wrote:
> davidxl wrote:
> > CSFDO ==> CSSPGO
> It means CSSPGO will treat all the new lines as cold, even if some of them may be inferred from other parts of the profile. How much extra size is needed if zero is emitted? 
Knowing that CS profile will be much bigger, we started with trimming zero counts trying to save size as much as we can. But I don't actually have the data at hand. Let me see if I can get some data on this.

New lines will be less of a problem for pseudo-probe if they don't change CFG.



================
Comment at: llvm/include/llvm/Transforms/IPO/SampleContextTracker.h:94
+  SampleContextTracker(StringMap<FunctionSamples> &Profiles);
+  FunctionSamples *getCalleeContextSamplesFor(const CallBase &Inst,
+                                              StringRef CalleeName);
----------------
davidxl wrote:
> Document the public APIs.
Done.


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:225
       }
-      Profiles[FName] = FunctionSamples();
-      FunctionSamples &FProfile = Profiles[FName];
-      FProfile.setName(FName);
+      SampleContext FContext(FName);
+      if (FContext.hasContext())
----------------
davidxl wrote:
> so FName is also the context String?
Yes, `SampleContext` can take both full context string (wrapped with `[]`) as well as context-less function names, and it will set internal state accordingly. I've updated header comment for `SampleContext` with details.

```
// Example of full context string (note the wrapping `[]`):
//    `[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]`
// Example of context-less function name (same as AutoFDO):
//    `_Z8funcLeafi`
```


================
Comment at: llvm/lib/Transforms/IPO/SampleContextTracker.cpp:246
+  LLVM_DEBUG(dbgs() << "Getting base profile for function: " << Name << "\n");
+  ContextTrieNode *Node = getTopLevelContextNode(Name);
+  if (MergeContext) {
----------------
wmi wrote:
> I don't understand what the top level means here. Better document it. 
> 
> Do we cache the base profile somewhere or we merge it everytime?
Top-level means it's under root node directly - path from to node is empty, hence no context. I added comment here as well as in the header comment of `SampleContextTracker`.

If `getBaseSamplesFor` is called for the same function again, we'll retrieve the existing top-level node from last call (it must exist, with context profile all merged into it already), then iterating over `FuncToCtxtProfileSet[Name]` but won't do anything since all context profiles have been merged (check on line 265). So it's somewhat like caching. Currently `getBaseSamplesFor` is only called once for each function from sample profile loader.


================
Comment at: llvm/lib/Transforms/IPO/SampleContextTracker.cpp:368
+
+ContextTrieNode *SampleContextTracker::getContextFor(const DILocation *DIL) {
+  assert(DIL);
----------------
wmi wrote:
> Do we need to call getCanonicalFnName here to make the name in inline stack canonical so we can match the name in inline stack with the name in context?
Good catch, I think we need to canonicalize `CalleeName` which is the leaf. (The names of middle inline frames should be fine as they're from debug metadata which are not modified when suffixes are appended for symbol promotion, etc..)

I guess we need to add `getCanonicalFnName` for `SampleProfileLoader::findCalleeFunctionSamples`. IIUC, we need it there for today's FDO too?


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1910-1914
+    // In order to be able to fully leverage context-sensitive profile,
+    // we need global top-down inlining, thus do not load profile or
+    // perform profile guided inlining pre-LTO.
+    if (IsThinLTOPreLink)
+      return false;
----------------
wmi wrote:
> Here it means no any profile loading or just no CS profile? ThinLTO thinlink phase needs to know which functions are hot and it can import them, so profile information is needed in ThinLTO prelink.
Oops, we don't have this change now, forgot to remove when upstreaming. And you're right, we need to load profile for thinlto so thinlink importing can be profile guided. Thanks for catch this. 


================
Comment at: llvm/test/Transforms/SampleProfile/Inputs/profile-context-tracker.prof:27
+ 3: 12
+[externalA:17 @ _Z5funcBi]:120:3
+ 0: 3
----------------
wmi wrote:
> Here "main" doesn't show up in the context. Is it a problem of unwinding or debug info?
This is an artificial context to simulate the case where funcB is also called from external functions to current module (compile time profile loader's case), and we merge context involving external caller correctly. Real profile for this case doesn't have problem in capturing the correct context. 


================
Comment at: llvm/test/Transforms/SampleProfile/profile-context-tracker-debug.ll:1-3
+; REQUIRES: asserts
+; Test for CSSPGO's SampleContextTracker to make sure context profile tree is promoted and merged properly
+; based on inline decision, so post inline counts are accurate.
----------------
wmi wrote:
> Is there anything which cannot be tested in profile-context-tracker.ll? The debug message is usually used as last resort if something cannot be fully tested by just checking IR.  
I added this hoping to make it easier to reason about the internals operations/state of context tracker, and also capture any unintended subtle change in context tracking. But if we look at end result of IR, the non-debug test should be able to cover it as good. I can remove this one if you think that's better.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90125/new/

https://reviews.llvm.org/D90125



More information about the llvm-commits mailing list