[PATCH] D99351: [CSSPGO] Top-down processing order based on full profile.

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 26 12:16:25 PDT 2021


hoy marked an inline comment as done.
hoy added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/SampleContextTracker.cpp:569
 
-// Replace call graph edges with dynamic call edges from the profile.
-void SampleContextTracker::addCallGraphEdges(CallGraph &CG,
-                                             StringMap<Function *> &SymbolMap) {
-  // Add profile call edges to the call graph.
+void SampleContextTracker::buildProfiledCallGraph(ProfiledCallGraph &ProfiledCG,
+                                                  CallGraph &CG) {
----------------
wenlei wrote:
> This is very similar to CSPreInliner::buildTopDownOrder. We had to let context track do things like addCallGraphEdges in the past. But now it probably makes more sense to let ProfileCallGraph take care of the graph building. I can refactor part of CSPreInliner::buildTopDownOrder into ctor of ProfileCallGraph so it can be reused here. 
Sounds good. Moving the logic into ProfileCallGraph makes more sense.


================
Comment at: llvm/lib/Transforms/IPO/SampleContextTracker.cpp:578
     NodeQueue.pop();
-    Function *F = SymbolMap.lookup(Node->getFuncName());
     for (auto &I : Node->getAllChildContext()) {
       ContextTrieNode *ChildNode = &I.second;
----------------
wenlei wrote:
> Is it intentional that we only look at trie, but not call targets from body samples for call edges? 
It is intentional. A call target that doesn't come with a profile or is not on a call path to its child profile can be ignored since processing it before its caller (if this is the only context) shouldn't lose anything. 


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:165
+static cl::opt<bool>
+    UseProfiledCallGraph("use-profiled-call-graph", cl::init(false), cl::Hidden,
+                         cl::desc("Process functions in a top-down order "
----------------
wenlei wrote:
> Nit: the description implies that with `-use-profiled-call-graph=1`, we would do top-down order even if `-sample-profile-top-down-load=0` is used. But the implementation doesn't do that. 
> 
> Would be good to have a cohesive connection between the two switches, and description to reflect that. How about `use-profiled-top-down-order` with description like "Use the top-down order defined by profiled call graph when `-sample-profile-top-down-load` is on"?
Good point. Description changed.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1577
+                                               const FunctionSamples &Samples) {
+  ProfiledCG.addProfiledFunction(Samples.getFuncName());
+
----------------
wenlei wrote:
> assert that we don't go this path for csspgo?
Done.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1644-1645
+    //    is inlined into into a caller function in LTO prelink, every call edge
+    //    originated from the callee will be transferred to the caller. If any
+    //    of the transferred edges is indirect, the original profiled indirect
+    //    edge, even if considered, would not enforce a top-down order from the
----------------
wenlei wrote:
> Such case has to involve indirect call because if we have direct call, say we have A->B->C, if B is gone in post-link, we will still honor A->C order because the call to C is visible in A, correct? 
> 
> It may be clearer to use example in the description. Same for #4.
> 
> It'd be good to call out that some of this only applies to csspgo (e.g. #2 shouldn't be a problem for AutoFDO). 
Exactly. A->C can be used to recover A->B->C with C's `!dbg` information.

Examples added.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1665-1667
+    // Considering those cases, a profiled call graph completely independent of
+    // the static call graph is constructed based on profile data, where
+    // function objects are not even needed to handle case #3 and case 4.
----------------
wenlei wrote:
> Using ProfiledCallGraph allows us to order without needing function object, but profile could be stale (e.g. missing a new edge after source drift). Can we build with ProfiledCallGraph with static call graph nodes and edges included as well?
Actually adding static edges leads to worse performance for some benchmarks because of SCC. In that case, static edges in SCC should be completely removed so that only profile edges are honored. 

On the other hand, yes, profile could be stale, but that's the information FDO relies on. I think without the profile, top-down order isn't important. In other words, static call edges seems not important when they don't correspond to a context in the profile.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1679
+        if (F && !F->isDeclaration() && F->hasFnAttribute("use-sample-profile"))
+          FunctionOrderList.push_back(F);
       }
----------------
wenlei wrote:
> What happens if a function is not in input profile, looks like it will be skipped in sample loader after this change? Before the change, we would still set entry count for a function if it has no profile. 
Good catch! It's an overlook. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99351/new/

https://reviews.llvm.org/D99351



More information about the llvm-commits mailing list