[PATCH] D99146: [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner
Hongtao Yu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 25 22:55:22 PDT 2021
hoy accepted this revision.
hoy added inline comments.
This revision is now accepted and ready to land.
================
Comment at: llvm/tools/llvm-profgen/CSPreInliner.cpp:63
+
+ // Add calls for context, if both caller and callee has context profile.
+ for (auto &Child : Caller->getAllChildContext()) {
----------------
wenlei wrote:
> hoy wrote:
> > wenlei wrote:
> > > hoy wrote:
> > > > wenlei wrote:
> > > > > hoy wrote:
> > > > > > wenlei wrote:
> > > > > > > hoy wrote:
> > > > > > > > I'm wondering in the future if functions without profile should be considered so that a broader inline decisions can be made regardless of callsite hotness.
> > > > > > > For pre-inlining, we need to have callee profile otherwise there's no profile to be adjusted regardless of whether we inline or not.
> > > > > > >
> > > > > > > For caller profile, we currently requires it to trigger pre-inline, but we could do pre-inline without caller profile. Is that what you meant? Currently compiler's inlining also requires caller profile, so what we have here aligns with compiler.
> > > > > > >
> > > > > > > (Not sure if I get the question, it's building top-down order here, not actual pre-inlining.. )
> > > > > > Sorry for the confusion. I actually meant to explore non-profiled callees (such as getters/setters) here and in `getInlineCandidates` to mimid the situation that both hot callees of those getters/setters and themselves are inlined into the current caller. The getter/setter inlining, though their callsite are not hot, are likely done in prelink CGSCC inlining. I was thinking about simulating the prelink inlining if possible. There's no such need if prelink inlining is disabled.
> > > > > What do you mean by non-profiled callees? If function does not have profile, there's no profile to be adjusted, then doing pre-inline or not doesn't matter. CGSCC inline in prelink can happen but since there's no profile for the callee, no profile need to be adjusted and there's no count quality issue.
> > > > >
> > > > > Or are you suggesting considering looking at multiple levels of callees when evaluating a call site? That is orthogonal to whether a function has profile.
> > > > Yeah, even if the first level callee doesn't have profile, the callee's callee could have a hot profile. Skipping such first-level callees also skips next-level callees. It's fine if first-level callees are not inlined in prelink, which will aslo not be inlined in postlink sample loader. If such callees are inlined by prelink cgscc, then next level callees will likely be inlined by postlink FDO, which is a discrepancy from llvm-profgen preliner.
> > > >
> > > > Currently if all functions have profiles, multi-level is naturally supported with the priority-based BFS processing by tweaking calliste costs. If any function in one call chain doesn't have a profile, BFS will stop at that level.
> > > >
> > > We don't support multiple levels even with priority-based BFS inliner in the sense that we never look ahead to see if there's anything hot underneath a cold callee. Callee without profile is just one example of cold callee.
> > I see. Yes, callees without profile is a case of cold callees. They shouldn't be an issue without cgscc inlining sitting between the preliner and the targeted postlink FDO inliner. With the cgscc inlining, we might need sort of simulation for that, which might be quite different with the current top-down simulator.
> Yeah, I hope that cgscc inline will mostly only deal with cold/small function inlining for csspgo, in which case profile adjustment is less important, hence preinline is less important (extreme case of cold inlining is cases where we don't have profile at all, then pre-inline doesn't matter). If we actually find cgscc inlining covering some hot inlining, I think it'd be worth looking to see why sample loader doesn't handle such cases. So I hope we don't have to do much for preinline estimation for cgscc inliner.
Sounds good. Current implementation looks good to me. Would like to see this moving forward and evolving close to the real inliner.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99146/new/
https://reviews.llvm.org/D99146
More information about the llvm-commits
mailing list