[PATCH] D120784: [CSSPGO][PriorityInliner] Do not use block weight to drive callsite inlining.
Wenlei He via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 1 18:25:56 PST 2022
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.
lgtm, thanks.
================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1309
+ uint64_t CallsiteCount =
+ CalleeSamples ? CalleeSamples->getEntrySamples() * Factor : 0;
*NewCandidate = {CB, CalleeSamples, CallsiteCount, Factor};
----------------
hoy wrote:
> hoy wrote:
> > wenlei wrote:
> > > hoy wrote:
> > > > wenlei wrote:
> > > > > When we don't have callee samples, should we fall back to call site block counts?
> > > > >
> > > > > In reality we would also need to tolerate some source change, i.e. the call site didn't exist in pass1 build.
> > > > If the callsite doesn't exist in pass1, the caller profile will probably be discarded due to checksum mismatch. So here when callee sample is missing, it's likely that the callsite is cold in this particular context. Using block count might end up treating it as hot.
> > > > If the callsite doesn't exist in pass1, the caller profile will probably be discarded due to checksum mismatch.
> > >
> > > Is it? If that's the behavior today, it defeats the purpose of CFG based profile - we should be able to match as long as CFG does not change. we can have a change that adds a call site without changing CFG.
> > Interesting point on tolerating source changes. That would require some changes to probe numbering. So far we number callsite probes sequentially which is easily broken with a new callsite introduced. We might somehow need to keep this numbering stable across builds.
> Yes, callsites are currently considered when computing the function checksum:
>
> ```
> FunctionHash = (uint64_t)CallProbeIds.size() << 48 |
> (uint64_t)Indexes.size() << 32 | JC.getCRC();
> ```
>
> We do this to prevent callsite samples mismatch. E.g, indirect call target samples can be applied to an irrelevant callsite when the number of callsite changes.
This is something we need to address eventually. The design of CSSPGO is made to tolerate changes not altering CFG, but looks like the current implementation does not satisfy that yet.. For now this change looks fine.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120784/new/
https://reviews.llvm.org/D120784
More information about the llvm-commits
mailing list