[PATCH] D45377: [SampleFDO] Don't let inliner treat warm callsite with inline instance in the profile as cold

Mon Apr 16 09:42:19 PDT 2018

danielcdh added a comment.

In https://reviews.llvm.org/D45377#1068853, @wmi wrote:

> In https://reviews.llvm.org/D45377#1068438, @danielcdh wrote:
>
> > Thanks for the fix!
> >
> > I think maybe a preferred way to fix this is to change SampleProfileLoader::inlineHotFunctions to inline these "warm" inlined callsites early. The current algorithm uses callsiteIsHot, which compares inline instance's total count to the caller's total count, which could be misleading if the caller is super large/hot. A better algorithm should compare inline instance's total count to PSI to get a global hotness. In this way, if the profile annotator thinks a callsite is not hot, the later inliner should *not* even try to inline it. This makes the design cleaner and more stable. WDYT?
>
>
> I tried the idea to compute the inline instance's total count divided by its bb count, and compare the division result to PSI hot threshold. That improved the regression benchmark but did not recover the whole regression. That is why I choosed to keep the current callsiteIsHot check in early inliner unchanged because I guessed regular inliner may have a better position to decide whether to inline such warm/medium size callsite.

I suppose the regression comes from iterative-AutoFDO?

The problem of letting regular inliner to handle warm callsites is that the callee may have profile missing if it is fully inlined. Maybe instead of comparing total_count/num_calle_bb to hot threshold, just compare total_count to hot threshold? I agree this may increase code size a little, but it should not be worst than the previous afdo binary?

Repository:
  rL LLVM

https://reviews.llvm.org/D45377