[PATCH] D99146: [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner

Wei Mi via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 25 09:46:38 PDT 2021


On Wed, Mar 24, 2021 at 11:10 PM Wenlei He via Phabricator <
reviews at reviews.llvm.org> wrote:

> wenlei added a comment.
>
> In D99146#2649707 <https://reviews.llvm.org/D99146#2649707>, @wmi wrote:
>
> > In D99146#2649599 <https://reviews.llvm.org/D99146#2649599>, @wenlei
> wrote:
> >
> >> In D99146#2649557 <https://reviews.llvm.org/D99146#2649557>, @davidxl
> wrote:
> >>
> >>> ThinLTO is known to have issues related to profile update (cross
> module), so we were thinking something similar in ThinLink phase.
> >>
> >> This is the exact problem we are trying to mitigate. We also considered
> doing this in ThinLink but adjusting profiles for thin-backends and
> communicating inline decisions to thin-backends would add quite a bit of
> complexity, which could also slow down ThinLink. With CSSPGO, doing it in
> profile generation and using the adjusted profile to convey inline
> estimation/suggestion is much simpler and cheaper.
> >>
> >>> One of the issues is that the pre-inliner needs to make decisions
> similar to the compiler's. How well does the preinliner do in this regard?
> >>
> >> Yes, this is a challenge. We don't have data yet, but I hope with some
> tuning we can get them to be close. One problem with doing pre-inlining is
> that we don't have a lot of the information the compiler can see from IR,
> though if needed some of that can be embedded into the binary (some metadata
> in the probe descriptor, etc.) for the preinliner. I hope a more accurate
> view of machine code byte size for inline cost can offset some of the
> disadvantages due to the lack of IR.
> >
> > It is a good idea to have a non-intrusive way to predict cross-module
> inlining decisions and update the profile beforehand.
> >
> > To mitigate the ThinLTO profile update issue, either apparent inline or
> no-inline decisions can be made. From the patch description, it seems this
> currently only considers the case where a no-inline decision is made and the
> profile can be merged back. Have you considered the case where inlining is
> apparently beneficial and a profile without context can be split?
>
> If we don't have a context profile from the raw input profile, the split is
> going to be a simple scaling based on call site counts, right? In that
> case, doing it in profile generation won't improve profile quality, because
> the scaling won't be very different from the scaling done by the cgscc
> inliner. Splitting the profiles to synthesize context profiles would let the
> sample loader inline more, but if we want that, we could instead allow
> sample loader inlining to do the scaling.
>

Right, that would be scaling based on callsite counts. However, neither cgscc
inlining nor sample loader inlining can do the scaling when the inlining is
cross-module.
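
To make the scaling we are discussing concrete, here is a minimal standalone
sketch. It is not the actual llvm-profgen/SampleProfile code; the types,
the splitByCallsiteCounts helper, and the context-key format are hypothetical
and only illustrate distributing a callee's context-less profile among its
callsites in proportion to the per-callsite call counts:

// Hypothetical sketch; not llvm-profgen/SampleProfile code.
#include <cstdint>
#include <map>
#include <string>

struct FunctionSamples {
  std::map<std::string, uint64_t> BodySamples; // location -> sample count
  uint64_t TotalSamples = 0;
};

// Split Callee's context-less profile among its callsites, scaling each
// body count by that callsite's share of the total call count.
std::map<std::string, FunctionSamples>
splitByCallsiteCounts(const std::string &CalleeName,
                      const FunctionSamples &Callee,
                      const std::map<std::string, uint64_t> &CallsiteCounts) {
  std::map<std::string, FunctionSamples> ContextProfiles;
  uint64_t TotalCalls = 0;
  for (const auto &KV : CallsiteCounts)
    TotalCalls += KV.second;
  if (TotalCalls == 0)
    return ContextProfiles;

  for (const auto &[CallerContext, Calls] : CallsiteCounts) {
    FunctionSamples Scaled;
    for (const auto &[Loc, Count] : Callee.BodySamples)
      Scaled.BodySamples[Loc] = Count * Calls / TotalCalls;
    Scaled.TotalSamples = Callee.TotalSamples * Calls / TotalCalls;
    // Hypothetical context key; the real CSSPGO context string differs.
    ContextProfiles[CallerContext + " @ " + CalleeName] = Scaled;
  }
  return ContextProfiles;
}

Both the preinliner and the cgscc/sample loader inliners could perform this
same proportional scaling, which is why doing it at profile-generation time
alone doesn't add information; the cross-module case is the one where the
backend inliners cannot do it.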



>
> >> We'll be working on tuning the preinliner to get it close to the
> compiler inliner. This is similar to the effort of transferring more
> inlining from the cgscc inliner to the sample loader inliner in that we may
> not see immediate results, but over time, as the new component matures, we
> hope to reap the benefits.
>
>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D99146/new/
>
> https://reviews.llvm.org/D99146
>
>