[PATCH] D99146: [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner

Wenlei He via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 25 14:43:35 PDT 2021


Ok, I see what you’re getting at now. Yes, we could do scaling across module boundaries in llvm-profgen in this case.

But on the other hand, we expect all raw profiles to have context for CSSPGO, so we’re not concerned about this. I imagine it could be helpful for AutoFDO.

From: Wei Mi <wmi at google.com>
Date: Thursday, March 25, 2021 at 9:47 AM
To: Wenlei He <reviews+D99146+public+92a1b5a85215a3b1 at reviews.llvm.org>
Cc: Wenlei He <wenlei at fb.com>, Hongtao Yu <hoy at fb.com>, David Li <davidxl at google.com>, Lei Wang <wlei at fb.com>, Fangrui Song <maskray at google.com>, Michał Górny <mgorny at gentoo.org>, Easwaran Raman <eraman at google.com>, lxfind at gmail.com <lxfind at gmail.com>, Modi Mo <modimo at fb.com>, llvm-commits <llvm-commits at lists.llvm.org>, bhuvanendra.kumarn at amd.com <bhuvanendra.kumarn at amd.com>, 88888yl at gmail.com <88888yl at gmail.com>, dougpuob at gmail.com <dougpuob at gmail.com>, David Green <david.green at arm.com>, orlando.hyams at sony.com <orlando.hyams at sony.com>, yuanfang.chen at sony.com <yuanfang.chen at sony.com>, jeremy.morse.llvm at gmail.com <jeremy.morse.llvm at gmail.com>
Subject: Re: [PATCH] D99146: [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner


On Wed, Mar 24, 2021 at 11:10 PM Wenlei He via Phabricator <reviews at reviews.llvm.org<mailto:reviews at reviews.llvm.org>> wrote:
wenlei added a comment.

In D99146#2649707 <https://reviews.llvm.org/D99146#2649707<https://reviews.llvm.org/D99146#2649707>>, @wmi wrote:

> In D99146#2649599 <https://reviews.llvm.org/D99146#2649599<https://reviews.llvm.org/D99146#2649599>>, @wenlei wrote:
>
>> In D99146#2649557 <https://reviews.llvm.org/D99146#2649557<https://reviews.llvm.org/D99146#2649557>>, @davidxl wrote:
>>
>>> ThinLTO is known to have issues related to profile update (cross module), so we were thinking something similar in ThinLink phase.
>>
>> This is the exact problem we are trying to mitigate. We also considered doing this in ThinLink but adjusting profiles for thin-backends and communicating inline decisions to thin-backends would add quite a bit of complexity, which could also slow down ThinLink. With CSSPGO, doing it in profile generation and use adjusted profile to convey inline estimation/suggestion is much simpler and cheaper.
>>
>>> One of the issues is that the pre-inlining needs to make similar decisions as the compiler. How well is the preinliner doing in this regard?
>>
>> Yes, this is a challenge. We don't have data yet, but I hope with some tuning we can get them to be close. One problem with doing pre-inlining is we don't have a lot of information that compiler can see from IR, though if needed some of that can be embedded into binary (some metadata in probe descriptor, etc.) for preinliner. I hope a more accurate view on machine code byte size for inline cost can offset some of the disadvantages due to lack of IR.
>
> It is a good idea to have an non-intrusive way to predict cross-module inlining decision and update the profile beforehand.
>
> To mitigate ThinLTO profile update issue, either apparent inline or no-inline decisions can be made. From the patch description, seems currently it only considers the case that no-inline decision is made and profile can be merged back. Have you considered the case that inline is apparently beneficial and profile without context can be split?

If we don't have context profile from raw input profile, the split is going to be a simple scaling based on call site counts, right? In that case, doing it in profile generation won't improve profile quality because the scaling won't be very different from the scaling done by cgscc inliner. Though if we split the profiles to synthesize context profile, sample loader would be able to inline more, but if we want we could allow sample loader inlining to do scaling.

Right, that will be scaling based on callsite counts. However, cgscc inlining or sample loader inlining cannot do the scaling if it is cross-module inlining.



>> We'll be working on tuning the preinliner to get it to be close to compiler inliner. This is similar to the effort of transferring more inlining from cgscc inliner to sample loader inliner in that we may not see immediate results, but over time, as the new component matures, we hope to reap benefits later.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99146/new/<https://reviews.llvm.org/D99146/new/>

https://reviews.llvm.org/D99146<https://reviews.llvm.org/D99146>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210325/367f2139/attachment.html>


More information about the llvm-commits mailing list