[llvm] r372232 - [SampleFDO] Minimize performance impact when profile-sample-accurate

Wei Mi via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 20 09:13:25 PDT 2019


The size increase is expected. We are planning to use
profile-sample-accurate more widely in Google, but simply enabling it will
cause performance regression so we use this patch to minimize it. The size
increase is mostly from more inlining, but the increase is small compared
with code size reduction from enabling profile-sample-accurate from our
experiment.

If code size is more of a concern for chrome on Android than performance, I
can guard the change not to be enabled for -Os and Oz (I guess chrome on
Android has already been built with Os or Oz).

Please let me know what you think.

Thanks,
Wei.

On Fri, Sep 20, 2019 at 4:14 AM Hans Wennborg <hans at chromium.org> wrote:

> This caused a ~70 KB size regression for Chrome on Android. Is this
> expected?
>
> Also +gbiv for Chrome Android pgo.
>
> We're tracking this in
> https://bugs.chromium.org/p/chromium/issues/detail?id=1006159
>
> On Wed, Sep 18, 2019 at 6:05 PM Wei Mi via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
> >
> > Author: wmi
> > Date: Wed Sep 18 09:06:28 2019
> > New Revision: 372232
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=372232&view=rev
> > Log:
> > [SampleFDO] Minimize performance impact when profile-sample-accurate
> > is enabled.
> >
> > We can save memory and reduce binary size significantly by enabling
> > ProfileSampleAccurate. However when ProfileSampleAccurate is true,
> > function without sample will be regarded as cold and this could
> > potentially cause performance regression.
> >
> > To minimize the potential negative performance impact, we want to be
> > a little conservative here saying if a function shows up in the profile,
> > no matter as outline instance, inline instance or call targets, treat
> > the function as not being cold. This will handle the cases such as most
> > callsites of a function are inlined in sampled binary (thus outline copy
> > don't get any sample) but not inlined in current build (because of source
> > code drift, imprecise debug information, or the callsites are all cold
> > individually but not cold accumulatively...), so that the outline
> function
> > showing up as cold in sampled binary will actually not be cold after
> current
> > build. After the change, such function will be treated as not cold even
> > profile-sample-accurate is enabled.
> >
> > At the same time we lower the hot criteria of callsiteIsHot check when
> > profile-sample-accurate is enabled. callsiteIsHot is used to determined
> > whether a callsite is hot and qualified for early inlining. When
> > profile-sample-accurate is enabled, functions without profile will be
> > regarded as cold and much less inlining will happen in CGSCC inlining
> pass,
> > so we can worry less about size increase and be aggressive to allow more
> > early inlining to happen for warm callsites and it is helpful for
> performance
> > overall.
> >
> > Differential Revision: https://reviews.llvm.org/D67561
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190920/a9efcfc4/attachment.html>


More information about the llvm-commits mailing list