[cfe-dev] How much profile data is enough for pgo?

Fri Oct 8 15:09:00 PDT 2021

Yes, it really depends on where you decide the point of 'diminishing'
returns is.  Some may think 6% to 10% performance improvement with 2x
coverage is not worth it (or has no perceivable impact on users), but
others may think an additional 0.5% is worth the effort even with 5x more
training, due to power or CPU savings :).  This depends on the type of
apps and the scale of the deployment of the optimized product.
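
A rough sketch of that trade-off, assuming you have already measured the
cumulative speedup after adding each training workload.  The function name,
the threshold parameter and the numbers are all made up for illustration;
they are not measurements from clang or from any real PGO setup.

```python
def workloads_worth_training(speedups_pct, min_gain_pct):
    """Return how many training workloads to keep.

    speedups_pct[i] is the cumulative speedup (percent over a non-PGO
    build) measured after training on the first i+1 workloads.
    Stop at the first workload whose marginal gain is below min_gain_pct.
    """
    kept = 0
    previous = 0.0
    for total in speedups_pct:
        if total - previous < min_gain_pct:
            break
        kept += 1
        previous = total
    return kept


# Hypothetical numbers, for illustration only.
hypothetical = [6.0, 8.0, 8.5, 8.6]
print(workloads_worth_training(hypothetical, 1.0))   # someone who wants >= 1% per step
print(workloads_worth_training(hypothetical, 0.25))  # someone who values small gains
```

The only thing that changes between the two calls is the threshold, which is
the part that depends on the type of apps and the scale of the deployment.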

David

On Fri, Oct 8, 2021 at 11:58 AM Fāng-ruì Sòng via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On Fri, Oct 8, 2021 at 11:49 AM Chris Bieneman via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
> >
> > This also came up during the LLVM Distributor's conference talk on PGO (
> https://github.com/ClangBuiltLinux/llvm-distributors-conf-2021/issues/4).
> >
> > Honestly I'm not actually convinced there's that much difference between
> carefully selected and curated collections of PGO data, and building a few
> "Hello World" type simple programs.
> >
> > When I was working on Clang at Apple much of our instrumentation showed
> that process launch time was the most consistent place that we could
> optimize performance to get significant wins that were pretty universal.
> >
> > When I added the in-tree multi-stage PGO that used LIT to run
> instrumented compiles, I found that just the one C++ hello-world program
> had something crazy like a 6% performance improvement. I'd love to see us
> add a few more source files into that system so that we could tune it a
> bit, but I never had the time.
> >
> > -Chris
>
> Yes, the initial enablement (even with a hello-world program) can give
> a decent speedup.
> After that, training on llvm-project itself, or on other dedicated
> applications, has little marginal benefit.
> So I'd just pick one medium-sized C application and one C++ application
> for training data.
> If distributors think adding more training data is easy, adding up to
> 10 applications still looks good to me.
> 100 or 1000 are definitely too many and aren't worth the hassle :)
>
> > On Oct 8, 2021, at 1:39 PM, Shoaib Meenai <smeenai at fb.com> wrote:
> >
> > (now actually CCing him correctly)
> >
> > On 10/8/21, 11:08 AM, "cfe-dev on behalf of Shoaib Meenai via cfe-dev" <
> cfe-dev-bounces at lists.llvm.org on behalf of cfe-dev at lists.llvm.org> wrote:
> >
> >    CCing Chris; I remember discussing something like this with him at a
> developer meeting, but I don't remember his recommendation from the time :)
> >
> >    On 10/7/21, 1:10 PM, "cfe-dev on behalf of Tom Stellard via cfe-dev" <
> cfe-dev-bounces at lists.llvm.org on behalf of cfe-dev at lists.llvm.org> wrote:
> >
> >        Hi,
> >
> >        I'm trying to generate profile data for clang by building some of
> >        the packages we ship in Fedora Linux.  I'm trying to decide how many
> >        packages to build.  Is there much advantage to building 1000 vs
> >        something substantially less, like 100?
> >
> >        -Tom
> >
> >        _______________________________________________
> >        cfe-dev mailing list
> >        cfe-dev at lists.llvm.org
> >        https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >
> >
> >
>
>
>
> --
> 宋方睿
>