<div dir="ltr">Hans's solution to this problem when we deployed PGO for Chrome was to pick a representative C++ file from the codebase, pre-process it, and use that as an input during PGO training. This fails to cover input reading and use cases like LTO, but it at least ensures that all the Sema, optimizer, and codegen codepaths are representatively exercised.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Oct 8, 2021 at 11:49 AM Chris Bieneman via cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;">This also came up during the LLVM Distributors Conference talk on PGO (<a href="https://github.com/ClangBuiltLinux/llvm-distributors-conf-2021/issues/4" target="_blank">https://github.com/ClangBuiltLinux/llvm-distributors-conf-2021/issues/4</a>).<div><br></div><div>Honestly, I'm not actually convinced there's that much difference between carefully selected and curated collections of PGO data and building a few simple "Hello World"-type programs.</div><div><br></div><div>When I was working on Clang at Apple, much of our instrumentation showed that process launch time was the most consistent place that we could optimize performance to get significant wins that were pretty universal.</div><div><br></div><div>When I added the in-tree multi-stage PGO that used LIT to run instrumented compiles, I found that just the one C++ hello-world program had something crazy like a 6% performance improvement. 
I'd love to see us add a few more source files into that system so that we could tune it a bit, but I never had the time.</div><div><br></div><div>-Chris</div><div><div><br><blockquote type="cite"><div>On Oct 8, 2021, at 1:39 PM, Shoaib Meenai &lt;<a href="mailto:smeenai@fb.com" target="_blank">smeenai@fb.com</a>&gt; wrote:</div><br><div><div>(now actually CCing him correctly)<br><br>On 10/8/21, 11:08 AM, "cfe-dev on behalf of Shoaib Meenai via cfe-dev" &lt;<a href="mailto:cfe-dev-bounces@lists.llvm.org" target="_blank">cfe-dev-bounces@lists.llvm.org</a> on behalf of <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>&gt; wrote:<br><br> CCing Chris; I remember discussing something like this with him at a developer meeting, but I don't remember his recommendation from the time :)<br><br> On 10/7/21, 1:10 PM, "cfe-dev on behalf of Tom Stellard via cfe-dev" &lt;<a href="mailto:cfe-dev-bounces@lists.llvm.org" target="_blank">cfe-dev-bounces@lists.llvm.org</a> on behalf of <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>&gt; wrote:<br><br> Hi,<br><br> I'm trying to generate profile data for clang by building some of the packages<br> we ship in Fedora Linux. 
I'm trying to decide how many packages to build, is<br> there much advantage to building 1000 vs something substantially less, like 100?<br><br> -Tom<br><br> _______________________________________________<br> cfe-dev mailing list<br> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a> <br><br></div></div></blockquote></div><br></div></div>
</blockquote></div>
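<div dir="ltr">For anyone who wants to try the single-file approach described at the top of this message, here is a rough sketch of the three steps (pre-process one representative file, compile it with the instrumented compiler, merge the raw profile). It only builds the command lines; it is not the script Chrome uses. The function names, the <code>stage2/bin/clang++</code> path, the <code>pgo-train</code> directory and the <code>default.profraw</code> / <code>clang.profdata</code> file names are all placeholders I made up, and it assumes the instrumented compiler was told (for example via <code>LLVM_PROFILE_FILE</code>) to write its raw profile into the working directory.</div>

```python
from pathlib import Path


def preprocess_cmd(clang, source, output, flags=()):
    """Pre-process one representative file so training needs no header search paths."""
    return [clang, "-E", *flags, str(source), "-o", str(output)]


def training_cmd(instrumented_clang, preprocessed, opt_level="-O2"):
    """Compile the pre-processed file with the instrumented compiler.

    The object file is discarded; only the raw profile that the
    instrumented compiler writes while running is of interest.
    """
    return [instrumented_clang, opt_level, "-c", str(preprocessed), "-o", "/dev/null"]


def merge_cmd(llvm_profdata, raw_profiles, merged):
    """Merge raw profiles into the single file a later PGO build consumes."""
    return [llvm_profdata, "merge", "-o", str(merged), *[str(p) for p in raw_profiles]]


def training_plan(source, workdir,
                  clang="clang++",
                  instrumented_clang="stage2/bin/clang++",  # hypothetical path
                  llvm_profdata="llvm-profdata"):
    """Return the three commands, in order, as argv lists."""
    workdir = Path(workdir)
    # The .ii suffix marks the file as already pre-processed C++.
    preprocessed = workdir / (Path(source).stem + ".ii")
    # Assumption: the raw profile lands here (placeholder file name).
    raw_profile = workdir / "default.profraw"
    return [
        preprocess_cmd(clang, source, preprocessed),
        training_cmd(instrumented_clang, preprocessed),
        merge_cmd(llvm_profdata, [raw_profile], workdir / "clang.profdata"),
    ]


if __name__ == "__main__":
    for cmd in training_plan("representative.cpp", "pgo-train"):
        print(" ".join(cmd))
```

<div dir="ltr">Keeping the steps as plain argv lists makes it easy to add more training inputs later (a second source file, a different optimization level) without changing how the commands are run, for example by passing each list to <code>subprocess.run</code>.</div>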