<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 27, 2015 at 10:11 AM, Lee Hunt <span dir="ltr"><<a href="mailto:leehu@exchange.microsoft.com" target="_blank">leehu@exchange.microsoft.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Thanks! CIL [LeeHu] for a few comments…<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Xinliang David Li [mailto:<a href="mailto:xinliangli@gmail.com" target="_blank">xinliangli@gmail.com</a>]
<br>
<b>Sent:</b> Wednesday, May 27, 2015 9:29 AM<br>
<b>To:</b> Lee Hunt<br>
<b>Cc:</b> <a href="mailto:llvmdev@cs.uiuc.edu" target="_blank">llvmdev@cs.uiuc.edu</a><span class=""><br>
<b>Subject:</b> Re: [LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)<u></u><u></u></span></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On Tue, May 26, 2015 at 8:47 PM, Lee Hunt <<a href="mailto:leehu@exchange.microsoft.com" target="_blank">leehu@exchange.microsoft.com</a>> wrote:<u></u><u></u></p><span class="">
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal">Hello –<u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">I’m an Engineer in Microsoft Office after looking into possible advantages of using PGO for our Android Applications.<u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">We at Microsoft have deep experience with Visual C++’s
<a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_e7k32f4k.aspx&d=AwMFAg&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=CDx6fJHiO_U5ya1dHZhv-O5nAU_botD-I7BAyxPZXZE&s=L5s90Jkxqk45FMvD7qA0Visu71cC_bqMyLK3h0RSZtU&e=" target="_blank">
Profile Guided Optimization</a> and often see 10% or more reduction in the size of application code loaded after using PGO for key scenarios (e.g. application launch). <u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">yes. This is true for the GCC too. Clang's PGO does not shrink code size yet.<u></u><u></u></p>
</div>
</span><div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">[LeeHu] Note: I’m not talking about shrinking code size, but rather reordering it such that only ‘active’ branches within the profiled functions are grouped together
in ‘hot’ code pages. This is a very big optimization for us in VC++ toolchain in PGO.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">We also have the “/LTCG” flag – which is seemingly similar to the “-flto” Clang flag -- that *<b>does*</b> shrink code by various means (dead code removal, common
IL tree collapsing) because it can see all the object code for an entire produced target binary (e.g. .exe or .dll).<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Does -flto also shrink code?</span><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u><u></u></span></p>
<p class="MsoNormal"> </p></div></div></div></div></div></div></blockquote><div><br></div><div>That depends on other options used (e.g, -Os). With LTO, compiler sees larger scope, performs cross module inlines and dead function eliminations. It does have more opportunities to shrink code.</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><div><div><div><div><p class="MsoNormal"><u></u><u></u></p>
</div><span class="">
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal"> Making application launch quickly is very important to us, and reducing the number of code pages loaded helps with this goal.<u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">Before we dig into turning it on, I’m wondering if there’s any pre-existing research / case studies about possible code page reduction seen from other Clang PGO-enabled applications?
It sounds like there is some possible instrumented run performance problems due to counter contention resulting in sluggish performance and perhaps skewed profile data:
<a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__groups.google.com_forum_-23-21topic_llvm-2Ddev_cDqYgnxNEhY&d=AwMFAg&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=CDx6fJHiO_U5ya1dHZhv-O5nAU_botD-I7BAyxPZXZE&s=YaUiiOgIrmA6Io5p4aWzmppYDAKyp8ddTwozd_l-Wjg&e=" target="_blank">
https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY</a>. <u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Counter contention is one issue. Redundant counter updates is another major issue (due to the early instrumentation). We are working on the later and see great speed ups.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal">I’d like an overview of the optimizations that PGO does, but I don’t find much from looking at the Clang PGO section:
<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__clang.llvm.org_docs_UsersManual.html-23profile-2Dguided-2Doptimization&d=AwMFAg&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=CDx6fJHiO_U5ya1dHZhv-O5nAU_botD-I7BAyxPZXZE&s=cKiMsZqz31mbPqwGaH_hX2B8sTtFSJ65A4_vbF-fkB4&e=" target="_blank">
http://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization</a>.<u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</span><div><span class="">
<p class="MsoNormal">Profile data is not used in any IPA passes yet. It is used by any post inline optimizations though -- including block layout, register allocator etc.<u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
</span><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">[LeeHu]: sorry for naïve question, but what is IPA? </span></p></div></div></div></div></div></div></blockquote><div><br></div><div><br></div><div>Inter-procedural analysis/optimizations.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><div><div><div><div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">And what post-inline optimizations are currently being done? We’re currently using Clang 3.5 if that matters.<u></u><u></u></span></p>
</div><span class="">
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">For example, from reading different pages on how Clang PGO, it’s unclear if it does “block reordering” (i.e. moving unexecuted code blocks to a distant code page, leaving only ‘hot’
executed code packed together for greater code density).<u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">LLVM's block placement uses branch probability and frequency data, but there is no function splitting optimization yet.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal"> I find mention of “hot arc” optimization (-fprofile-arcs) , but I’m unclear if this is the same thing. Does Clang PGO do block reordering?<u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">It does reordering, but does not do splitting/partitioning.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">David<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">Thanks,<u></u><u></u></p>
<p class="MsoNormal">--Lee<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><u></u><u></u></p>
</blockquote>
</span></div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div>
</div>
</blockquote></div><br></div></div>