[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)

Xinliang David Li xinliangli at gmail.com
Wed May 27 09:29:25 PDT 2015


On Tue, May 26, 2015 at 8:47 PM, Lee Hunt <leehu at exchange.microsoft.com>
wrote:

>  Hello –
>
>
>
> I’m an engineer in Microsoft Office looking into possible advantages
> of using PGO for our Android applications.
>
>
>
> We at Microsoft have deep experience with Visual C++’s Profile Guided
> Optimization
> <https://msdn.microsoft.com/en-us/library/e7k32f4k.aspx>
> and often see 10% or more reduction in the size of application code loaded
> after using PGO for key scenarios (e.g. application launch).
>

Yes. This is true for GCC too. Clang's PGO does not shrink code size
yet.


>  Making applications launch quickly is very important to us, and reducing
> the number of code pages loaded helps with this goal.
>
>
>
> Before we dig into turning it on, I’m wondering if there’s any
> pre-existing research / case studies about possible code page reduction
> seen from other Clang PGO-enabled applications?  It sounds like there are
> some possible performance problems in instrumented runs due to counter
> contention, resulting in sluggish performance and perhaps skewed profile
> data: https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY
>
>

Counter contention is one issue. Redundant counter updates are another major
issue (due to the early instrumentation). We are working on the latter and
are seeing great speedups.
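
To picture what a redundant counter update can look like, here is a
hand-written sketch. It is not what Clang actually emits: the Counters
struct and both functions are hypothetical stand-ins for compiler-inserted
instrumentation. The first version touches the counter in memory on every
iteration; the second records the same count with a single update after
the loop.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compiler-inserted profile counter.
struct Counters {
    std::uint64_t loop_body = 0;
};

// Naive instrumentation: the counter is incremented once per iteration.
long long sum_naive(const std::vector<int>& v, Counters& c) {
    long long total = 0;
    for (int x : v) {
        ++c.loop_body;  // one counter update per iteration
        total += x;
    }
    return total;
}

// Same profile information, but a single counter update after the loop.
long long sum_batched(const std::vector<int>& v, Counters& c) {
    long long total = 0;
    for (int x : v) {
        total += x;
    }
    c.loop_body += v.size();  // add the trip count once
    return total;
}
```

Both functions leave the same value in loop_body, so the profile is
unchanged while the instrumented loop does less work per iteration.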



> I’d like an overview of the optimizations that PGO does, but I don’t find
> much from looking at the Clang PGO section:
> http://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> .
>

Profile data is not used in any IPA passes yet. It is used by the
post-inlining optimizations, though, including block layout and register
allocation.



>
>
> For example, from reading different pages on how Clang PGO works, it’s unclear
> if it does “block reordering” (i.e. moving unexecuted code blocks to a
> distant code page, leaving only ‘hot’ executed code packed together for
> greater code density).
>

LLVM's block placement uses branch probability and frequency data, but
there is no function splitting optimization yet.

> I find mention of “hot arc” optimization (-fprofile-arcs), but I’m
> unclear if this is the same thing.  Does Clang PGO do block reordering?
>
>
It does reordering, but does not do splitting/partitioning.
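
As a small illustration of the distinction, consider the sketch below.
The function names are hypothetical and chosen only for this example.
With a profile showing that the error branch is rarely taken, block
placement can lay out the hot path contiguously and move the cold block
toward the end of parse_digit. Without function splitting, that cold
block still lives inside the same function, so it is not moved to a
distant code page.

```cpp
#include <cstdio>

// Hypothetical example: the error branch is expected to be cold.
int parse_digit(char c) {
    if (c < '0' || c > '9') {
        // Cold path: with profile data, block placement can move this
        // block out of the fall-through path, but it stays in this function.
        std::fprintf(stderr, "bad digit: %c\n", c);
        return -1;
    }
    // Hot path.
    return c - '0';
}

// Sums the decimal digits in s, skipping any other characters.
int sum_digits(const char* s) {
    int total = 0;
    for (; *s != '\0'; ++s) {
        int d = parse_digit(*s);
        if (d >= 0) {
            total += d;
        }
    }
    return total;
}
```

A profile for code like this would be collected with the workflow
described in the Clang users manual section linked above: build with
-fprofile-instr-generate, run the scenario, merge the raw profile with
llvm-profdata merge, then rebuild with -fprofile-instr-use=<file>.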

David



>
>
> Thanks,
>
> --Lee
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>