[llvm-dev] Current PGO status
Xinliang David Li via llvm-dev
llvm-dev at lists.llvm.org
Wed Feb 7 14:22:54 PST 2018
Victor, please file a bug to track the issue. We can put the relevant
information there, including the test cases used in the experiment.
thanks,
David
On Wed, Feb 7, 2018 at 2:15 PM, Victor Leschuk <vleschuk at accesssoftek.com>
wrote:
> David, could you please clarify which code gave you the 10% improvement?
> I have run numerous tests with and without this option, and it appears to
> have no effect on performance (to be concrete, I am talking about the old
> 2016 sample). Maybe we could investigate it together? Just tell me where
> to start.
>
> On 02/07/2018 02:11 AM, Xinliang David Li wrote:
>
> Victor, thanks for the experiment.
>
> My suspicion is that it is due to the remaining issues with block layout --
> especially with loop rotation (with PGO). Another problem is that tail
> duplication does not happen after loop rotation, which can limit the
> effectiveness of loop rotation.
>
> I tried the internal option -mllvm -force-precise-rotation-cost and got
> about a 10% speedup with -fprofile-use. This option turns on a more
> precise cost model for computing the rotation strategy, but it is not
> enabled by default.
>
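A minimal sketch of how such an A/B comparison could be set up, written as a Python helper that only prints the command lines rather than running them. The file names (sample.cpp, sample, profiles, sample.profdata) and the function pgo_commands are hypothetical. The llvm-profdata merge step is the usual way to turn raw profiles into the file that -fprofile-use reads, and the -mllvm -force-precise-rotation-cost spelling is taken from the message above.

```python
import shlex


def pgo_commands(source, binary, profile_dir, profdata, extra_use_flags=()):
    """Return the command lines of an IR-based PGO build, in the order they run."""
    instrumented = f"{binary}.instr"
    return [
        # 1. Instrumented build; raw profiles land in profile_dir at run time.
        ["clang++", "-O3", f"-fprofile-generate={profile_dir}", source, "-o", instrumented],
        # 2. Training run of the instrumented binary.
        [f"./{instrumented}"],
        # 3. Merge the raw profiles into one indexed profile file.
        ["llvm-profdata", "merge", f"-output={profdata}", profile_dir],
        # 4. Optimized build that consumes the merged profile.
        ["clang++", "-O3", f"-fprofile-use={profdata}", *extra_use_flags, source, "-o", binary],
    ]


# Hypothetical file names, for illustration only.
args = ("sample.cpp", "sample", "profiles", "sample.profdata")
default_build = pgo_commands(*args)
precise_build = pgo_commands(
    *args, extra_use_flags=("-mllvm", "-force-precise-rotation-cost")
)

for command in precise_build:
    print(shlex.join(command))
```

Only the final compile differs between the two variants, so the same merged profile can be reused when timing both binaries.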
> +carrot, who is working in this area.
>
> thanks,
>
> David
>
> On Tue, Feb 6, 2018 at 1:37 PM, Victor Leschuk <vleschuk at accesssoftek.com>
> wrote:
>
>> Hello David, thanks for the detailed response!
>>
>> Do you have any tests that you use to measure PGO effectiveness? I have
>> tested clang version 6.0 with the same sample that Jie Chen used in 2016,
>> and both frontend-based and IR-based PGO actually make the code run
>> slower. Here are the average times:
>>
>> clang++ -O3: 3.15 sec
>>
>> clang++ -O3 and -fprofile-instr-use: 3.160 sec
>>
>> clang++ -O3 and -fprofile-use: 3.180 sec
>>
>> g++ (7.3.0) -O3: 3.640 sec
>>
>> g++ (7.3.0) -O3 and -fprofile-use: 2.92 sec
>>
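To put the averages above in relative terms, here is a small sketch that compares each PGO build with the plain -O3 build of the same compiler. The timings are copied from the list above; the helper name percent_change and the dictionary layout are illustrative only.

```python
# Average run times in seconds, copied from the measurements above.
timings = {
    "clang++ -O3": 3.15,
    "clang++ -O3 -fprofile-instr-use": 3.160,
    "clang++ -O3 -fprofile-use": 3.180,
    "g++ -O3": 3.640,
    "g++ -O3 -fprofile-use": 2.92,
}


def percent_change(baseline, measured):
    """Return the run-time change relative to baseline, in percent.

    Negative values mean the measured build is faster than the baseline.
    """
    return (measured - baseline) / baseline * 100.0


# Each PGO build is paired with the non-PGO build of the same compiler.
comparisons = [
    ("clang++ -O3", "clang++ -O3 -fprofile-instr-use"),
    ("clang++ -O3", "clang++ -O3 -fprofile-use"),
    ("g++ -O3", "g++ -O3 -fprofile-use"),
]

results = {
    pgo: percent_change(timings[base], timings[pgo]) for base, pgo in comparisons
}

for name, change in results.items():
    print(f"{name}: {change:+.2f}%")
```

With these numbers the two clang PGO builds come out roughly 0.3% and 1.0% slower than plain -O3, while the g++ PGO build is roughly 19.8% faster than its own -O3 baseline.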
>> Do you have any idea what could be wrong? Are there any recommendations
>> on when one should use PGO with clang and when it is better not to?
>>
>> Thanks!
>>
>> On 02/05/2018 09:38 AM, Xinliang David Li wrote:
>>
>>
>>
>> On Sun, Feb 4, 2018 at 9:59 PM, Victor Leschuk <vleschuk at accesssoftek.com> wrote:
>>
>>> Hello David!
>>>
>>> I have recently started getting acquainted with PGO in LLVM/clang and
>>> found your e-mail thread:
>>> http://lists.llvm.org/pipermail/llvm-dev/2016-May/099395.html . There you
>>> posted a nice list of optimizations that use profile data and of those
>>> that could use it but don't. However, that thread is about 2 years old.
>>> Could you please let me know whether there have been any significant
>>> changes in this area since then?
>>>
>>
>>
>> Yes, there have been quite a few changes since then. Here are some of the
>> new features:
>>
>> * LLVM IR based PGO -- this is designed to maximize program performance.
>> The options to turn it on are -fprofile-generate/-fprofile-use
>> * value profiling support in PGO -- currently supports indirect call
>> target profiling and memcpy/memset size profiling and optimizations
>> * Profile data is made available for inliner to use (enabled only for the
>> new pass manager: -fexperimental-new-pass-manager)
>> * Profile aware LICM is available -- implemented via a profile driven
>> code sinking pass
>> * Partial inlining is made profile aware; Graham Yu also added support
>> for multiple region function outlining (with PGO)
>> * BB layout heuristics are tuned with PGO
>> * hotness driven function layout optimization
>>
>> There is pending work in the following areas:
>> * profile aware loop vectorization, etc
>> * control height reduction optimization (Hiroshi is working on this)
>>
>> ThinLTO also works well with PGO.
>>
>> Hope this helps.
>>
>> David
>>
>> > What I can tell you is that there are many missing ones (that can benefit
>> > from profile): such as profile aware LICM (patch pending), speculative PRE,
>> > loop unrolling, loop peeling, auto vectorization, inlining, function
>> > splitting, function layout, function outlining, profile driven size
>> > optimization, induction variable optimization/strength reduction, stringOp
>> > specialization/optimization/inlining, switch peeling/lowering etc. The
>> > biggest profile users today include ralloc, BB layout, ifcvt, shrinkwrapping
>> > etc, but there should be room for improvement there too.
>>
>>
>>
>>> Thanks in advance!
>>>
>>> --
>>> Best Regards,
>>>
>>> Victor Leschuk | Software Engineer | Access Softek