[PATCH] D15449: [PassManagerBuilder] Add a few more scalar optimization passes

Mon Dec 14 08:54:39 PST 2015

Hi Hal, Mehdi,

Hal: I checked and as I suspected, SCCP, instcombine, simplifycfg and
bdce/adce all preserve GlobalsAA.

Mehdi: I have numbers for you :)

Running test-suite -flto on an AArch64 out-of-order platform:

  lnt.MultiSource/Benchmarks/FreeBench/analyzer/analyzer    2.57%
    <http://llvm-test.cambridge.arm.com:8000/db_default/v4/nts/3534/graph?test.188=3>
  lnt.MultiSource/Benchmarks/McCat/17-bintr/bintr          -5.34%
    <http://llvm-test.cambridge.arm.com:8000/db_default/v4/nts/3534/graph?test.151=3>
  lnt.MultiSource/Benchmarks/Ptrdist/bc/bc                 -1.30%
    <http://llvm-test.cambridge.arm.com:8000/db_default/v4/nts/3534/graph?test.72=3>

So not much change (certainly less change than I expected), but overall
positive.

On a third party benchmark I get improvements ranging from 1%-11%, and
regressions from 1% to 3% (with more improvements than regressions). I also
happen to know that when a patch under review goes in, an edge case in this
suite gets triggered and one testcase doubles in performance.

I ran compile time numbers too. My test was codegenning/linking llvm-tblgen
using -flto on a MacBook Pro:
  with patch:    34.64s, 33.62s, 33.89s, 33.33s, 33.80s - median: 33.80s
  without patch: 34.26s, 34.49s, 33.54s, 31.89s, 32.57s - median: 33.54s

Difference in medians: 0.78% slower with the patch (the two sets of samples
overlap so much that it would take more runs for this to be statistically
significant, though!)
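
For anyone who wants to check the arithmetic, here is a minimal sketch that
recomputes the medians and the relative difference from the ten timings
quoted above. The variable and function names are mine, purely for
illustration; nothing here comes from the patch itself.

```python
from statistics import median

# Wall-clock times in seconds for codegenning/linking llvm-tblgen with -flto,
# copied from the runs listed above.
with_patch = [34.64, 33.62, 33.89, 33.33, 33.80]
without_patch = [34.26, 34.49, 33.54, 31.89, 32.57]


def relative_change_pct(base, new):
    """Percentage change of `new` relative to `base` (positive means slower)."""
    return (new - base) / base * 100.0


med_with = median(with_patch)
med_without = median(without_patch)
slowdown_pct = relative_change_pct(med_without, med_with)

# The two sample ranges overlap almost entirely, which is why five runs
# per configuration are not enough to call the difference significant.
ranges_overlap = (min(with_patch) <= max(without_patch)
                  and min(without_patch) <= max(with_patch))

print(f"median with patch:    {med_with:.2f}s")
print(f"median without patch: {med_without:.2f}s")
print(f"difference in medians: {slowdown_pct:.2f}%")
print(f"sample ranges overlap: {ranges_overlap}")
```

The median is used rather than the mean so that a single outlier (such as
the 31.89s run) does not dominate the comparison.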

Cheers,

James

On Fri, 11 Dec 2015 at 18:05 Mehdi Amini <mehdi.amini at apple.com> wrote:

> On Dec 11, 2015, at 8:19 AM, James Molloy <james at jamesmolloy.co.uk> wrote:
>
> Hi,
>
> > - I'd rather see this as two patches: one for the GlobalOpt and the
> other for the scalar optimizations
>
> Sure, that's easily done. Would you prefer me to open another phab review,
> or are you happy with it being committed split apart?
>
>
> It is more about the commit. So that the performance can be assessed
> separately and any issue would be better bisected.
>
>
> > - Do you have benchmark results before/after?
>
> Yes and no. The mem2reg changes do affect benchmarks I care about, but
> they're not in test-suite and I'm not allowed to quote numbers from them.
>
> I don't have an LTO setup of the test-suite to get numbers for the LTO
> portions either (although I do have LTO set up for third party test suites
> that I can't quote numbers from!). I haven't seen any regressions in any
> test, and some improve drastically. Sorry for the weasel words.
>
>
> Can you at least give an overview (without naming), like “on some internal
> benchmarks it improves XX% on average, with XX test cases that regressed
> around ~XX%” ?
>
>
>
> As a general principle, I think the LTO driver isn't currently doing
> enough scalar optimization. I've seen several cases where really poor code
> gets through to late passes like CGP purely because SimplifyCFG/InstCombine
> weren't run enough.
>
>
>
> Clearly, the problem is the tradeoff with the compile time.
>
>
> > - See also: http://reviews.llvm.org/D13443 ; I paused my work on this
> till January because of the ThinLTO bringup, but I still plan to move
> forward with it.
>
> This looks good. It looks like a real reengineering of the pipeline, which
> is a bit more work than I was hoping to bite off - I hope that my work
> might go some way towards improving the LTO codegen without requiring
> thousands of benchmarking hours to check it's OK!
>
>
>
> Indeed, I spent some hundreds of hours benchmarking in September. I’d be
> happy if you could test D13443 on your hardware/bench by the way :)
>
>
>
> (Aside, in D13443 you don't run GlobalOpt/Mem2Reg early. I think
> functionattrs+globalopt+mem2reg needs to run as early as possible so that
> demoted globals become first class SSA values for the whole of the pass
> pipeline).
>
>
> Note that GlobalOpt *also* needs to run after the inliner, because it can
> do more work then. But again, compile time...
>
> —
> Mehdi
>
>
>
>
>
> James
>
> On Fri, 11 Dec 2015 at 16:08 Mehdi AMINI via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> joker.eph added a comment.
>>
>> Hi James,
>>
>> A few points:
>>
>> - I'd rather see this as two patches: one for the GlobalOpt and the other
>> for the scalar optimizations
>> - Do you have benchmark results before/after?
>> - See also: http://reviews.llvm.org/D13443 ; I paused my work on this
>> till January because of the ThinLTO bringup, but I still plan to move
>> forward with it.
>>
>> Thanks!
>>
>>
>> Repository:
>>   rL LLVM
>>
>> http://reviews.llvm.org/D15449
>>
>