[llvm-dev] llvm is getting slower, January edition

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 18 22:00:18 PST 2017


Am I reading this right that over the course of the graph we have gotten
about 50% slower compiling this benchmark, and the execution time of the
benchmark has tripled? Those are significant regressions along both
dimensions.

-- Sean Silva

On Wed, Jan 18, 2017 at 3:35 PM, Mikhail Zolotukhin via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On Jan 18, 2017, at 3:21 PM, Jonathan Roelofs <jonathan at codesourcery.com>
> wrote:
>
>
>
> On 1/18/17 3:55 PM, Davide Italiano via llvm-dev wrote:
>
> On Tue, Jan 17, 2017 at 6:02 PM, Mikhail Zolotukhin
> <mzolotukhin at apple.com> wrote:
>
> Hi,
>
> Continuing recent efforts to understand compile-time slowdowns, I
> looked at some historical data: I picked one test and tried to pinpoint
> commits that affected its compile time. The data I have is not 100%
> accurate, but hopefully it provides an overview of what's going on
> with compile time in LLVM and a better understanding of which changes
> usually impact compile time.
>
> Configuration:
> The test I used is tramp3d-v4 from the LLVM test-suite. It consists of a single
> source file, but still takes a noticeable time to compile, which makes it
> very convenient for this kind of experiment. The file was compiled with -Os
> for arm64 on an x86 host.
>
> Results:
> The attached PDF has a compile-time graph, on which I marked points where
> compile time changed, with a list of corresponding commits. A textual
> version of the list is available below, but I think it might be much harder
> to comprehend the data without the graph. The number at the end shows the
> compile-time change after the given commit:
>
> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
> ignored. +1%
> 2. r241886: [InstCombine] Employ AliasAnalysis in
> FindAvailableLoadedValue. +1%
> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis for
> sub, mul and shl. +2%
> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI. -1%
> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
> simpler... +1%
>   r247240: [LPM] Use a map from analysis ID to immutable passes in the
> legacy pass manager... +3%
>   r247264: Enable GlobalsAA by default. +1%
> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit
> trip counts'. +2%
> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`;
> NFCI. +4%
> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
> 11. No data
> 12. r259252: AttributeSetImpl: Summarize existing function attributes in a
> bitset. -1%
>    r259256: Add LoopSimplifyCFG pass. -2%
> 13. r262250: Enable LoopLoadElimination by default. +3%
> 14. r262839: Revert "Enable LoopLoadElimination by default". -3%
> 15. r263393: Remove PreserveNames template parameter from IRBuilder. -3%
> 16. r263595: Turn LoopLoadElimination on again. +3%
> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata. +4%
> 18. r268509: Do not disable completely loop unroll when optimizing for
> size. -34%
> 19. r269124: Loop unroller: set thresholds for optsize and minsize
> functions to zero. +50%
> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling analysis by
> default" one more time. -28%
> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>    r270959: Don't allocate unnecessarily in APInt::operator[+-].  NFC. -1%
>    r271020: Don't generate unnecessary signed ConstantRange during
> multiply.  NFC. -3%
> 23. r271615: [LoopUnroll] Set correct thresholds for new recently enabled
> unrolling heuristic. +22%
> 24. r276942: Don't invoke getName() from Function::isIntrinsic(). -1%
>    r277087: Revert "Don't invoke getName() from Function::isIntrinsic().",
> rL276942. +1%
> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing for
> size.
> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
> 27. r289755: Make processing @llvm.assume more efficient by using operand
> bundles. +6%
> 28. r290086: Revert @llvm.assume with operator bundles (r289755-r289757).
> -6%
>
>
> Disclaimer:
> The data is specific to this particular test, so I could have skipped
> some commits affecting compile time on other workloads/configurations.
> The data I have is not perfect, so I could have missed some commits even
> if they impacted compile time on this test case.
> The same commits might have a different impact on a different
> test/configuration, possibly even the opposite of what is listed here.
> I didn't mean to label any commits as 'good' or 'bad' by posting these
> numbers. It's expected that some commits increase compile time; we just
> need to be aware of it and avoid unnecessary slowdowns.
>
> Conclusions:
> Changes in optimization thresholds/cost-models usually have the biggest
> impact on compile time. However, they are usually well assessed, and the
> trade-offs are discussed and agreed upon.
> Introducing a new pass doesn't necessarily mean a compile-time slowdown.
> Sometimes the total compile time might even decrease because we save some
> work for later passes.
> There are many commits that individually have a low compile-time impact
> but together add up to a noticeable slowdown.
> Conscious efforts to reduce compile time definitely help - thanks to
> everyone who's been working on this!
>
> Thanks for reading, any comments or suggestions on how to make LLVM faster
> are welcome! I hope we'll see this graph going down this year :-)
>
> Michael
>
>
> This is great, thanks for the January update :)
> Do you mind sharing how you collected the numbers (script, etc.) and
> how you plotted the graph, so I can try repeating this at home with my
> own test cases?
>
>
> Out of pure curiosity, I would love to see the performance of the
> resulting binary co-plotted on the same horizontal axis as this
> compile-time data.
>
> LNT doesn't allow plotting them on the same graph, so the best I could do
> was to align one with the other:
>
> Though I don't know how representative this test is of runtime performance
> (but the jump at the beginning looks very interesting).
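>
> For offline plotting, a minimal sketch along these lines could align the
> two series on a shared revision axis (assuming both series have been
> exported, e.g. from LNT, as "revision,value" CSV files; the file names
> are placeholders):
>
> # Minimal sketch: plot compile time and execution time in two panels
> # that share the revision axis.
> import csv
> import matplotlib.pyplot as plt
>
> def read_series(path):
>     with open(path) as f:
>         rows = sorted((int(r), float(v)) for r, v in csv.reader(f))
>     return [r for r, _ in rows], [v for _, v in rows]
>
> rev_c, compile_time = read_series("tramp3d_compile_time.csv")
> rev_x, exec_time = read_series("tramp3d_exec_time.csv")
>
> fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
> ax1.plot(rev_c, compile_time)
> ax1.set_ylabel("compile time (s)")
> ax2.plot(rev_x, exec_time)
> ax2.set_ylabel("execution time (s)")
> ax2.set_xlabel("revision")
> fig.savefig("tramp3d_trends.png")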
>
> Michael
>
>
>
> Jon
>
>
> Thanks,
>
>
> --
> Jon Roelofs
> jonathan at codesourcery.com
> CodeSourcery / Mentor Embedded
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>