[cfe-dev] [llvm-dev] llvm and clang are getting slower
Nico Weber via cfe-dev
cfe-dev at lists.llvm.org
Tue Mar 8 11:11:29 PST 2016
On a somewhat smaller (but hopefully more actionable) scale, we noticed
that build time regressed ~10% recently in 262315:262447. I'm still trying
to repro locally (no luck so far; maybe it's a bot config thing, not a
clang-side problem), but if this rings a bell to anyone, please let me know.
On Tue, Mar 8, 2016 at 1:49 PM, Xinliang David Li via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> On Tue, Mar 8, 2016 at 10:22 AM, Daniel Berlin via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>> On Tue, Mar 8, 2016 at 9:55 AM, Hal Finkel via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>> ----- Original Message -----
>>> > From: "Mehdi Amini via cfe-dev" <cfe-dev at lists.llvm.org>
>>> > To: "Rafael Espíndola" <rafael.espindola at gmail.com>
>>> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "cfe-dev" <
>>> cfe-dev at lists.llvm.org>
>>> > Sent: Tuesday, March 8, 2016 11:40:47 AM
>>> > Subject: Re: [cfe-dev] [llvm-dev] llvm and clang are getting slower
>>> > Hi Rafael,
>>> > CC: cfe-dev
>>> > Thanks for sharing. We also noticed this internally, and I know that
>>> > Bruno and Chris are working on some infrastructure and tooling to
>>> > help track compile-time regressions closely.
>>> > We had this conversation internally about the tradeoff between
>>> > compile-time and runtime performance, and I planned to bring up the
>>> > topic on the list in the coming months; this looks like a good
>>> > occasion to plant the seed. Apparently in the past (years or a decade
>>> > ago?) the project was very conservative about adding any optimizations
>>> > that would impact compile time; however, there is no explicit policy
>>> > (that I know of) to address this tradeoff.
>>> > The closest I could find would be what Chandler wrote in:
>>> > http://reviews.llvm.org/D12826 ; for instance for O2 he stated that
>>> > "if an optimization increases compile time by 5% or increases code
>>> > size by 5% for a particular benchmark, that benchmark should also be
>>> > one which sees a 5% runtime improvement".
>>> > My hope is that with better tooling for tracking compile time in the
>>> > future, we'll reach a state where we'll be able to consider
>>> > "breaking" the compile-time regression test as important as breaking
>>> > any test: i.e. the offending commit should be reverted unless it has
>>> > been shown to significantly (hand wavy...) improve the runtime
>>> > performance.
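The O2 guideline quoted above could be turned into a mechanical check roughly as follows. This is only a sketch of one possible reading (the runtime win should be at least as large as the bigger of the two costs); the struct, its field names, and the function name are hypothetical and do not come from D12826 or from any LLVM tooling.

```cpp
#include <algorithm>

// Hypothetical per-benchmark measurements, in percent relative to a baseline.
// Positive compileTimeIncrease / codeSizeIncrease mean "got worse";
// positive runtimeImprovement means "got faster".
struct BenchmarkDelta {
  double compileTimeIncrease;
  double codeSizeIncrease;
  double runtimeImprovement;
};

// Assumed interpretation of the guideline: a change pays for itself on a
// benchmark when its runtime improvement is at least as large as the larger
// of its compile-time and code-size costs. Negative costs count as zero.
bool paysForItself(const BenchmarkDelta &d) {
  double cost = std::max({0.0, d.compileTimeIncrease, d.codeSizeIncrease});
  return d.runtimeImprovement >= cost;
}
```

A regression bot could call something like `paysForItself` per benchmark and flag a commit for discussion when it returns false. What counts as "significantly" better is still the hand-wavy part, and a real threshold would need to allow for measurement noise.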
>>> > <troll>
>>> > With the current trend, the Polly developers don't have to worry
>>> > about improving their compile time, we'll catch up with them ;)
>>> > </troll>
>>> My two largest pet peeves in this area are:
>> I think you hit on something that i would expand on:
>> We don't hold the line very well on adding little things to passes and
>> analysis over time.
>> We add 1000 little walkers and pattern matchers to try to get better
>> code, and then often add knobs to try to control their overall compile time.
>> At some point, these all add up. You end up with the same flat profile if
>> you do this everywhere, but your compiler gets slower.
>> At some point, someone has to stop and say "well, wait a minute, are
>> there better algorithms or architecture we should be using to do this", and
>> either do it, or not let it get worse :) I'd suggest, in most cases, we
>> know better ways to do almost all of these things.
>> Don't get me wrong, i don't believe there is any theoretically pure way
>> to do everything that we can just implement and never have to tweak. But
>> it's a continuum, and at some point you have to stop and re-evaluate
>> whether the current approach is really the right one if you have to add a
>> billion little things to it to get what you want.
>> We often don't do that.
>> We go *very* far down the path of a billion tweaks and adding knobs, and
>> what we have now, compile time wise, is what you get when you do that :)
>> I suspect this is because we don't really want to try to force work on
>> people who are just trying to get crap done. We're all good contributors
>> trying to do the right thing, and saying no often seems obstructionist, etc.
>> The problem is at some point you end up with the tragedy of the commons.
>> (also, not everything in the compiler has to catch every case to get good
>>> 1. We often use functions from ValueTracking (to get known bits, the
>>> number of sign bits, etc.) as though they're low cost. They're not really
>>> low cost. The problem is that they *should* be. These functions do
>>> bottom-up walks, and could cache their results. Instead, they do a limited
>>> walk and recompute everything each time. This is expensive, and a
>>> significant amount of our InstCombine time goes to ValueTracking, and that
>>> shouldn't be the case. The more we add to InstCombine (and related passes),
>>> and the more we run InstCombine, the worse this gets. On the other hand,
>>> fixing this will help both compile time and code quality.
>> (LVI is another great example. Fun fact: If you ask for value info for
>> everything, it's no longer lazy ....)
> Yep -- see the bug Wei is working on:
>>> Furthermore, BasicAA has the same problem.
>>> 2. We have "cleanup" passes in the pipeline, such as those that run
>>> after loop unrolling and/or vectorization, that run regardless of whether
>>> the preceding pass actually did anything. We've been adding more of these,
>>> and they catch important use cases, but we need a better infrastructure for
>>> this (either with the new pass manager or otherwise).
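The caching idea from item 1 above (a bottom-up walk that remembers its results instead of recomputing them on every query) can be illustrated on a toy expression graph. This is a minimal sketch, not LLVM's ValueTracking: `Node`, `KnownZeroCache`, and the `computed` counter are hypothetical names made up for the example, and the only analysis modelled is "which bits are known to be zero" for 32-bit values.

```cpp
#include <cstdint>
#include <unordered_map>

// Toy expression node; a stand-in for an IR value, not an LLVM class.
struct Node {
  enum Kind { Const, Unknown, And, Or, Shl } kind;
  uint32_t imm;     // constant value (Const) or shift amount (Shl)
  const Node *lhs;  // first operand, if any
  const Node *rhs;  // second operand, if any
};

// Computes a mask of bits known to be zero, memoizing the result per node so
// that shared operands and repeated queries are answered from the cache.
class KnownZeroCache {
public:
  uint32_t get(const Node *n) {
    auto it = cache.find(n);
    if (it != cache.end())
      return it->second;  // cache hit: no walk needed
    ++computed;
    uint32_t kz = compute(n);
    cache.emplace(n, kz);
    return kz;
  }

  // A real client would have to drop stale entries whenever a node is
  // rewritten; this sketch simply forgets everything.
  void invalidate() { cache.clear(); }

  unsigned computed = 0;  // number of nodes actually walked (cache misses)

private:
  uint32_t compute(const Node *n) {
    switch (n->kind) {
    case Node::Const:
      return ~n->imm;                    // every clear bit is known zero
    case Node::Unknown:
      return 0;                          // nothing is known
    case Node::And:
      return get(n->lhs) | get(n->rhs);  // zero in either operand
    case Node::Or:
      return get(n->lhs) & get(n->rhs);  // zero in both operands
    case Node::Shl: {
      uint32_t k = n->imm & 31;          // keep the shift amount in range
      return (get(n->lhs) << k) | ((uint32_t{1} << k) - 1);
    }
    }
    return 0;
  }

  std::unordered_map<const Node *, uint32_t> cache;
};
```

With a single `KnownZeroCache` shared across queries, asking about a node whose operands were already analysed costs one hash lookup per operand rather than a fresh walk. The hard part in a real compiler, which the sketch sidesteps with `invalidate()`, is keeping the cache correct while passes such as InstCombine keep mutating the code.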
>>> Also, I'm very hopeful that as our new MemorySSA and GVN improvements
>>> materialize, we'll see large compile-time improvements from that work. We
>>> spend a huge amount of time in GVN computing memory-dependency information
>>> (this dwarfs the time spent by GVN doing actual value numbering work by an
>>> order of magnitude or more).
>> I'm a working on it ;)
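Item 2 above (cleanup passes that run whether or not the preceding pass changed anything) suggests a pipeline in which each pass reports whether it modified the code and cleanup passes are skipped when nothing is dirty. The following is a rough sketch of that idea only. `Pass`, `cleanupOnly`, and `runPipeline` are hypothetical names, and this is not how either LLVM pass manager is implemented.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical pass description for the sketch.
struct Pass {
  std::string name;
  std::function<bool()> run;  // returns true if the pass changed the code
  bool cleanupOnly;           // true: only worth running after a change
};

// Runs the pipeline in order and returns the names of the passes that
// actually executed. A cleanup pass is skipped unless some non-cleanup pass
// has reported a change since the last cleanup ran.
std::vector<std::string> runPipeline(const std::vector<Pass> &passes) {
  std::vector<std::string> executed;
  bool dirty = false;
  for (const Pass &p : passes) {
    if (p.cleanupOnly && !dirty)
      continue;  // nothing to clean up, so skip the pass entirely
    bool changed = p.run();
    executed.push_back(p.name);
    if (p.cleanupOnly)
      dirty = false;  // assume the cleanup consumed the pending changes
    else
      dirty = dirty || changed;
  }
  return executed;
}
```

For example, if an unrolling pass reports no change, the cleanup scheduled after it is skipped, while a later vectorization pass that does report a change triggers its own cleanup. A single `dirty` flag is the coarsest possible tracking; finer-grained information (per function or per loop) would skip more work.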
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org