[llvm-dev] [cfe-dev] llvm and clang are getting slower

Tue Mar 8 10:49:41 PST 2016

On Tue, Mar 8, 2016 at 10:22 AM, Daniel Berlin via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

>
>
> On Tue, Mar 8, 2016 at 9:55 AM, Hal Finkel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> ----- Original Message -----
>> > From: "Mehdi Amini via cfe-dev" <cfe-dev at lists.llvm.org>
>> > To: "Rafael Espíndola" <rafael.espindola at gmail.com>
>> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "cfe-dev" <
>> cfe-dev at lists.llvm.org>
>> > Sent: Tuesday, March 8, 2016 11:40:47 AM
>> > Subject: Re: [cfe-dev] [llvm-dev] llvm and clang are getting slower
>> >
>> > Hi Rafael,
>> >
>> > CC: cfe-dev
>> >
>> > Thanks for sharing. We also noticed this internally, and I know that
>> > Bruno and Chris are working on some infrastructure and tooling to
>> > help track compile-time regressions closely.
>> >
>> > We had this conversation internally about the tradeoff between
>> > compile-time and runtime performance, and I planned to bring up the
>> > topic on the list in the coming months; this looks like a good
>> > occasion to plant the seed. Apparently in the past (years or a decade
>> > ago?) the project was very conservative about adding any optimizations
>> > that would impact compile time; however, there is no explicit policy
>> > (that I know of) to address this tradeoff.
>> > The closest I could find would be what Chandler wrote in:
>> > http://reviews.llvm.org/D12826 ; for instance for O2 he stated that
>> > "if an optimization increases compile time by 5% or increases code
>> > size by 5% for a particular benchmark, that benchmark should also be
>> > one which sees a 5% runtime improvement".
>> >
>> > My hope is that with better tooling for tracking compile time in the
>> > future, we'll reach a state where we'll be able to consider
>> > "breaking" the compile-time regression test as important as breaking
>> > any test: i.e. the offending commit should be reverted unless it has
>> > been shown to significantly (hand wavy...) improve the runtime
>> > performance.
>> >
>> > <troll>
>> > With the current trend, the Polly developers don't have to worry
>> > about improving their compile time, we'll catch up with them ;)
>> > </troll>
>>
>> My two largest pet peeves in this area are:
>>
>
> I think you hit on something that I would expand on:
>
> We don't hold the line very well on adding little things to passes and
> analyses over time.
> We add 1000 little walkers and pattern matchers to try to get better code,
> and then often add knobs to try to control their overall compile time.
> At some point, these all add up. You end up with the same flat profile if
> you do this everywhere, but your compiler gets slower.
> At some point, someone has to stop and say "well, wait a minute, are there
> better algorithms or architecture we should be using to do this", and
> either do it, or not let it get worse :) I'd suggest, in most cases, we
> know better ways to do almost all of these things.
>
> Don't get me wrong, I don't believe there is any theoretically pure way to
> do everything that we can just implement and never have to tweak.  But it's
> a continuum, and at some point you have to stop and re-evaluate whether the
> current approach is really the right one if you have to add a billion
> little things to it to get what you want.
> We often don't do that.
> We go *very* far down the path of a billion tweaks and adding knobs, and
> what we have now, compile time wise, is what you get when you do that :)
> I suspect this is because we don't really want to try to force work on
> people who are just trying to get crap done.  We're all good contributors
> trying to do the right thing, and saying no often seems obstructionist, etc.
> The problem is at some point you end up with the tragedy of the commons.
>
> (also, not everything in the compiler has to catch every case to get good
> code)
>
>
>>  1. We often use functions from ValueTracking (to get known bits, the
>> number of sign bits, etc.) as though they're low cost. They're not really
>> low cost. The problem is that they *should* be. These functions do
>> bottom-up walks, and could cache their results. Instead, they do a limited
>> walk and recompute everything each time. This is expensive, and a
>> significant amount of our InstCombine time goes to ValueTracking, and that
>> shouldn't be the case. The more we add to InstCombine (and related passes),
>> and the more we run InstCombine, the worse this gets. On the other hand,
>> fixing this will help both compile time and code quality.
>>
>
> (LVI is another great example. Fun fact: If you ask for value info for
> everything, it's no longer lazy ....)
>

Yep -- see the bug Wei is working on:
https://llvm.org/bugs/show_bug.cgi?id=10584

David

>
>>   Furthermore, BasicAA has the same problem.
>>
>>  2. We have "cleanup" passes in the pipeline, such as those that run
>> after loop unrolling and/or vectorization, that run regardless of whether
>> the preceding pass actually did anything. We've been adding more of these,
>> and they catch important use cases, but we need a better infrastructure for
>> this (either with the new pass manager or otherwise).
>>
>> Also, I'm very hopeful that as our new MemorySSA and GVN improvements
>> materialize, we'll see large compile-time improvements from that work. We
>> spend a huge amount of time in GVN computing memory-dependency information
>> (this dwarfs the time spent by GVN doing actual value numbering work by an
>> order of magnitude or more).
>>
>
> I'm working on it ;)
>
>
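
For readers following along, here is a rough sketch of the kind of caching
Hal describes in point 1: a bottom-up known-bits walk that remembers its
results instead of recomputing them on every query. This is a toy, not LLVM
code. Node, Op, KnownBits and KnownBitsCache are hypothetical names made up
for this illustration and do not correspond to the real ValueTracking API,
and the sketch assumes the expression graph does not change between queries
(a real cache would need invalidation whenever the IR is modified).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy expression node (hypothetical, not llvm::Value).
enum class Op { Arg, Const, And, Or, Shl };

struct Node {
  Op op;
  uint32_t imm; // constant value for Const, shift amount for Shl
  int lhs;      // operand indices into the node table, -1 if unused
  int rhs;
};

// Bits proven to be zero and bits proven to be one.
struct KnownBits {
  uint32_t zero;
  uint32_t one;
};

class KnownBitsCache {
public:
  explicit KnownBitsCache(const std::vector<Node> &nodes) : nodes_(nodes) {}

  // Returns the known bits of node `id`, computing each node at most once.
  KnownBits get(int id) {
    auto it = cache_.find(id);
    if (it != cache_.end())
      return it->second;
    ++computed_;
    KnownBits kb = compute(id);
    cache_.emplace(id, kb);
    return kb;
  }

  // Must be called whenever the graph changes (assumption of this sketch).
  void invalidate() { cache_.clear(); }

  // Number of nodes actually computed so far (cache misses).
  int computed() const { return computed_; }

private:
  KnownBits compute(int id) {
    const Node &n = nodes_[id];
    switch (n.op) {
    case Op::Arg: // nothing is known about an opaque input
      return {0, 0};
    case Op::Const:
      return {~n.imm, n.imm};
    case Op::And: {
      KnownBits a = get(n.lhs), b = get(n.rhs);
      return {a.zero | b.zero, a.one & b.one};
    }
    case Op::Or: {
      KnownBits a = get(n.lhs), b = get(n.rhs);
      return {a.zero & b.zero, a.one | b.one};
    }
    case Op::Shl: {
      KnownBits a = get(n.lhs);
      uint32_t s = n.imm & 31; // toy: shift amounts are taken modulo 32
      return {(a.zero << s) | ((1u << s) - 1), a.one << s};
    }
    }
    return {0, 0};
  }

  const std::vector<Node> &nodes_;
  std::unordered_map<int, KnownBits> cache_;
  int computed_ = 0;
};
```

With a cache like this, a node shared by several users is walked once rather
than once per query; the hard part in a real compiler is keeping the cache
valid while passes rewrite the IR, which this sketch sidesteps.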
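
Hal's point 2 (cleanup passes that run whether or not the preceding pass did
anything) can be illustrated with a similarly small sketch. Again this is a
toy under stated assumptions, not the legacy or new pass manager: ToyFunction,
Pass, runPipeline, toyUnroll and toyCleanup are hypothetical names, and the
sketch assumes every pass reports honestly whether it changed the IR.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a function body (hypothetical, not llvm::Function).
struct ToyFunction {
  std::vector<int> insts;
};

struct Pass {
  std::string name;
  std::function<bool(ToyFunction &)> run; // returns true if the IR changed
  bool cleanupOnly; // only useful if the last transform changed something
};

// Runs the pipeline, skipping cleanup passes when the most recent
// non-cleanup pass reported no change. Returns the names of the passes
// that actually ran.
std::vector<std::string> runPipeline(ToyFunction &f,
                                     const std::vector<Pass> &passes) {
  std::vector<std::string> executed;
  bool dirty = false;
  for (const Pass &p : passes) {
    if (p.cleanupOnly && !dirty)
      continue; // nothing to clean up after
    bool changed = p.run(f);
    executed.push_back(p.name);
    if (!p.cleanupOnly)
      dirty = changed;
  }
  return executed;
}

// Toy "unroll": duplicates small bodies, leaves larger ones alone.
bool toyUnroll(ToyFunction &f) {
  if (f.insts.empty() || f.insts.size() > 4)
    return false;
  std::vector<int> copy = f.insts;
  f.insts.insert(f.insts.end(), copy.begin(), copy.end());
  return true;
}

// Toy "cleanup": removes no-op instructions, encoded here as 0.
bool toyCleanup(ToyFunction &f) {
  auto oldSize = f.insts.size();
  f.insts.erase(std::remove(f.insts.begin(), f.insts.end(), 0),
                f.insts.end());
  return f.insts.size() != oldSize;
}
```

The design choice here is that the gate depends only on the changed flag
returned by the preceding transform; a real pass manager would more likely
track which analyses or IR units were invalidated.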