[Patch] GVN fold conditional-branch on-the-fly

Mon Sep 9 01:00:56 PDT 2013

On Sun, Sep 8, 2013 at 10:01 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi,
>
> I agree with both of you. We should definitely evaluate this patch just like
> we evaluate any other patch (using a cost-benefit analysis of performance
> vs. compile-time impact averaged over LLVM's test suite).  I know that
> shuxin collected this data and he even presented some of it. I hope that
> more people would follow shuxin and present their performance data as part
> of the commit message or email or something. I also agree about the need to
> reduce GVN’s complexity. I know that shuxin is actively working on cleaning
> up GVN (see r181047 and  r180951 and r181532).  My opinion is that this
> patch does not add any significant complexity to GVN.

So let's look at it from a few perspectives:

GVN, including comments, is currently 2630 lines.
This patch adds another 330 lines (including detailed comments, to be
fair), or grows it by another 13%.

In doing so, it adds a bunch of new dominator graph manipulations
(erasing subtrees), and a heuristic to "try to calculate the dominance
frontier, but bail if it requires visiting too many CFG blocks (16
currently)'", plus classify and discover "simple landing pads".
It tries to classify a very specific set of conditions for branch
conditionals that it wants to optimize, because the rest are "hard" in
the current GVN, which wants to do everything on the fly.
I can't see how this does not add significant complexity.

There are no compile time numbers included in shuxin's statement,
though I agree the performance numbers look helpful, and agree with
you that the overall presentation of the patch is something that
should be applauded.

Assuming the compile time numbers look good, my problem here is still
with this approach to adding this functionality.
With all due respect to Shuxin, it is not a very clean approach.

While it will achieve some performance goal, it increases the
maintenance burden of already complicated code.
It's not currently clear what the compile time impact is.
It's not clear why the heuristic DF approach has the right numbers,
and what it limits in practice (IE why 16, would using 32 actually
improve it, would using 8 really degrade it. ), and the compile time
tradeoffs involved.
Does requiring simple landing pads hurt us a lot?

To anyone who has to maintain this code, it's also not really clear
exactly which cases it will get, and which it won't, just by looking
at the code.
There is no quantification of what we lose out through this approach.
The entire argument for choosing this approach seems to be "GVN
currently does everything on the fly, and relies on DCE happening
during it's mixed analysis-elimination phases, or else it can't
optimize later code because it's analysis is very weak.  This is all I
could fit in this current framework without too much trouble" (which
to me screams 'problem')
It also adds exactly two real testcases (in one file) that show
eliminations, and modifies a small number of others in a trivial way.

All this, to me, is a maintenance burden, and complexity.
On the other side, we have a small set of performance numbers.

If GVN was a relatively simple pass, i'd say "sure, whatever, go for
it". But as we've gone over in this thread, everyone agrees it is
complicated.
Adding even more complication, that catches some limited set of cases,
does not seem to me to be worth it without a huge performance benefit,
low compile time cost, and low maintenance burden.

Again, with all due respect to Shuxin's work, I just don't see it, at
least, given what is currently presented.

>  Does anybody else who
> worked on GVN think that it adds complexity ? If not the Shuxin should
> continue.
>
> Thanks,
> Nadav
>
>
> On Sep 7, 2013, at 1:00 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
>
> On Sat, Sep 7, 2013 at 12:48 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> > I think that we should evaluate this change just like any other:
>> > using a cost-benefit analysis of performance vs. compile-time impact
>> > averaged over the test suite (and any other code bases we care
>> > about). As a practical matter, we should not hold all GVN-related
>> > improvements hostage to competition of a GVN rewrite on which no one
>> > is actively working.
>> > I don't think Danny was suggesting this.
>> >
>> >
>> > There is a difference between holding all improvements hostage, and
>> > arguing against growing the complexity of a pass which is in serious
>> > need of refactoring/updating until that occurs. Especially when the
>> > reworking proposed is likely to achieve the same concrete goal is
>> > the added complexity.
>> >
>> >
>> > I do think that GVN has likely reached the point where adding new
>> > layers of complexity to it significantly grow both the cost of
>> > maintenance of GVN and the cost of doing a deep reworking of it.
>> > I've watched folks try to do deep improvements to our GVN
>> > infrastructure, and they have to spend more time trying to replicate
>> > all of the current tweaks, hacks, and complexity in it than they do
>> > improving the fundamentals of the pass.
>>
>> I think that this is not quite the right way to think about it, especially
>> in the context of a nearly-complete rewrite.
>
>
> The context I'm coming from is that of the current state of GVN. I don't
> know anything really about Danny's prototype other than the fact that some
> others looked at cleaning it up and making it work, and couldn't get it
> cleaned up and working well enough to justify the cost of adding. Maybe
> others will have better luck.
>
>>
>> First, we need to answer the question: Is this fundamentally something
>> that we need GVN to do? If it is, then it should be evaluated as I said. If
>> not, then we need to figure out where (if anywhere) is a better place.
>
>
> Sure, but that's totally orthogonal.
>
>>
>> Ideally, we should really start to review Danny's GVN rewrite (even though
>> it may, as he says, need a rewrite itself), so that we can at least all get
>> on the same page regarding what the new version does, how it does it, and
>> what functional improvements will be necessary prior to the start of
>> more-extensive testing.
>
>
> If you have time for this, cool. But that doesn't actually impact my stance
> here in either direction, and should really be a separate discussion.
>
>>
>> Maintenance of the current GVN is obviously a concern, but if we block
>> improvements to GVN now because GVN needs to be redone, we loose the ability
>> of the current GVN to serve as the best possible reference implementation to
>> which we should compare the new GVN. The lost productivity from failing to
>> do that (failing to capture the work put into isolating performance
>> problems, mitigating them, and building up the corresponding test cases, and
>> then building on those improvements) is likely to be greater than the extra
>> maintenance burden (IMHO).
>
>
> I think you are continuing to misinterpret what I'm saying.
>
> I do not want to block anything. I think that the current maintenance
> challenges of GVN and the scope creep that has occurred in the set of
> optimizations it performs will make the costs in your cost-benefit analysis
> extremely high for changes which add new functionality to GVN. That doesn't
> mean they couldn't be demonstrated as worth while as it remains a tradeoff.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>