[Patch] GVN fold conditional-branch on-the-fly

Mon Sep 9 06:42:51 PDT 2013

----- Original Message -----
> On Sun, Sep 8, 2013 at 10:01 PM, Nadav Rotem <nrotem at apple.com>
> wrote:
> > Hi,
> >
> > I agree with both of you. We should definitely evaluate this patch
> > just like
> > we evaluate any other patch (using a cost-benefit analysis of
> > performance
> > vs. compile-time impact averaged over LLVM's test suite).  I know
> > that
> > shuxin collected this data and he even presented some of it. I hope
> > that
> > more people would follow shuxin and present their performance data
> > as part
> > of the commit message or email or something. I also agree about the
> > need to
> > reduce GVN’s complexity. I know that shuxin is actively working on
> > cleaning
> > up GVN (see r181047 and  r180951 and r181532).  My opinion is that
> > this
> > patch does not add any significant complexity to GVN.
> 
> 
> 
> So let's look at it from a few perspectives:
> 
> GVN, including comments, is currently 2630 lines.
> This patch adds another 330 lines (including detailed comments, to be
> fair), or grows it by another 13%.

Just as a general statement, I dislike this line-counting argument. If the optimization is going to be implemented somewhere, the question is how many lines of code would it take in various different places (including adding a new pass). If the number of lines in the GVN file is too big, then we can split it into multiple files, but that's irrelevant to this addition (this is exactly what we did with InstCombine, which has nearly 20k lines in total -- and the various pieces of InstCombine in the different files certainly interact with each other in non-trivial ways). LLVM in total has nearly 1 million lines, and 330 is a small addition ;)

Perhaps it would be useful if Shuxin commented on the particular examples that motivated him to work on this, and what (if any) alternatives there are to implementing the transformation in GVN.

The remainder of your concerns are certainly legitimate, but I think it is important to separate those which can be solved by adding more comments (or other documentation), and/or trivial refactoring, and those that are more fundamental and cannot.

 -Hal

> 
> In doing so, it adds a bunch of new dominator graph manipulations
> (erasing subtrees), and a heuristic to "try to calculate the
> dominance
> frontier, but bail if it requires visiting too many CFG blocks (16
> currently)'", plus classify and discover "simple landing pads".
> It tries to classify a very specific set of conditions for branch
> conditionals that it wants to optimize, because the rest are "hard"
> in
> the current GVN, which wants to do everything on the fly.
> I can't see how this does not add significant complexity.
> 
> There are no compile time numbers included in shuxin's statement,
> though I agree the performance numbers look helpful, and agree with
> you that the overall presentation of the patch is something that
> should be applauded.
> 
> Assuming the compile time numbers look good, my problem here is still
> with this approach to adding this functionality.
> With all due respect to Shuxin, it is not a very clean approach.
> 
> While it will achieve some performance goal, it increases the
> maintenance burden of already complicated code.
> It's not currently clear what the compile time impact is.
> It's not clear why the heuristic DF approach has the right numbers,
> and what it limits in practice (IE why 16, would using 32 actually
> improve it, would using 8 really degrade it. ), and the compile time
> tradeoffs involved.
> Does requiring simple landing pads hurt us a lot?
> 
> To anyone who has to maintain this code, it's also not really clear
> exactly which cases it will get, and which it won't, just by looking
> at the code.
> There is no quantification of what we lose out through this approach.
> The entire argument for choosing this approach seems to be "GVN
> currently does everything on the fly, and relies on DCE happening
> during it's mixed analysis-elimination phases, or else it can't
> optimize later code because it's analysis is very weak.  This is all
> I
> could fit in this current framework without too much trouble" (which
> to me screams 'problem')
> It also adds exactly two real testcases (in one file) that show
> eliminations, and modifies a small number of others in a trivial way.
> 
> All this, to me, is a maintenance burden, and complexity.
> On the other side, we have a small set of performance numbers.
> 
> If GVN was a relatively simple pass, i'd say "sure, whatever, go for
> it". But as we've gone over in this thread, everyone agrees it is
> complicated.
> Adding even more complication, that catches some limited set of
> cases,
> does not seem to me to be worth it without a huge performance
> benefit,
> low compile time cost, and low maintenance burden.
> 
> Again, with all due respect to Shuxin's work, I just don't see it, at
> least, given what is currently presented.
> 
> 
> >  Does anybody else who
> > worked on GVN think that it adds complexity ? If not the Shuxin
> > should
> > continue.
> >
> > Thanks,
> > Nadav
> >
> >
> > On Sep 7, 2013, at 1:00 PM, Chandler Carruth <chandlerc at google.com>
> > wrote:
> >
> >
> > On Sat, Sep 7, 2013 at 12:48 PM, Hal Finkel <hfinkel at anl.gov>
> > wrote:
> >>
> >> > I think that we should evaluate this change just like any other:
> >> > using a cost-benefit analysis of performance vs. compile-time
> >> > impact
> >> > averaged over the test suite (and any other code bases we care
> >> > about). As a practical matter, we should not hold all
> >> > GVN-related
> >> > improvements hostage to competition of a GVN rewrite on which no
> >> > one
> >> > is actively working.
> >> > I don't think Danny was suggesting this.
> >> >
> >> >
> >> > There is a difference between holding all improvements hostage,
> >> > and
> >> > arguing against growing the complexity of a pass which is in
> >> > serious
> >> > need of refactoring/updating until that occurs. Especially when
> >> > the
> >> > reworking proposed is likely to achieve the same concrete goal
> >> > is
> >> > the added complexity.
> >> >
> >> >
> >> > I do think that GVN has likely reached the point where adding
> >> > new
> >> > layers of complexity to it significantly grow both the cost of
> >> > maintenance of GVN and the cost of doing a deep reworking of it.
> >> > I've watched folks try to do deep improvements to our GVN
> >> > infrastructure, and they have to spend more time trying to
> >> > replicate
> >> > all of the current tweaks, hacks, and complexity in it than they
> >> > do
> >> > improving the fundamentals of the pass.
> >>
> >> I think that this is not quite the right way to think about it,
> >> especially
> >> in the context of a nearly-complete rewrite.
> >
> >
> > The context I'm coming from is that of the current state of GVN. I
> > don't
> > know anything really about Danny's prototype other than the fact
> > that some
> > others looked at cleaning it up and making it work, and couldn't
> > get it
> > cleaned up and working well enough to justify the cost of adding.
> > Maybe
> > others will have better luck.
> >
> >>
> >> First, we need to answer the question: Is this fundamentally
> >> something
> >> that we need GVN to do? If it is, then it should be evaluated as I
> >> said. If
> >> not, then we need to figure out where (if anywhere) is a better
> >> place.
> >
> >
> > Sure, but that's totally orthogonal.
> >
> >>
> >> Ideally, we should really start to review Danny's GVN rewrite
> >> (even though
> >> it may, as he says, need a rewrite itself), so that we can at
> >> least all get
> >> on the same page regarding what the new version does, how it does
> >> it, and
> >> what functional improvements will be necessary prior to the start
> >> of
> >> more-extensive testing.
> >
> >
> > If you have time for this, cool. But that doesn't actually impact
> > my stance
> > here in either direction, and should really be a separate
> > discussion.
> >
> >>
> >> Maintenance of the current GVN is obviously a concern, but if we
> >> block
> >> improvements to GVN now because GVN needs to be redone, we loose
> >> the ability
> >> of the current GVN to serve as the best possible reference
> >> implementation to
> >> which we should compare the new GVN. The lost productivity from
> >> failing to
> >> do that (failing to capture the work put into isolating
> >> performance
> >> problems, mitigating them, and building up the corresponding test
> >> cases, and
> >> then building on those improvements) is likely to be greater than
> >> the extra
> >> maintenance burden (IMHO).
> >
> >
> > I think you are continuing to misinterpret what I'm saying.
> >
> > I do not want to block anything. I think that the current
> > maintenance
> > challenges of GVN and the scope creep that has occurred in the set
> > of
> > optimizations it performs will make the costs in your cost-benefit
> > analysis
> > extremely high for changes which add new functionality to GVN. That
> > doesn't
> > mean they couldn't be demonstrated as worth while as it remains a
> > tradeoff.
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory