[Patch] GVN fold conditional-branch on-the-fly

Mon Sep 9 10:25:59 PDT 2013

On 9/9/13 1:00 AM, Daniel Berlin wrote:
> On Sun, Sep 8, 2013 at 10:01 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> Hi,
>>
>> I agree with both of you. We should definitely evaluate this patch just like
>> we evaluate any other patch (using a cost-benefit analysis of performance
>> vs. compile-time impact averaged over LLVM's test suite).  I know that
>> shuxin collected this data and he even presented some of it. I hope that
>> more people would follow shuxin and present their performance data as part
>> of the commit message or email or something. I also agree about the need to
>> reduce GVN’s complexity. I know that shuxin is actively working on cleaning
>> up GVN (see r181047 and  r180951 and r181532).  My opinion is that this
>> patch does not add any significant complexity to GVN.
>
>
> So let's look at it from a few perspectives:
>
> GVN, including comments, is currently 2630 lines.
It is 2630 LOCs,  and the code is pretty straightforward. Dose not that 
contradict to
your statement GVN is complicated?

I believe you think your rudimentary GVN is simpler, but it has 4130 LOCs.
If you were to quantify complexity by LOCs, it seems you contradict to 
yourself.

> This patch adds another 330 lines (including detailed comments, to be
> fair), or grows it by another 13%.
Most of them are simple helper routines. The most complicated function
is to find the if-then-else pattern.

You claim maintaining Dom-Tree is big burden, however, the code involved
are merely about 10 LOCs.

>
> In doing so, it adds a bunch of new dominator graph manipulations
> (erasing subtrees), and a heuristic to "try to calculate the dominance
> frontier, but bail if it requires visiting too many CFG blocks (16
> currently)'", plus classify and discover "simple landing pads".
> It tries to classify a very specific set of conditions for branch
> conditionals that it wants to optimize, because the rest are "hard" in
> the current GVN, which wants to do everything on the fly.
> I can't see how this does not add significant complexity.
>
> There are no compile time numbers included in shuxin's statement,
> though I agree the performance numbers look helpful, and agree with
> you that the overall presentation of the patch is something that
> should be applauded.
It only  depending on how many conditional branch can be folded.
If no such branch, this code has no compile-time impact at all.
If one such branch is found, it only look forward for 16 blocks, that is.

>
> Assuming the compile time numbers look good,
It should be nothing.

> my problem here is still
> with this approach to adding this functionality.
> With all due respect to Shuxin, it is not a very clean approach.
I'm open to any approach. How about you figure out the right way to do it.

>
> While it will achieve some performance goal, it increases the
> maintenance burden of already complicated code.
First of all, it is not complicated code.
2nd, my patch dose not tightly coupled with the existing code.
It kick in one place, and clean up (delete dead block) at the end, that 
is all.

> It's not currently clear what the compile time impact is.
> It's not clear why the heuristic DF approach has the right numbers,
> and what it limits in practice (IE why 16, would using 32 actually
> improve it, would using 8 really degrade it. ), and the compile time
> tradeoffs involved.
> Does requiring simple landing pads hurt us a lot?
It dose catch many cases. I forget to mention one benchmark I improve
the parser in CPU2Kint, it see diff (77s vs 75.6).  I have not yet got 
chance
to test more benchmarks.

>
> To anyone who has to maintain this code, it's also not really clear
> exactly which cases it will get, and which it won't, just by looking
> at the code.

I'm sure I understand.  Let me try to answer.  It tries to catch the 
case containing
conditional branch which proved to holding constant condition during GVN 
time.
The function name says all.

> There is no quantification of what we lose out through this approach.
> The entire argument for choosing this approach seems to be "GVN
> currently does everything on the fly, and relies on DCE happening
> during it's mixed analysis-elimination phases, or else it can't
> optimize later code because it's analysis is very weak.  This is all I
> could fit in this current framework without too much trouble" (which
> to me screams 'problem')
> It also adds exactly two real testcases (in one file) that show
> eliminations, and modifies a small number of others in a trivial way.
>
> All this, to me, is a maintenance burden, and complexity.
> On the other side, we have a small set of performance numbers.
I would be surprise, optimization like this would speedup programs
across the board.  If any incremental improvement can achieve this goal,
CPU architects have no values any more.

>
> If GVN was a relatively simple pass,
It is, 2.6k LOCs. Too complicated.

>   i'd say "sure, whatever, go for
> it". But as we've gone over in this thread, everyone agrees it is
> complicated.
> Adding even more complication, that catches some limited set of cases,
> does not seem to me to be worth it without a huge performance benefit,
> low compile time cost, and low maintenance burden.
>
> Again, with all due respect to Shuxin's work, I just don't see it, at
> least, given what is currently presented.
>
>
>>   Does anybody else who
>> worked on GVN think that it adds complexity ? If not the Shuxin should
>> continue.
>>
>> Thanks,
>> Nadav
>>
>>
>> On Sep 7, 2013, at 1:00 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>
>>
>> On Sat, Sep 7, 2013 at 12:48 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>> I think that we should evaluate this change just like any other:
>>>> using a cost-benefit analysis of performance vs. compile-time impact
>>>> averaged over the test suite (and any other code bases we care
>>>> about). As a practical matter, we should not hold all GVN-related
>>>> improvements hostage to competition of a GVN rewrite on which no one
>>>> is actively working.
>>>> I don't think Danny was suggesting this.
>>>>
>>>>
>>>> There is a difference between holding all improvements hostage, and
>>>> arguing against growing the complexity of a pass which is in serious
>>>> need of refactoring/updating until that occurs. Especially when the
>>>> reworking proposed is likely to achieve the same concrete goal is
>>>> the added complexity.
>>>>
>>>>
>>>> I do think that GVN has likely reached the point where adding new
>>>> layers of complexity to it significantly grow both the cost of
>>>> maintenance of GVN and the cost of doing a deep reworking of it.
>>>> I've watched folks try to do deep improvements to our GVN
>>>> infrastructure, and they have to spend more time trying to replicate
>>>> all of the current tweaks, hacks, and complexity in it than they do
>>>> improving the fundamentals of the pass.
>>> I think that this is not quite the right way to think about it, especially
>>> in the context of a nearly-complete rewrite.
>>
>> The context I'm coming from is that of the current state of GVN. I don't
>> know anything really about Danny's prototype other than the fact that some
>> others looked at cleaning it up and making it work, and couldn't get it
>> cleaned up and working well enough to justify the cost of adding. Maybe
>> others will have better luck.
>>
>>> First, we need to answer the question: Is this fundamentally something
>>> that we need GVN to do? If it is, then it should be evaluated as I said. If
>>> not, then we need to figure out where (if anywhere) is a better place.
>>
>> Sure, but that's totally orthogonal.
>>
>>> Ideally, we should really start to review Danny's GVN rewrite (even though
>>> it may, as he says, need a rewrite itself), so that we can at least all get
>>> on the same page regarding what the new version does, how it does it, and
>>> what functional improvements will be necessary prior to the start of
>>> more-extensive testing.
>>
>> If you have time for this, cool. But that doesn't actually impact my stance
>> here in either direction, and should really be a separate discussion.
>>
>>> Maintenance of the current GVN is obviously a concern, but if we block
>>> improvements to GVN now because GVN needs to be redone, we loose the ability
>>> of the current GVN to serve as the best possible reference implementation to
>>> which we should compare the new GVN. The lost productivity from failing to
>>> do that (failing to capture the work put into isolating performance
>>> problems, mitigating them, and building up the corresponding test cases, and
>>> then building on those improvements) is likely to be greater than the extra
>>> maintenance burden (IMHO).
>>
>> I think you are continuing to misinterpret what I'm saying.
>>
>> I do not want to block anything. I think that the current maintenance
>> challenges of GVN and the scope creep that has occurred in the set of
>> optimizations it performs will make the costs in your cost-benefit analysis
>> extremely high for changes which add new functionality to GVN. That doesn't
>> mean they couldn't be demonstrated as worth while as it remains a tradeoff.
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits