[llvm-commits] [patch] Use the new edge dominance functions in GVN

Tue Aug 14 01:14:47 PDT 2012

On Mon, Aug 13, 2012 at 11:38 PM, Daniel Berlin <dannyb at google.com> wrote:
> On Tue, Aug 14, 2012 at 2:14 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Mon, Aug 13, 2012 at 8:05 PM, Rafael Espíndola
>> <rafael.espindola at gmail.com> wrote:
>>>> In any case,  an updated patch is attached. I made it slightly less
>>>> aggressive but faster by keeping isOnlyReachableViaThisEdge as a fast
>>>> dominance approximation. I understand that it is taking a technical
>>>> debt, but I think I should be able to repay it :-)
>>>
>>> Looking at the code of the new GVN pass I see that it implements a
>>> similar feature by doing
>>>
>>>       AU.addRequiredID(BreakCriticalEdgesID)
>>>
>>> While this is a possibility, applying this to the current GVN
>>> implementation causes some problems. First, we have to mark
>>> BreakCriticalEdges as preserving AliasAnalysis and
>>> MemoryDependenceAnalysis, which I guess is fine.
>>>
>>> The addition of extra basic blocks seems to prevent some optimizations
>>> and causes two tests to fail:
>>>
>>>     LLVM :: CodeGen/Thumb2/2009-12-01-LoopIVUsers.ll
>>>     LLVM :: Transforms/GVN/2011-06-01-NonLocalMemdepMiscompile.l
>>>
>>> I have not debugged exactly why. Finally, it seems that when looking
>>> at the time of a full opt -O2 run, at least with the current
>>> algorithm, the impact of splitting the critical edge is larger than
>>> the impact of directly reasoning about a use being dependent on an
>>> edge.
>>>
>>> In any case. I have attached both the previous patch and a variation
>>> that uses critical edge splitting instead. If there is a preference
>>> for the edge splitting one I can try to figure out the two remaining
>>> test failures.
>>
>> We don't want to add additional users of the BreakCriticalEdges pass;
>> it's bad for performance.
>
> In what exact way?
>
> For example, current LLVM GVN notes critical edges blocking PRE, then
> splits them, then  *reiterates all of its PRE* just to deal with a
> critical edge.
>
> This causes N^2 iterations (*just in GVN*) in the worst case.  GVN's
> local PRE is certainly more expensive than breaking critical edges.
>
> Algorithms that detect and try to account for critical edges are
> usually complicated and expensive.
>
> GCC's edge-based PRE that would split only the critical edges that
> were necessary (by detecting which placements were blocked by them)
> turned out to be significantly more expensive than simply splitting
> all critical edges.
>
> Additionally, to my recollection, in the history of GCC, critical edge
> splitting, even on huge functions, has never shown up as any real
> performance hit (IE > 0.1% of the total profile).
>
> Is there some mailing list discussion or performance data you can
> point me at that shows why it is such a problem?

I can't really find any extended discussion of it... it's sort of
difficult to find because nothing in the tree has used that pass for a
very long time.  The old condprop pass was the last user removed, and
there were other reasons to get rid of it.  Basically, I'm just
automatically suspicious of any pass which globally transforms IR in a
way which will be reversed by passes like SimplifyCFG; there's a
performance cost which is partially attached to passes other than the
pass which actually does the transform, and various IR utility
functions won't be aware of the extra invariants.  One of the few
places we do perform such transforms is the loop transformations, and
passes accidentally breaking the invariants has caused many bugs.

That said, it's possible that it's worthwhile in this case; it sounds
like your experience shows the cost here is minimal (which is
believable, given how expensive GVN is generally).

-Eli