[PATCH] D28129: NewGVN: Sort Dominator Tree in RPO order, and use that for generating order.

Daniel Berlin via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 28 20:09:28 PST 2016


On Wed, Dec 28, 2016 at 7:22 PM, Friedman, Eli <efriedma at codeaurora.org>
wrote:

> On 12/28/2016 2:33 PM, Daniel Berlin wrote:
>
>
>
> On Wed, Dec 28, 2016 at 1:18 PM, Friedman, Eli <efriedma at codeaurora.org>
> wrote:
>
>> On 12/28/2016 1:03 PM, Daniel Berlin via llvm-commits wrote:
>>
>>
>>
>> On Wed, Dec 28, 2016 at 7:04 AM, Davide Italiano via Phabricator <
>> reviews at reviews.llvm.org> wrote:
>>
>>> davide accepted this revision.
>>> davide added a comment.
>>> This revision is now accepted and ready to land.
>>>
>>> Sorry for the slow response, I'm out('ish) of the office these days. I
>>> took a close look at your patch.
>>>
>>
>> No worries.
>>
>>
>>> I happen to be lucky enough to hit a case in the wild where this already
>>> matters.  The number of iteration goes down from hundreds to ~10, which
>>> makes compile time/me happier.
>>>
>>
>> yay.
>>
>> The current code, excepting super-weird cases, should operate in O(d+3)
>> iterations, where d is the loop connectedness of the SSA graph (not the
>> CFG), which is the number of backedges in any path. This will change when
>> we move to equality propagation, but for now, ...
>> We could  calculate this number and see if we are screwing up :)
>>
>> For most programs, the loop connectedness of the SSA graph is the same or
>> less than the CFG.
>>
>> However,  IIRC, we allow  dependent phis in the same block (this is not
>> strictly SSA, since all phi nodes are supposed to be evaluated
>> simultaneously).
>>
>>
>> I'm not sure what you're trying to say here?  PHI nodes for a given basic
>> block are evaluated simultaneously. From LangRef: "For the purposes of
>> the SSA form, the use of each incoming value is deemed to occur on the edge
>> from the corresponding predecessor block to the current block (but after
>> any definition of an ‘invoke‘ instruction’s return value on the same
>> edge)."
>>
>>
> I'm saying we've had mailing list arguments about this, about whether
> there is any ordering among phi nodes in a given block.  The part you quote
> from the langref does not actually definitively answer that (again, there
> is no argument in theory.  In the abstract, the answer is "there is no
> ordering, it's undefined to have phis depend in the same block depend on
> each other")
>
> Given
> b = phi(d, e)
> a = phi(b, c)
>
> Saying "is deemed to occur on the edge of the corresponding predecessor
> block" does not help.
>
>
> Consider the following function:
>
> void f(int a, int b, int g(int, int)) {
>   while (g(a, b)) { int x = a; a = b; b = x; }
> }
>
> mem2reg produces this:
>
> define void @f(i32 %a, i32 %b, i32 (i32, i32)* %g) #0 {
> entry:
>   br label %while.cond
>
> while.cond:                                       ; preds = %while.body,
> %entry
>   %a.addr.0 = phi i32 [ %a, %entry ], [ %b.addr.0, %while.body ]
>   %b.addr.0 = phi i32 [ %b, %entry ], [ %a.addr.0, %while.body ]
>   %call = call i32 %g(i32 %a.addr.0, i32 %b.addr.0)
>   %tobool = icmp ne i32 %call, 0
>   br i1 %tobool, label %while.body, label %while.end
>
> while.body:                                       ; preds = %while.cond
>   br label %while.cond
>
> while.end:                                        ; preds = %while.cond
>   ret void
> }
>
> A "phi" works in the only way which allows this IR to match the semantics
> of the C code.
>

I'm not sure i believe this, but i actually don't care enough to argue
about it further :)
That said, i'm curious:

What are the expected semantics of the second case i presented?


>
> If you think LangRef isn't clear, suggestions are welcome.
>
I would be explicit that phi nodes may depend on each other, and what the
expected evaluation order actually is (if it's "as-they-appear-in-IR", say
that.)
It looks something like "dependencies are allowed, any order is allowed
despites dependencies".


>
> Popping back up, regardless of resolution, this causes the issue i
> mentioned above - it may require more iterations to resolve because of the
> second case passing verification.  If we really want phi nodes to be
> executable , and want it to take the minimum number of iterations to
> converge NewGVN, we need to process aa before t1.
>
> Otherwise, we will process t1, get some value, *then* process aa, and
> immediately mark t1 as needing to be reprocessed since it is a use of aa.
> We effectively waste an iteration because all of t1's uses have are going
> to have the wrong value.
>
>
> It looks like NewGVN creates one less congruence class if you process them
> in the "right" order.
>

Well, no. It's not about creating the congruence class, it's about making
it a reverse post-order evaluation .  In your example above, there is no
single RPO order due to the two cycles.
In my example, any valid  RPO order of the SSA graph must visit aa before
t1.
This is *normally* taken care of by evaluating instructions in block order,
and ordering the CFG in RPO.   But in the case i gave, it is not enough.
You would have to explicitly sort the phi nodes separately, since we allow
both orderings of the phi nodes in LLVM,despite only one being RPO (and
only one having defs dominate uses).



> I'm not sure there's any way to usefully generalize that heuristic,
> though; you're only saving time based on discovering the cycle one step
> faster.
>
> I'm not sure what you are trying to say here.
You are saving time by not performing iterations that are unnecessary.  The
generalized heuristic is exactly "perform this problem by evaluating the
SSA graph/CFG in RPO order".  This is provable, it's the notion of a
"rapid" problem.

If you placed my aa/t1 example in the first block of a 10000 block
function, you will process 10000 blocks (1 iteration) uselessly.
If you sort it into a valid RPO order of the SSA graph, you will not.

This generalizes to *any* "rapid" problem.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161228/adc48d44/attachment-0001.html>


More information about the llvm-commits mailing list