[PATCH] Extend EarlyCSE to handle basic cases from JumpThreading and CVP

Tue May 19 10:13:09 PDT 2015

On Tue, May 19, 2015 at 9:37 AM, Philip Reames
<listmail at philipreames.com> wrote:
> On 05/19/2015 08:50 AM, Daniel Berlin wrote:
>>>
>>> ================
>>> Comment at: lib/Transforms/Scalar/EarlyCSE.cpp:471
>>> @@ +470,3 @@
>>> +  if (BasicBlock *Pred = BB->getSinglePredecessor())
>>> +    // TODO: handle switch with case value as well
>>> +    if (auto *BI = dyn_cast<BranchInst>(Pred->getTerminator()))
>>> ----------------
>>> Even GVN does not handle switch with case value, even when it can
>>> determine the case value as a constant. It is not as simple as one would
>>> thing because it's not about dominance, but control dependence.
>>>
>>>
>>> (and DT->dominates will fail the single edge assertion in those cases,
>>> even though it's a reasonable thing to test)
>>>
>>>
>>
>> Just to explain this:
>>
>> getSinglePredecessor returns true if all edges lead to the same block.
>> Thus, a switch with multiple edges to the same block will return a
>> block to getSinglePredecessor.
>
> Er, I believe this is incorrect.  The behavior you're describing is true of
> getUniquePredecessor, but not getSinglePredecessor.  This is why I'm
> explicitly using the getSinglePredecessor version.

Yes, i read the logic of getSinglePredecessor wrong.

>>
>>
>> Dominates(Edge, anything) asserts isSingleEdge(Edge).
>> isSingleEdge asserts that the edge is singular in the sense that
>> "multiple edges do not exist from block EdgeStart to block EdgeEnd".
>> Thus, getSinglePredecessor will happily give you blocks from switch
>> statements that fail this test (any switch with case statements
>> leading to the same block), such that you can't test dominance on
>> them.
>
> For getUniquePredecessor, I believe the problem you describe is relevant for
> branches too.  Nothing says a conditional branch can't have both it's
> destinations going to the same block.  The optimizer would likely fold such
> thing away quickly, but it is legal IR.

Sure.

>>
>>
>> GVN goes to the trouble of tracking how many edges lead to a target
>> block, and only propagates equalities if a single edge reaches a given
>> target block to avoid this assert.
>>
>> This is true *even if* it could prove that only one of the edges to
>> the target block is reachable (because it would still hit the
>> isSingleEdge assert).
>
> If it can prove this, it should be breaking the other edges.  Once that's
> done, the assert holds.  I don't see the problem?

It doesn't break any edges, because it would break other things it
does (and invalidate some analyses that don't know how to update in
the face of this,)

I generally don't disagree. I would actually rather just build control
dependence regions and say "this is valid in that region", instead of
trying to use dominance.
C'est la vie.

>>
>>
>> At some point, that assert should simply be removed (or such was the
>> conclusion of the folks who added it when i asked them, since the
>> justification given for it makes no sense),
>
> I'm not sure I actually agree here.  I don't actively disagree either, but
> I've found the assertion quite helpful in finding what would have otherwise
> been a subtle performance problem in one of my other changes.  I hadn't
> *intended* to visit anything other than a single edge and had managed to
> phrase the query in a way where I did.

So, the assertion claims the clients can do it more performantly.
But they actually can't if that is what they want to do.  They also
would have to reproduce the critical edge logic.
Additionally, because they all assert, there is no way for the client
to do anything :)

IE It can make up a different query that doesn't test the same thing
in some cases, but if what you need to test is whether the edge
dominates a use, etc, there is no faster way to accomplish it than
what it does, and there is in fact, no way to accomplish it other than
reproducing the exact logic already in there.

Adding asserts for "this may be a performance problem for someone at
some point in time" is probably not a great idea, and we generally
don't do it elsewhere.
We generally add limits, bail out, and return conservative answers if
we have to.

(and we don't even have to do that here).

In any case, if you want to leave that TODO, sounds fine.