[Patch] GVN fold conditional-branch on-the-fly

Fri Sep 6 21:39:34 PDT 2013

On Fri, Sep 6, 2013 at 8:39 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Daniel:
>
>   Thank you for your speedy feedback.
>
>
> On 9/6/13 8:03 PM, Daniel Berlin wrote:
>>
>> Is there a reason to not just use a better GVN algorithm?
>> Why add a set of complicated conditions to fold code on the fly, when
>> almost any other GVN algorithm than the current one would properly
>> detect the cases you are talking about.
>>
>> In particular, you say: "GVN prove x > 0, making the else-clause dead.
>> If GVN were not able to prune
>> dead-edge <else-clause, JoinNode>, value "foo" (could be constant) would
>> not
>> be able propagated to S3."
>>
>> This is false.  It does not need to prune this edge, it just needs an
>> algorithm that doesn't consider the edge taken if the edge is provably
>> not taken.
>
> Yes, we can "ignore" the dead code.
> As I mentioned in my previous mail, GVN can proceed by ignoring dead stuff.
> \
> the problem is that it will end up with the code where def does not dominate
> its uses.

Errr, it can remove it at the end.  I'm not saying never remove it,
i'm saying your method, which is to patch in an incremental
DF/dominator tree update algorithm, seems like a very high maintenance
burden (these algorithms are non-trivial to understand) for what it
does.

It's only necessary because the current actual GVN algorithm is very
dumb, and can't see this stuff on it's own.

> Unless following passes can "ignore" dead code in the same way,
> the verifier and other passes will complains.

I'm saying it's not necessary to do on the fly.
>
>
>>
>> There are plenty of these algorithms, most even more powerful than the
>> test cases you want done, for example,
>> http://dl.acm.org/citation.cfm?id=512536
>
>
>
>
> But, I don't want to re-implement GVN from ground up:-).

Nobody does, but somebody has to.  Otherwise, everyone is happy to
just increase the maintenance burden, and leave it to someone else to
finally clean up the mess.  The problem what that person eventually
has to do becomes larger and larger as more stuff is added.

>   I'm not sure
> theoretically better algorithm can significantly speed up the performance.
> (e.g. I almost never find a SCC-GVN opportunities in real-world apps)
This is an interesting claim.   In the real world C++ apps I tested, I
found many opportunities.
GCC's GVN finds plenty of loop-based values to eliminate using even a
non-predicated SCC GVN.
A large number of these are memory operations that use loop based phi
nodes that end up equivalent.

> As far as I can tell,  what  GVN at llvm really need to improve is detecting
> mem-op redundancy. But this is not GVN's fault.
>
Actually, it is, in part.  GVN does not detect a lot of  opportunities
because it cannot detect the memory operations are the same.  This is
because it does not understand how to do things like value number
loads and stores for real, etc.   It knows how to pull values out of a
direct must-alias store/load dependence, which is not the same.  It
relies on having eliminated everything else in the way, and
canonicalized everything in the world up to that point.   It does not
really understand stores and loads at all.

>
>>GVN is getting more and more complicated,
> GVN is almost always complicated:-)

It's really not.  LLVM's GVN is not really a GVN.  It's really a very
complicated set of interleaved analysis and optimization that happens
to include some amount of value numbering.  It is actually becoming
more and more like GCC's old dominator optimization every day - a grab
bag of optimizations that all fall under some abstract notion of
value based redundancy elimination, but aren't really the same, and
are not unified in any real way.  This is okay if it was not a compile
time sink, but it's a huge one now, and adding stuff like this only
makes it worse.

>
>>and is already a compile time sink.
> Alias analysis bear the blame, It remember alias analysis account for 1/3+
> of GVN compile-time.

This is directly because of the algorithm in GVN and how it works.
Every pass, and every pointer related operation it removes, it
invalidates memdep caches for.  On the next call for a later pointer,
the getPointerDependence and getNonLocalDependency calls now have to
do all the same work again.     There is the added fact that
getNonLocalDependence's cache does not seem to work quite right (call
it a million times for the same non-local pointer, and see how slow it
is :P)

>
>
>> Is this really the best way to add this functionality?
> The "best" way I can figure out considering complexity and common-case in
> practices:-)
>
> What is your suggestion.? Thanks
>

My suggestion would be to sit down and design something that actually
covers the case we want, instead of just adding more stuff to GVN as
it exists now, trying to get one oddball case at a time.
This is going to have to be done at some point, and while I understand
you don't want to have to be the one to do it, and I certainly can't
make you, I can tell you that doing something like this is making the
eventual job a bunch harder.