[Patch] GVN fold conditional-branch on-the-fly

Sat Sep 7 16:02:10 PDT 2013

On Sat, Sep 7, 2013 at 11:27 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> On 9/7/13 9:14 AM, Daniel Berlin wrote:
>>>>>>
>>>>>> GVN is getting more and more complicated,
>>>>>
>>>>> GVN is almost always complicated:-)
>>>>
>>>> It's really not.  LLVM's GVN is not really a GVN.  It's really a very
>>>> complicated set of interleaved analysis and optimization that happens
>>>> to include some amount of value numbering.
>>>
>>> LLVM GVN is an old-style GVN + some PRE.  But for the PRE part, I don't
>>> think LLVM GVN is complicated at all.
>>
>> No, it's not. It is a very simple PRE that tries to catch a few cases.
>
> I disagree.  The PRE in LLVM GVN has two parts: one tackles loads, the other
> tackles simple expressions in a diamond-shaped CFG. I believe the former
> catches *all* load PREs (depending on the alias analysis result).

It absolutely positively does not.
It catches one form: partially available, fully anticipatable
expressions (and not even all of these, I believe). It certainly does
not catch *any* form of partially available, partially anticipatable
expressions, or even fully available, partially anticipatable
expressions.
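
To make the one form that is caught concrete, here is a minimal C sketch of
a partially available, fully anticipatable load. The function names
(before_pre, after_pre) are hypothetical, and the source-level rewrite only
mimics by hand what a load PRE would do on IR; it is not a description of
what LLVM's GVN actually emits.

```c
/* Before: *p is loaded on the taken branch and again after the join,
 * so the second load is redundant only when cond != 0. */
int before_pre(const int *p, int cond) {
    int acc = 0;
    if (cond)
        acc = *p * 2;   /* load #1: makes *p available on this path only */
    return acc + *p;    /* load #2: partially redundant */
}

/* After: one load is inserted on the path where *p was not available,
 * so the load at the join can be replaced by the merged value v.
 * Every path still performs exactly one load. */
int after_pre(const int *p, int cond) {
    int acc = 0;
    int v;
    if (cond) {
        v = *p;         /* original load #1 */
        acc = v * 2;
    } else {
        v = *p;         /* inserted load on the unavailable path */
    }
    return acc + v;     /* load #2 is gone; v acts like a phi */
}
```

The insertion is safe here because the load is executed on every path
leaving the insertion point. The partially anticipatable cases do not have
that property, which is why they need more machinery.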

While not as common, based on experience with GVN-PRE, and knowing
its limitations (which include punting on multiple-successor cases,
only inserting one load, etc.), I'd say it misses ~20% of the cases it
could eliminate by doing this (the PA-PA case is rare, I'd say 2-3%,
though it did show up in some performance-sensitive apps. The FA-PA
case is significantly less rare. The other random restrictions hurt
slightly too).

> If you disable load PRE, you will certainly see a noticeable
> difference in performance.

While true, this does not mean it catches all load PRE.

>
>>>
>>>> It is actually becoming
>>>> more and more like GCC's old dominator optimization every day - a grab
>>>> bag of optimizations that all fall under some abstract notion of
>>>> value based redundancy elimination, but aren't really the same, and
>>>> are not unified in any real way.  This would be okay if it were not a compile
>>>> time sink, but it's a huge one now, and adding stuff like this only
>>>> makes it worse.
>>>>
>>>>
>>>>>> and is already a compile time sink.
>>>>>
>>>>> Alias analysis bears the blame. I remember alias analysis accounting for
>>>>> 1/3+ of GVN compile time.
>>>>
>>>> This is directly because of the algorithm in GVN and how it works.
>>>
>>> I don't think so.  I believe the culprit is the lack of a sparse way to
>>> represent mem-dep -- when GVN tries to figure out the set of mem-ops a
>>> particular load/store depends on, mem-dep searches all over the place.
>>
>> I disagree. As I said, the reason GVN's memory analysis is slow is
>> simply because of how it uses memdep, which is unnecessarily slow and
>> expensive.
>>
>> It's true that a sparse representation would fit better with the
>> current usage pattern, but as I mentioned in a later message, the
>> current usage pattern is probably not actually a good thing.
>>
>> As for a sparse representation, you can't sparsely and accurately
>> represent mem-dep at the same time in a way that satisfies a lot of
>> clients.
>
>
>
>>   GCC learned this the hard way, and we tried for something
>> like 8 years until we settled on what is there now, where clients are
>> expected to use the sparse representation to get the nearest
>> "possible" memory dependence, and then further disambiguate.
>>
>> Computing this representation is actually fairly expensive to do well,
>> and without a client past GVN, I'm not sure it would be worth it.
>>
>>
>>
> This topic seems to deviate from what we are focusing on. But I'd have to say
> I disagree. That LLVM does not like mem-ssa, and GCC does not do mem-ssa
> very well, does not mean other compilers cannot handle it very well.

GCC did mem-ssa very well after many years (it had a rough start), but
in the end, it still was not precise enough.

> At least, Open64's mem-ssa is good enough for my needs, although it suffers
> from poor engineering and has a few defects (like not allowing
> overlap of virtual variable live ranges).

I've used Open64's, and it's very much like what GCC's is today.

As you said, we are getting far afield, so we can simply agree to disagree :)