[llvm] r303387 - [InstCombine] add more tests for xor-of-icmps; NFC

Fri May 19 16:27:52 PDT 2017

On Fri, May 19, 2017 at 4:11 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> [adding llvm-dev for wider audience]
>
> On Fri, May 19, 2017 at 12:28 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>
>>
>>
>> On Fri, May 19, 2017 at 11:00 AM, Davide Italiano <davide at freebsd.org>
>> wrote:
>>>
>>> On Fri, May 19, 2017 at 10:00 AM, Sanjay Patel <spatel at rotateright.com>
>>> wrote:
>>> > Is "VRP" (value range propagation) in LLVM-speak "CVP" (correlated
>>> > value
>>> > propagation)?
>>> >
>>> > If so, we have this comment regarding compares:
>>> >   // As a policy choice, we choose not to waste compile time on
>>> > anything
>>> > where
>>> >   // the comparison is testing local values.
>>> >
>>> > Or this for div/rem:
>>> >   // As a policy choice, we choose not
>>> >   // to waste compile time on anything where the operands are local
>>> > defs.
>>> >
>>> > "Local" means in the same basic block from what I can tell by the code
>>> > here.
>>> >
>>> > I think this means that this pass explicitly defers handling simple
>>> > cases
>>> > like:
>>> > https://reviews.llvm.org/D33342
>>> > ...to another pass, and currently that pass is InstCombine (because the
>>> > patterns really can't be any simpler than the tests in that patch,
>>> > right?).
>>> >
>>> > I think that improving compile-time should not preclude improving
>>> > InstCombine. We should do both.
>>> >
>>>
>>> Just thoughts, feel free to ignore them.
>>> I didn't measure the impact in LLVM, but I'm sure you can do VRP
>>> relatively fast (GCC does that both interprocedurally and
>>> intraprocedurally and their pass is much faster in some cases than
>>> LLVM's), i.e. O(N) in practice, so, maybe we could re-evaluate this
>>> policy?
>>
>>
>> Yeah, that's kind of silly, we can do much better.
>>
>>>
>>> I think replacing a pass in LLVM is not trivial (I'm learning it the
>>> hard way with NewGVN). OTOH, I'm still not entirely convinced
>>> `InstCombine` should handle these cases given it's already a
>>> compile-time sink?
>>
>>
>
> Correct me where I'm going wrong.
>
> 1. We got a bug report with a perf regression that was root caused to this
> IR:
>
>   define i1 @xor_of_icmps(i64 %a) {
>     %b = icmp sgt i64 %a, 0
>     %c = icmp eq i64 %a, 1
>     %r = xor i1 %b, %c
>     ret i1 %r
>   }
>
> Because this was a regression, we know that the optimal/canonical IR for
> this program is:
>
>   define i1 @xor_of_icmps_canonical(i64 %a) {
>     %r = icmp sgt i64 %a, 1
>      ret i1 %r
>   }
>
> 2. I saw this as a bug in InstCombine because I think InstCombine's job is
> to canonicalize simple IR sequences.
>
> 3. I assumed that matching an (xor icmp, icmp) pattern can be done
> efficiently in InstCombine. In fact, I knew that we already did that match,
> so there is zero incremental cost to find this pattern in InstCombine.
>
> 4. I knew that we already handle related and-of-cmps and or-of-cmps patterns
> in InstSimplify/InstCombine.
>
> 5. Based on that, I have proposed a patch that mostly uses existing logic in
> those passes to solve this bug in the most general way I am aware of. It
> makes 2 calls to InstSimplify to find a potential fold before
> morphing/creating new instructions that may get folded by other existing
> logic.
>
>
> Questions:
> 1. Was/is InstCombine intended to handle this transform?
>
> 2. If there is a measurable compile-time cost for this patch, then there
> must be a structural problem with InstSimplify, InstCombine, and/or the pass
> pipeline?
>
> 3. Is there another pass that is more suitable to solve this bug than
> InstCombine? CVP was mentioned as a candidate for enhancement. Any others?
>
> 4. If there is a more suitable pass, why is it more suitable?
>

I can provide an example + analysis: https://reviews.llvm.org/D33172#754563

> 5. If we add to that pass or create a new pass that specifically targets
> logic-of-cmps, can we pull that code out of InstSimplify and InstCombine?
>
> 6. If compile-time has become unbearable, shouldn't we be moving
> optimization logic and/or passes from -O1 to -O2 to -O3, etc? My
> understanding as a compiler consumer has always been that I'll get better
> performing code the higher up the -O# ladder I climb, but it will cost
> increasingly more time to compile. Is that not the model for LLVM?
>

The real problem of instcombine IMHO is that it suffer from an
additive effect. i.e. matchers over matchers are added over time which
per-se don't slow down the compiler in any measurable manner. But all
of them sum up result in a significant compiler slowdown, and we
notice only when it's too late (i.e. when the code has been released).
We've all under the assumption that the time in InstCombine was spent
inside auxiliary analysis computing the bitwise domain
(ComputeKnownBits et similia). Turns out we're not able to find a case
where caching speeds up things.

>
>> People seem intent on adding lots and lots of stuff to instcombine because
>> it's easy.
>> This makes it harder to ever replace, while simultaneously making it
>> slower.
>> It's not hard to see where that path ends up.
>> InstCombine does a lot of random crap that isn't even combining or graph
>> rewrite, too, because well second and third order effects are hard.
>
>
> Can you provide some examples? I honestly don't know where I should draw the
> line. If I've crossed the line, I'd like to fix it.
>

I don't have the code handy, but something that came up recently (and
I was a little surprised) is that instcombine checks for irreducible
PHI cycles, see the functions/comments in InstCombinePHI.

>> If we want an optimal graph rewriter, we should actually just build one,
>> with sane rewrite rules, verification that it fixpoints, and some sense of
>> good ordering, etc,  instead of  slowly every adding possible pattern match
>> to InstCombine.
>
>
> Can you recommend papers or projects to look at to get a better
> understanding of an optimal graph rewriter? LLVM is the only compiler I've
> worked on from the inside, so I don't really know what else is possible /
> feasible.
>
>

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare