[LLVMdev] missing optimization for icmps in induction variables?
Nick Lewycky
nicholas at mxc.ca
Mon Jan 12 22:05:23 PST 2015
Sanjoy Das wrote:
> Hi Nick,
>
> I checked in something towards (1) yesterday -- http://reviews.llvm.org/D6748
Ah, I missed that. Catching up on email.
> I was under the impression that (2) is exactly the kind of predicate
> ScalarEvolution::isKnownPredicate is designed to solve (using
> isImpliedCondXXX or something like that). Is there a reason to prefer
> GVN over that?
If you think of the problem as a loop problem, then yes, 
ScalarEvolution::isKnownPredicate makes sense. If you think of it as a 
redundancy elimination problem, then GVN makes sense. Does this sort of 
problem occur outside of loops?
> On Wed, Jan 7, 2015 at 10:06 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>> Sanjoy Das wrote:
>>>
>>> Hi all,
>>>
>>> I'm trying to get llvm to optimize away the %cmp to true in
>>>
>>> define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
>>> entry:
>>>   %length = load i32* %length_ptr, !range !0
>>>   %len.sub.1 = sub i32 %length, 1
>>>   %upper = icmp slt i32 %init, %len.sub.1
>>>   br i1 %upper, label %loop, label %exit
>>>
>>> loop:
>>>   %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
>>>   %civ.inc = add i32 %civ, 1
>>>   %cmp = icmp slt i32 %civ.inc, %length
>>>   br i1 %cmp, label %latch, label %break
>>>
>>> latch:
>>>   store i32 0, i32* %array
>>>   %check = icmp slt i32 %civ.inc, %len.sub.1
>>>   br i1 %check, label %loop, label %break
>>>
>>> break:
>>>   ret i32 %civ.inc
>>>
>>> exit:
>>>   ret i32 42
>>> }
>>>
>>> !0 = !{i32 0, i32 2147483647}
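[For reference, a rough Python model of the IR above. The SSA names appear as comments; the store to %array has no effect on control flow, so it is dropped. This is only an illustrative sketch, not part of the original mail.]

```python
def foo(length, init):
    # Rough Python model of @foo above.
    M = 1 << 32

    def i32_add(a, b):
        # Wrapping i32 add, matching the plain (non-nsw) 'add' in the IR.
        return ((a + b + (1 << 31)) % M) - (1 << 31)

    assert 0 <= length <= 0x7FFFFFFE   # !range !0: [0, 2147483647)
    len_sub_1 = i32_add(length, -1)    # %len.sub.1
    if not (init < len_sub_1):         # %upper
        return 42                      # exit:
    civ = init                         # %civ
    while True:                        # loop:
        civ_inc = i32_add(civ, 1)      # %civ.inc
        if not (civ_inc < length):     # %cmp -- claimed always true
            return civ_inc             # break:
        if not (civ_inc < len_sub_1):  # latch: %check
            return civ_inc             # break:
        civ = civ_inc                  # back edge
```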
>>>
>>>
>>> One way to prove "%cmp == true" is in two steps:
>>>
>>> 1. notice that on both the entry and the backedge, %civ is known to
>>> be less than %len.sub.1, which is not i32_signed_max. This means
>>> %civ.inc is an "add nsw".
>>>
>>> 2. on both the entry and backedge, we know "%civ `slt` %len.sub.1".
>>> This implies "(%civ nsw+ 1) `slt` (%len.sub.1 nsw+ 1)" ==>
>>> "%civ.inc `slt` %length".
>>>
>>> Currently neither of these happens (i.e. even if I make
>>> transformation (1) manually, (2) does not kick in).
>>>
>>> Is the above reasoning correct? If it is, what is the right place to
>>> add this logic to? I'm thinking ScalarEvolution (so that this gets
>>> picked up by SimplifyIndVar), but maybe there is a more idiomatic
>>> place? The case above is a simplified form of my real workload, which
>>> involves smin and smax expressions; so the implementation has to be
>>> easily generalizable to such cases.
>>
>>
>> Before reading your two steps I was going to suggest jump threading. Jump
>> threading is where we optimize redundant tests across blocks that feed into
>> branches (block A tests property X and branches to block B which also tests
>> property X). However jump threading is powered by lazy value info, which I
>> don't think is suited for the sort of reasoning in your two steps.
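[The jump-threading pattern described above can be illustrated with a toy example (hypothetical names, written as source-level control flow rather than IR). Block A tests a property, and a successor re-tests the same property; jump threading forwards the edge past the redundant test.]

```python
def redundant_test(x):
    # Block A: tests "x > 0" and branches into block B ...
    if x > 0:
        y = x * 2             # block B
        # ... where the same property is tested again. Jump threading
        # sees that the A->B edge implies x > 0 and rewires that edge
        # straight to 'return y', deleting the re-test.
        if x > 0:             # always true when reached via A
            return y
        return -1             # dead on the A->B path
    return 0
```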
>>
>> One option is GVN. GVN does have x < y expressions but it doesn't try to
>> deduce nuw/nsw bits. It might be possible to add that, but it isn't
>> immediately obvious to me how. GVN also does path-sensitive expression
>> commoning.
>>
>> Nick
>