[LLVMdev] missing optimization for icmps in induction variables?
Sanjoy Das
sanjoy at playingwithpointers.com
Wed Jan 14 15:23:10 PST 2015
> If you think of the problem as a loop problem, then yes,
> ScalarEvolution::isKnownPredicate makes sense. If you think of it as a
> redundancy elimination problem, then GVN makes sense. Does this sort of
> problem occur outside of loops?
I've only observed them inside loops, but that's only because I've
been looking at loops. :) At least in theory, if LLVM's GVN can be
made to work in a path-sensitive manner, then there is no need for
such tricks in SCEV.
I've only glanced cursorily at GVN.cpp, but given that we want LLVM
to conclude that '(x nsw+ 5) SLT L' implies 'x SLT (L nsw- 5)', will
the value numbering for icmp expressions have to become richer? The
operands to the icmps themselves won't value-number to the same value,
because they're not the same; but maybe there is some canonical form
we want to reduce the icmp expressions to before numbering them?
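For what it's worth, the nsw flags are exactly what make that
implication sound. A small standalone sanity check (a Python sketch,
not LLVM code; the wrap32 helper is this sketch's own way of modeling
wrapping i32 arithmetic) shows that with a plain wrapping add the
implication fails, while under the no-signed-wrap assumption the two
comparisons agree:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def wrap32(v):
    """Reduce v into the two's-complement i32 range (models a plain
    LLVM 'add'/'sub' with no nsw flag)."""
    return (v + 2**31) % 2**32 - 2**31

# Without nsw, the add may wrap and the implication fails:
x, L = INT32_MAX - 2, 0
assert wrap32(x + 5) < L           # '(x + 5) slt L' holds after wrapping...
assert not (x < wrap32(L - 5))     # ...but 'x slt (L - 5)' does not.

# With nsw, neither operation wraps, so ordinary integer arithmetic
# applies and the two comparisons agree for any in-range x and L:
for x in (INT32_MIN + 5, -1, 0, 1, INT32_MAX - 5):
    for L in (INT32_MIN + 5, -1, 0, 1, INT32_MAX):
        if INT32_MIN <= x + 5 <= INT32_MAX and INT32_MIN <= L - 5 <= INT32_MAX:
            assert (x + 5 < L) == (x < L - 5)
```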
-- Sanjoy
>
>
>> On Wed, Jan 7, 2015 at 10:06 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>
>>> Sanjoy Das wrote:
>>>>
>>>>
>>>> Hi all,
>>>>
>>>> I'm trying to get llvm to optimize away the %cmp to true in
>>>>
>>>> define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
>>>> entry:
>>>>   %length = load i32* %length_ptr, !range !0
>>>>   %len.sub.1 = sub i32 %length, 1
>>>>   %upper = icmp slt i32 %init, %len.sub.1
>>>>   br i1 %upper, label %loop, label %exit
>>>>
>>>> loop:
>>>>   %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
>>>>   %civ.inc = add i32 %civ, 1
>>>>   %cmp = icmp slt i32 %civ.inc, %length
>>>>   br i1 %cmp, label %latch, label %break
>>>>
>>>> latch:
>>>>   store i32 0, i32* %array
>>>>   %check = icmp slt i32 %civ.inc, %len.sub.1
>>>>   br i1 %check, label %loop, label %break
>>>>
>>>> break:
>>>>   ret i32 %civ.inc
>>>>
>>>> exit:
>>>>   ret i32 42
>>>> }
>>>>
>>>> !0 = !{i32 0, i32 2147483647}
>>>>
>>>>
>>>> One way to prove "%cmp == true" is in two steps:
>>>>
>>>> 1. notice that both on the backedge and on entry, %civ is known to
>>>> be less than %len.sub.1, which is not i32_signed_max. This means
>>>> %civ.inc is an "add nsw".
>>>>
>>>> 2. on both the entry and backedge, we know "%civ `slt` %len.sub.1".
>>>> This implies "(%civ nsw+ 1) `slt` (%len.sub.1 nsw+ 1)" ==>
>>>> "%civ.inc `slt` %len".
>>>>
>>>> Currently neither of these happens (i.e. even if I make transformation
>>>> (1) manually, (2) does not kick in).
>>>>
>>>> Is the above reasoning correct? If it is, what is the right place to
>>>> add this logic? I'm thinking ScalarEvolution (so that it gets
>>>> picked up by SimplifyIndVar), but maybe there is a more idiomatic
>>>> place? The case above is a simplified form of my real workload, which
>>>> involves smin and smax expressions, so the implementation has to
>>>> generalize easily to such cases.
>>>
>>>
>>>
>>> Before reading your two steps I was going to suggest jump threading.
>>> Jump threading is where we optimize redundant tests across blocks that
>>> feed into branches (block A tests property X and branches to block B,
>>> which also tests property X). However, jump threading is powered by
>>> lazy value info, which I don't think is suited to the sort of
>>> reasoning in your two steps.
>>>
>>> One option is GVN. GVN does have x < y expressions, but it doesn't try
>>> to deduce nuw/nsw bits. It might be possible to add that, but it isn't
>>> immediately obvious to me how. GVN also does path-sensitive expression
>>> commoning.
>>>
>>> Nick
>>
>>
>