[llvm-dev] Branch is not optimized because of right shift
Stefanos Baziotis via llvm-dev
llvm-dev at lists.llvm.org
Sun Jul 12 02:45:12 PDT 2020
Hi Florian,
Great, thanks for your effort!
- Stefanos
Στις Σάβ, 11 Ιουλ 2020 στις 11:13 π.μ., ο/η Florian Hahn <
florian_hahn at apple.com> έγραψε:
> Hi,
>
> I think this should be optimized as expected on latest trunk:
> https://godbolt.org/z/vW7K7f
>
> Cheers
>
> On Apr 6, 2020, at 16:53, Stefanos Baziotis <stefanos.baziotis at gmail.com>
> wrote:
>
> Hi,
>
> > But the value can't be [0,7] due to the earlier branch. When I said it
> was guaranteed to wrap, I only meant for the range of values that were
> possible after the first branch.
> Indeed, for some reason I remembered that the left check is >= 0 when I
> wrote this.
>
> Thanks a lot for the breakdown in CorrelatedValuePropagation. I had no
> idea about this pass.
>
> > It was able to see that the input to add was in the range [8,14) in the
> call to LVI->getConstantRange in processBinOp.
>
> I could not reproduce that. I get:
> BinOp: %4 = add nsw i32 %2, -8
> LRange: [0,-2147483648)
> RRange: [-8,-7)
>
> > processCmp skips calling LVI for the select's icmp because the input
> isn't in the same basic block and isn't a phi.
> Do you maybe mean: _is_ in the same BB (and isn't a PHI).
>
> By the way, I don't get the reasoning in the comment above:
> // As a policy choice, we choose not to waste compile time on anything
> where
> // the comparison is testing local values.
>
> > I think this is because the code executed for getConstant doesn't handle
> icmp even when it can prove the input is in a constant range.
>
> Maybe we ended up on the same thing. I'm not sure I followed that
> correctly but getValueFromICmpCondition() should have been able to handle
> that.
>
> Best,
> Stefanos
>
>
> Στις Δευ, 6 Απρ 2020 στις 5:22 π.μ., ο/η Craig Topper <
> craig.topper at gmail.com> έγραψε:
>
>> On Sun, Apr 5, 2020 at 6:34 PM Stefanos Baziotis <
>> stefanos.baziotis at gmail.com> wrote:
>>
>>> Hi Craig,
>>>
>>> > Adding a nuw to the add -8 is incorrect.
>>> Yeah, I didn't mean to say it was correct. It was just an observation
>>> that with nuw the optimization was happened and I asked if someone thought
>>> it was somehow connected.
>>>
>>> > From the perspective of the unsigned math, -8 is treated a very large
>>> positive number. The input to the add is [8,13) and adding a large positive
>>> number to it wraps around past 0. So that is guaranteed unsigned wrap
>>> I understand yes, but I don't think it is guaranteed. Unless I miss
>>> something, for values in [0, 7] it won't wrap. But past that and up to (and
>>> including in the original source code) 13, it will wrap yes.
>>>
>>
>> But the value can't be [0,7] due to the earlier branch. When I said it
>> was guaranteed to wrap, I only meant for the range of values that were
>> possible after the first branch.
>>
>> In theory, the CorrelatedValuePropagation pass should have been able to
>> optimize the select. It was able to see that the input to add was in the
>> range [8,14) in the call to LVI->getConstantRange in processBinOp.
>> processCmp skips calling LVI for the select's icmp because the input isn't
>> in the same basic block and isn't a phi. And the call to LVI->getConstant
>> for the select in processSelect didn't return a constant. I think this is
>> because the code executed for getConstant doesn't handle icmp even when it
>> can prove the input is in a constant range. I tried removing the local
>> value check in processCmp so that getPredicateAt would called, but that
>> didn't help either.
>>
>>
>>> Best,
>>> - Stefanos
>>>
>>>
>>> Στις Δευ, 6 Απρ 2020 στις 3:10 π.μ., ο/η Craig Topper <
>>> craig.topper at gmail.com> έγραψε:
>>>
>>>> Adding a nuw to the add -8 is incorrect. From the perspective of the
>>>> unsigned math, -8 is treated a very large positive number. The input to the
>>>> add is [8,13) and adding a large positive number to it wraps around past 0.
>>>> So that is guaranteed unsigned wrap. On the other hand, a sub nuw 8 would
>>>> be correct.
>>>>
>>>> ~Craig
>>>>
>>>>
>>>> On Sun, Apr 5, 2020 at 3:27 PM Stefanos Baziotis via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Thanks, I didn't know that! Indeed, it's instruction combine that does
>>>>> the job.
>>>>>
>>>>> - Stefanos
>>>>>
>>>>> Στις Δευ, 6 Απρ 2020 στις 12:38 π.μ., ο/η Florian Hahn <
>>>>> florian_hahn at apple.com> έγραψε:
>>>>>
>>>>>>
>>>>>>
>>>>>> > On Apr 5, 2020, at 22:20, Stefanos Baziotis <
>>>>>> stefanos.baziotis at gmail.com> wrote:
>>>>>> >
>>>>>> > > Any idea about how the compiler could remove the lshr and use a
>>>>>> add -16?
>>>>>> > Actually, I just figured that doing this test is like solving this:
>>>>>> >
>>>>>> > 8 <= x/2 <= 13
>>>>>> > 16 <= x <= 26
>>>>>> > 0 <= x - 16 <= 10 => 0 <= x < 11
>>>>>> > The left part is know since it's unsigned
>>>>>> > The right part could be done x <= 11 => x < 12 because it's
>>>>>> actually an integer division.
>>>>>> > Wow... I would be really happy to know what pass does that.
>>>>>>
>>>>>> I’d guess a combination of instcombine rules together with some other
>>>>>> transforms. You could probably use `-print-after-all` (`clang -mllvm
>>>>>> -print-after-all` if you are using clang) to track down the relevant
>>>>>> passes/steps.
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200712/810ce03d/attachment.html>
More information about the llvm-dev
mailing list