[llvm-dev] Branch is not optimized because of right shift

Sun Apr 5 14:21:36 PDT 2020

Typo: The last 3 x's are x-16 of course.

Στις Δευ, 6 Απρ 2020 στις 12:20 π.μ., ο/η Stefanos Baziotis <
stefanos.baziotis at gmail.com> έγραψε:

> > Any idea about how the compiler could remove the lshr and use a add -16?
> Actually, I just figured that doing this test is like solving this:
>
> 8 <= x/2 <= 13
> 16 <= x <= 26
> 0 <= x - 16 <= 10 => 0 <= x < 11
> The left part is know since it's unsigned
> The right part could be done x <= 11 => x < 12 because it's actually an
> integer division.
> Wow... I would be really happy to know what pass does that.
>
> Best,
> - Stefanos
>
> Στις Δευ, 6 Απρ 2020 στις 12:00 π.μ., ο/η Stefanos Baziotis <
> stefanos.baziotis at gmail.com> έγραψε:
>
>> Hi,
>>
>> > I think the IR in both of your examples makes things harder for the
>> compiler than expected from the original C source.
>>
>> Note that both versions are from clang with -O2. The first is with
>> version 9.0 and the second is with the trunk.
>>
>> > but in the branch only %0 is used. Sinking the lshr too early made the
>> analysis harder.
>>
>> Yes, exactly! That's what I figured too.
>>
>> > The version in https://godbolt.org/z/_ipKhb  is probably the easiest
>> for analysis (basically the original C source code built with `clang -O0 -S
>> -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch under
>> review that adds support for conditional range propagation (
>> https://reviews.llvm.org/D76611) and with the patch it can be simplified
>> to the code below by running `opt  -ipsccp -simplifycfg -instcombine`
>> >
>> > define i32 @test(i32 %0) {
>> >   %.off = add i32 %0, -16
>> >   %2 = icmp ult i32 %.off, 12
>> >   %spec.select = select i1 %2, i32 66, i32 45
>> >   ret i32 %spec.select
>> > }
>> >
>> >
>> > The reason it does not yet work in the default pipeline is that the
>> branch condition will be combined to use an AND before the conditional
>> propagation and D76611 does not support that yet. But once it lands that
>> should be an easy extension.
>> >
>> > The problems with your two versions could be addressed as well of
>> course for that special case relatively easily I think, but the challenge
>> is to fix it in a general and efficient (compile-time wise) way. I hope the
>> conditional propagation should cover such cases soon though.
>>
>> Ok, I checked the patch, best of luck! I see how it does the
>> transformation, it's interesting. I think that
>> https://reviews.llvm.org/rGe9963034314edf49a12ea5e29f694d8f9f52734a could
>> maybe help in such things.
>> Any idea about how the compiler could remove the lshr and use a add -16?
>>
>> Best,
>> Stefanos Baziotis
>>
>>
>> Στις Κυρ, 5 Απρ 2020 στις 10:06 μ.μ., ο/η Florian Hahn <
>> florian_hahn at apple.com> έγραψε:
>>
>>> Hi,
>>>
>>> > On Apr 5, 2020, at 14:08, Stefanos Baziotis via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>> > To be more specific for everyone:
>>> > - First of all, I hope it's clear that in the original (C) code, the
>>> region - 0x8 > 1000 branch should
>>> > be eliminated. That is because it is inside a block that has ensured
>>> that 8 < region < 12. But for some reason,
>>> > it is not eliminated. I'm trying to find that reason. :)
>>> > - I understand that in the -O2 .ll output, we take %0 and do it a
>>> right shift. That means that we don't have
>>> > any range info about %0 or %0 >>= 1, so we can't eliminate the branch.
>>> What I don't understand
>>> > is why we chose to reuse %0 and do it a right shift and instead we
>>> didn't have a phi there.
>>> > - Finally, I saw that putting nuw in the addition with -8 eliminates
>>> the branch. This makes sense since,
>>> > we know probably know from the code above that %0 >>= 1 is less than
>>> 12. Although, I don't understand
>>> > why the right shift was translated to an add with -16.
>>> >
>>> > I hope this made some sense. You may ignore the last 2 and focus on
>>> the first, i.e. what optimization
>>> > should have been done but it's not and what we can do about it.
>>> >
>>> > A clearer version of the .ll: https://godbolt.org/z/2t4RU5
>>>
>>> I think the IR in both of your examples makes things harder for the
>>> compiler than expected from the original C source.
>>>
>>> With the IR in your original example (https://godbolt.org/z/BL-4jL), I
>>> think the problem is that the branch condition is '%0 - 16 < 12’, which
>>> allows us to limit the range of `%0 - 16` easily, but in the branch only %0
>>> is used. Sinking the lshr too early made the analysis harder.
>>>
>>> In the second example (https://godbolt.org/z/2t4RU5), things are hard
>>> to simplify because we need to use the information from the condition of
>>> one select to simplify the earlier select.
>>>
>>> The version in https://godbolt.org/z/_ipKhb  is probably the easiest
>>> for analysis (basically the original C source code built with `clang -O0 -S
>>> -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch under
>>> review that adds support for conditional range propagation (
>>> https://reviews.llvm.org/D76611) and with the patch it can be
>>> simplified to the code below by running `opt  -ipsccp -simplifycfg
>>> -instcombine`
>>>
>>> define i32 @test(i32 %0) {
>>>   %.off = add i32 %0, -16
>>>   %2 = icmp ult i32 %.off, 12
>>>   %spec.select = select i1 %2, i32 66, i32 45
>>>   ret i32 %spec.select
>>> }
>>>
>>>
>>> The reason it does not yet work in the default pipeline is that the
>>> branch condition will be combined to use an AND before the conditional
>>> propagation and D76611 does not support that yet. But once it lands that
>>> should be an easy extension.
>>>
>>> The problems with your two versions could be addressed as well of course
>>> for that special case relatively easily I think, but the challenge is to
>>> fix it in a general and efficient (compile-time wise) way. I hope the
>>> conditional propagation should cover such cases soon though.
>>>
>>> Cheers,
>>> Florian
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200406/564113c1/attachment.html>