[llvm-dev] Branch is not optimized because of right shift

Sun Apr 5 14:20:30 PDT 2020

> Any idea about how the compiler could remove the lshr and use a add -16?
Actually, I just figured that doing this test is like solving this:

8 <= x/2 <= 13
16 <= x <= 26
0 <= x - 16 <= 10 => 0 <= x < 11
The left part is know since it's unsigned
The right part could be done x <= 11 => x < 12 because it's actually an
integer division.
Wow... I would be really happy to know what pass does that.

Best,
- Stefanos

Στις Δευ, 6 Απρ 2020 στις 12:00 π.μ., ο/η Stefanos Baziotis <
stefanos.baziotis at gmail.com> έγραψε:

> Hi,
>
> > I think the IR in both of your examples makes things harder for the
> compiler than expected from the original C source.
>
> Note that both versions are from clang with -O2. The first is with version
> 9.0 and the second is with the trunk.
>
> > but in the branch only %0 is used. Sinking the lshr too early made the
> analysis harder.
>
> Yes, exactly! That's what I figured too.
>
> > The version in https://godbolt.org/z/_ipKhb  is probably the easiest
> for analysis (basically the original C source code built with `clang -O0 -S
> -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch under
> review that adds support for conditional range propagation (
> https://reviews.llvm.org/D76611) and with the patch it can be simplified
> to the code below by running `opt  -ipsccp -simplifycfg -instcombine`
> >
> > define i32 @test(i32 %0) {
> >   %.off = add i32 %0, -16
> >   %2 = icmp ult i32 %.off, 12
> >   %spec.select = select i1 %2, i32 66, i32 45
> >   ret i32 %spec.select
> > }
> >
> >
> > The reason it does not yet work in the default pipeline is that the
> branch condition will be combined to use an AND before the conditional
> propagation and D76611 does not support that yet. But once it lands that
> should be an easy extension.
> >
> > The problems with your two versions could be addressed as well of course
> for that special case relatively easily I think, but the challenge is to
> fix it in a general and efficient (compile-time wise) way. I hope the
> conditional propagation should cover such cases soon though.
>
> Ok, I checked the patch, best of luck! I see how it does the
> transformation, it's interesting. I think that
> https://reviews.llvm.org/rGe9963034314edf49a12ea5e29f694d8f9f52734a could
> maybe help in such things.
> Any idea about how the compiler could remove the lshr and use a add -16?
>
> Best,
> Stefanos Baziotis
>
>
> Στις Κυρ, 5 Απρ 2020 στις 10:06 μ.μ., ο/η Florian Hahn <
> florian_hahn at apple.com> έγραψε:
>
>> Hi,
>>
>> > On Apr 5, 2020, at 14:08, Stefanos Baziotis via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > To be more specific for everyone:
>> > - First of all, I hope it's clear that in the original (C) code, the
>> region - 0x8 > 1000 branch should
>> > be eliminated. That is because it is inside a block that has ensured
>> that 8 < region < 12. But for some reason,
>> > it is not eliminated. I'm trying to find that reason. :)
>> > - I understand that in the -O2 .ll output, we take %0 and do it a right
>> shift. That means that we don't have
>> > any range info about %0 or %0 >>= 1, so we can't eliminate the branch.
>> What I don't understand
>> > is why we chose to reuse %0 and do it a right shift and instead we
>> didn't have a phi there.
>> > - Finally, I saw that putting nuw in the addition with -8 eliminates
>> the branch. This makes sense since,
>> > we know probably know from the code above that %0 >>= 1 is less than
>> 12. Although, I don't understand
>> > why the right shift was translated to an add with -16.
>> >
>> > I hope this made some sense. You may ignore the last 2 and focus on the
>> first, i.e. what optimization
>> > should have been done but it's not and what we can do about it.
>> >
>> > A clearer version of the .ll: https://godbolt.org/z/2t4RU5
>>
>> I think the IR in both of your examples makes things harder for the
>> compiler than expected from the original C source.
>>
>> With the IR in your original example (https://godbolt.org/z/BL-4jL), I
>> think the problem is that the branch condition is '%0 - 16 < 12’, which
>> allows us to limit the range of `%0 - 16` easily, but in the branch only %0
>> is used. Sinking the lshr too early made the analysis harder.
>>
>> In the second example (https://godbolt.org/z/2t4RU5), things are hard to
>> simplify because we need to use the information from the condition of one
>> select to simplify the earlier select.
>>
>> The version in https://godbolt.org/z/_ipKhb  is probably the easiest for
>> analysis (basically the original C source code built with `clang -O0 -S
>> -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch under
>> review that adds support for conditional range propagation (
>> https://reviews.llvm.org/D76611) and with the patch it can be simplified
>> to the code below by running `opt  -ipsccp -simplifycfg -instcombine`
>>
>> define i32 @test(i32 %0) {
>>   %.off = add i32 %0, -16
>>   %2 = icmp ult i32 %.off, 12
>>   %spec.select = select i1 %2, i32 66, i32 45
>>   ret i32 %spec.select
>> }
>>
>>
>> The reason it does not yet work in the default pipeline is that the
>> branch condition will be combined to use an AND before the conditional
>> propagation and D76611 does not support that yet. But once it lands that
>> should be an easy extension.
>>
>> The problems with your two versions could be addressed as well of course
>> for that special case relatively easily I think, but the challenge is to
>> fix it in a general and efficient (compile-time wise) way. I hope the
>> conditional propagation should cover such cases soon though.
>>
>> Cheers,
>> Florian
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200406/25b8ad99/attachment.html>