[PATCH] D63505: [InstCombine] Fold icmp eq/ne (and %x, ~C), 0 -> %x u</u>= 0 earlier, C+1 is power of 2.
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 18 14:40:53 PDT 2019
lebedev.ri added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp:1656-1657
+ // ((X & ~7) == 0) --> X u< 8
+ // If X is (BinOp Y, C3), allow other rules to fold C3 with C2.
+ if (!match(X, m_c_BinOp(m_Value(), m_Constant())) &&
+ (~(*C2) + 1).isPowerOf2()) {
----------------
huihuiz wrote:
> lebedev.ri wrote:
> > lebedev.ri wrote:
> > > huihuiz wrote:
> > > > lebedev.ri wrote:
> > > > > huihuiz wrote:
> > > > > > huihuiz wrote:
> > > > > > > lebedev.ri wrote:
> > > > > > > > Why do we care about that? Seems rather arbitrary.
> > > > > > > take this test as example
> > > > > > >
> > > > > > >
> > > > > > > ```
> > > > > > > define i1 @test_shift_and_cmp_changed2(i8 %p) {
> > > > > > > %shlp = shl i8 %p, 5
> > > > > > > %andp = and i8 %shlp, -64
> > > > > > > %cmp = icmp ult i8 %andp, 32
> > > > > > > ret i1 %cmp
> > > > > > > }
> > > > > > >
> > > > > > > ```
> > > > > > >
> > > > > > > we do miss fold for (X >> C3) & C2 != C1 --> (X & (C2 << C3)) != (C1 << C3)
> > > > > > for this test, should be better transforming into this
> > > > > >
> > > > > > ```
> > > > > > define i1 @test_shift_and_cmp_changed2(i8 %p) {
> > > > > > %andp = and i8 %p, 6
> > > > > > %cmp = icmp eq i8 %andp, 0
> > > > > > ret i1 %cmp
> > > > > > }
> > > > > >
> > > > > > ```
> > > > > >
> > > > > > However, by scheduling ((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two earlier,
> > > > > > we end up with
> > > > > >
> > > > > > ```
> > > > > > define i1 @test_shift_and_cmp_changed2(i8 %p) {
> > > > > > %shlp = shl i8 %p, 5
> > > > > > %cmp = icmp ult i8 %shlp, 64
> > > > > > ret i1 %cmp
> > > > > > }
> > > > > >
> > > > > > ```
> > > > > >
> > > > > Is that IR without this restriction?
> > > > > We currently seem to handle it just fine https://godbolt.org/z/8DB1xD
> > > > Yes , that is the IR without add this restriction.
> > > > We are currently handle it fine.
> > > > But when we try to schedule (X & ~C) ==/!= 0 --> X u</u>= C+1 earlier
> > > > we end up with
> > > >
> > > > ```
> > > > define i1 @test_shift_and_cmp_changed2(i8 %p) {
> > > > %shlp = shl i8 %p, 5
> > > > %cmp = icmp ult i8 %shlp, 64
> > > > ret i1 %cmp
> > > > }
> > > > ```
> > > Well, nice find.
> > > Like i guessed this rescheduling is exposing all kinds of missing folds :)
> > Can you give me a spoiler, if you drop this restriction, are there any other regressions in `check-llvm`?
> Here are the failing tests when dropping this restriction:
>
> 1. test file: test/Transforms/InstCombine/pr17827.ll
> test_shift_and_cmp_changed2 and its vector version
>
> 2. test file: test/Transforms/InstCombine/lshr-and-negC-icmpeq-zero.ll
> scalar_i32_lshr_and_negC_eq_X_is_constant1
> and
> scalar_i32_lshr_and_negC_eq_X_is_constant1
>
>
> ```
> define i1 @scalar_i32_lshr_and_negC_eq_X_is_constant1(i32 %y) {
> ; CHECK-LABEL: @scalar_i32_lshr_and_negC_eq_X_is_constant1(
> ; CHECK-NEXT: [[LSHR:%.*]] = lshr i32 12345, [[Y:%.*]]
> -; CHECK-NEXT: [[AND:%.*]] = and i32 [[LSHR]], -8
> -; CHECK-NEXT: [[R:%.*]] = icmp eq i32 [[AND]], 0
> +; CHECK-NEXT: [[R:%.*]] = icmp ult i32 [[LSHR]], 8
> ; CHECK-NEXT: ret i1 [[R]]
> ;
> %lshr = lshr i32 12345, %y
> %and = and i32 %lshr, 4294967288 ; ~7
> %r = icmp eq i32 %and, 0
> ret i1 %r
> }
>
> define i1 @scalar_i32_lshr_and_negC_eq_X_is_constant2(i32 %y) {
> ; CHECK-LABEL: @scalar_i32_lshr_and_negC_eq_X_is_constant2(
> ; CHECK-NEXT: [[LSHR:%.*]] = lshr i32 268435456, [[Y:%.*]]
> -; CHECK-NEXT: [[AND:%.*]] = and i32 [[LSHR]], -8
> -; CHECK-NEXT: [[R:%.*]] = icmp eq i32 [[AND]], 0
> +; CHECK-NEXT: [[R:%.*]] = icmp ult i32 [[LSHR]], 8
> ; CHECK-NEXT: ret i1 [[R]]
> ;
> %lshr = lshr i32 268435456, %y
> %and = and i32 %lshr, 4294967288 ; ~7
> %r = icmp eq i32 %and, 0
> ret i1 %r
> }
>
> ```
>
> 3. similarly for test/Transforms/InstCombine/shl-and-negC-icmpeq-zero.ll
I'm confused, 2. and 3. are improvements, right?
So only `test_shift_and_cmp_changed2` regresses?
A missing fold should simply be added then, i'd say.
I'm not sure what it is yet though.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63505/new/
https://reviews.llvm.org/D63505
More information about the llvm-commits
mailing list