[llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

Wed Jul 4 04:19:06 PDT 2018

https://reviews.llvm.org/D48930 should help.

From: Maxim Kazantsev
Sent: Wednesday, July 4, 2018 1:49 PM
To: 'aditya_nandakumar at apple.com' <aditya_nandakumar at apple.com>
Cc: 'llvm-commits at lists.llvm.org' <llvm-commits at lists.llvm.org>
Subject: RE: [llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

The root cause is that we kill *all* further transforms with this code.

  // Test if the ICmpInst instruction is used exclusively by a select as
  // part of a minimum or maximum operation. If so, refrain from doing
  // any other folding. This helps out other analyses which understand
  // non-obfuscated minimum and maximum idioms, such as ScalarEvolution
  // and CodeGen. And in this case, at least one of the comparison
  // operands has at least one user besides the compare (the select),
  // which would often largely negate the benefit of folding anyway.
  //
  // Do the same for the other patterns recognized by matchSelectPattern.
  if (I.hasOneUse())
    if (SelectInst *SI = dyn_cast<SelectInst>(I.user_back())) {
      Value *A, *B;
      SelectPatternResult SPR = matchSelectPattern(SI, A, B);
      if (SPR.Flavor != SPF_UNKNOWN)
        return nullptr;
    }

It looks at least unreasonable. I will try to fix that.

-- Max

From: Maxim Kazantsev
Sent: Wednesday, July 4, 2018 12:27 PM
To: 'aditya_nandakumar at apple.com' <aditya_nandakumar at apple.com<mailto:aditya_nandakumar at apple.com>>
Cc: llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
Subject: RE: [llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

Hi Aditya,

Thanks for the test, I'll see what's happening with it.

-- Max

From: aditya_nandakumar at apple.com<mailto:aditya_nandakumar at apple.com> [mailto:aditya_nandakumar at apple.com]
Sent: Wednesday, July 4, 2018 6:57 AM
To: Maxim Kazantsev <max.kazantsev at azul.com<mailto:max.kazantsev at azul.com>>
Cc: llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
Subject: Re: [llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

Hi Max

This commit is causing significantly worse codegen similar to what’s seen in this simplified test case.

Previously we produce
define i16 @foo(i16) {
entry:
  %1 = and i16 %0, 255
  ret i16 %1
}

With this commit, instcombine doesn’t seem to change the input at all.

Could you please take a look at this ?

Aditya

> On Jul 2, 2018, at 11:23 PM, Max Kazantsev via llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>> wrote:
>
> Author: mkazantsev
> Date: Mon Jul  2 23:23:57 2018
> New Revision: 336172
>
> URL: http://llvm.org/viewvc/llvm-project?rev=336172&view=rev
> Log:
> [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
>
> This patch changes order of transform in InstCombineCompares to avoid
> performing transforms based on ranges which produce complex bit arithmetics
> before more simple things (like folding with constants) are done. See PR37636
> for the motivating example.
>
> Differential Revision: https://reviews.llvm.org/D48584
> Reviewed By: spatel, lebedev.ri
>
> Modified:
>    llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
>    llvm/trunk/test/Analysis/ValueTracking/non-negative-phi-bits.ll
>    llvm/trunk/test/Transforms/InstCombine/icmp-shl-nsw.ll
>    llvm/trunk/test/Transforms/InstCombine/icmp-shr-lt-gt.ll
>    llvm/trunk/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
>    llvm/trunk/test/Transforms/LoopVectorize/X86/masked_load_store.ll
>
> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp (original)
> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp Mon Jul  2 23:23:57 2018
> @@ -4485,9 +4485,6 @@ Instruction *InstCombiner::visitICmpInst
>   if (Instruction *Res = foldICmpWithConstant(I))
>     return Res;
>
> -  if (Instruction *Res = foldICmpUsingKnownBits(I))
> -    return Res;
> -
>   // Test if the ICmpInst instruction is used exclusively by a select as
>   // part of a minimum or maximum operation. If so, refrain from doing
>   // any other folding. This helps out other analyses which understand
> @@ -4706,6 +4703,13 @@ Instruction *InstCombiner::visitICmpInst
>     if (match(Op1, m_Add(m_Value(X), m_ConstantInt(Cst))) && Op0 == X)
>       return foldICmpAddOpConst(X, Cst, I.getSwappedPredicate());
>   }
> +
> +  // This may be expensive in compile-time, and transforms based on known bits
> +  // can make further analysis more difficult, so we use it as the last resort
> +  // if we cannot do anything better.
> +  if (Instruction *Res = foldICmpUsingKnownBits(I))
> +    return Res;
> +
>   return Changed ? &I : nullptr;
> }
>
>
> Modified: llvm/trunk/test/Analysis/ValueTracking/non-negative-phi-bits.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ValueTracking/non-negative-phi-bits.ll?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/test/Analysis/ValueTracking/non-negative-phi-bits.ll (original)
> +++ llvm/trunk/test/Analysis/ValueTracking/non-negative-phi-bits.ll Mon Jul  2 23:23:57 2018
> @@ -8,7 +8,7 @@ define void @test() #0 {
> ; CHECK:       for.body:
> ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
> ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
> -; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ult i64 [[INDVARS_IV_NEXT]], 40
> +; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ult i64 [[INDVARS_IV]], 39
> ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_END:%.*]], label [[FOR_BODY]]
> ; CHECK:       for.end:
> ; CHECK-NEXT:    ret void
>
> Modified: llvm/trunk/test/Transforms/InstCombine/icmp-shl-nsw.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/icmp-shl-nsw.ll?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstCombine/icmp-shl-nsw.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/icmp-shl-nsw.ll Mon Jul  2 23:23:57 2018
> @@ -69,11 +69,9 @@ define <2 x i1> @icmp_shl_nsw_eq_vec(<2
> ; icmp sgt with shl nsw with a constant compare operand and constant
> ; shift amount can always be reduced to icmp sgt alone.
>
> -; Known bits analysis turns this into an equality predicate.
> -
> define i1 @icmp_sgt1(i8 %x) {
> ; CHECK-LABEL: @icmp_sgt1(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i8 %x, -64
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 %x, -64
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 1
> @@ -144,11 +142,10 @@ define i1 @icmp_sgt7(i8 %x) {
>   ret i1 %cmp
> }
>
> -; Known bits analysis turns this into an equality predicate.
>
> define i1 @icmp_sgt8(i8 %x) {
> ; CHECK-LABEL: @icmp_sgt8(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 %x, 63
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 %x, 62
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 1
> @@ -158,11 +155,9 @@ define i1 @icmp_sgt8(i8 %x) {
>
> ; Compares with 126 and 127 are recognized as always false.
>
> -; Known bits analysis turns this into an equality predicate.
> -
> define i1 @icmp_sgt9(i8 %x) {
> ; CHECK-LABEL: @icmp_sgt9(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i8 %x, -1
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 %x, -1
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 7
> @@ -210,11 +205,9 @@ define <2 x i1> @icmp_sgt11_vec(<2 x i8>
> ;
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
> -; Known bits analysis turns this into an equality predicate.
> -
> define i1 @icmp_sle1(i8 %x) {
> ; CHECK-LABEL: @icmp_sle1(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 %x, -64
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i8 %x, -63
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 1
> @@ -285,11 +278,9 @@ define i1 @icmp_sle7(i8 %x) {
>   ret i1 %cmp
> }
>
> -; Known bits analysis turns this into an equality predicate.
> -
> define i1 @icmp_sle8(i8 %x) {
> ; CHECK-LABEL: @icmp_sle8(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i8 %x, 63
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i8 %x, 63
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 1
> @@ -299,11 +290,9 @@ define i1 @icmp_sle8(i8 %x) {
>
> ; Compares with 126 and 127 are recognized as always true.
>
> -; Known bits analysis turns this into an equality predicate.
> -
> define i1 @icmp_sle9(i8 %x) {
> ; CHECK-LABEL: @icmp_sle9(
> -; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 %x, -1
> +; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i8 %x, 0
> ; CHECK-NEXT:    ret i1 [[CMP]]
> ;
>   %shl = shl nsw i8 %x, 7
> @@ -353,4 +342,3 @@ define i1 @icmp_ne1(i8 %x) {
>   %cmp = icmp ne i8 %shl, -128
>   ret i1 %cmp
> }
> -
>
> Modified: llvm/trunk/test/Transforms/InstCombine/icmp-shr-lt-gt.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/icmp-shr-lt-gt.ll?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstCombine/icmp-shr-lt-gt.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/icmp-shr-lt-gt.ll Mon Jul  2 23:23:57 2018
> @@ -1,3 +1,4 @@
> +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
> ; RUN: opt < %s -instcombine -S | FileCheck %s
>
> define i1 @lshrugt_01_00(i4 %x) {
> @@ -1834,7 +1835,7 @@ define i1 @lshrugt_01_05_exact(i4 %x) {
>
> define i1 @lshrugt_01_06_exact(i4 %x) {
> ; CHECK-LABEL: @lshrugt_01_06_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp eq i4 %x, -2
> +; CHECK-NEXT:    [[C:%.*]] = icmp ugt i4 %x, -4
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 1
> @@ -1945,7 +1946,7 @@ define i1 @lshrugt_02_01_exact(i4 %x) {
>
> define i1 @lshrugt_02_02_exact(i4 %x) {
> ; CHECK-LABEL: @lshrugt_02_02_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp eq i4 %x, -4
> +; CHECK-NEXT:    [[C:%.*]] = icmp ugt i4 %x, -8
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 2
> @@ -2226,7 +2227,7 @@ define i1 @lshrult_01_00_exact(i4 %x) {
>
> define i1 @lshrult_01_01_exact(i4 %x) {
> ; CHECK-LABEL: @lshrult_01_01_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp eq i4 %x, 0
> +; CHECK-NEXT:    [[C:%.*]] = icmp ult i4 %x, 2
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 1
> @@ -2286,7 +2287,7 @@ define i1 @lshrult_01_06_exact(i4 %x) {
>
> define i1 @lshrult_01_07_exact(i4 %x) {
> ; CHECK-LABEL: @lshrult_01_07_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp ne i4 %x, -2
> +; CHECK-NEXT:    [[C:%.*]] = icmp ult i4 %x, -2
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 1
> @@ -2377,7 +2378,7 @@ define i1 @lshrult_02_00_exact(i4 %x) {
>
> define i1 @lshrult_02_01_exact(i4 %x) {
> ; CHECK-LABEL: @lshrult_02_01_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp eq i4 %x, 0
> +; CHECK-NEXT:    [[C:%.*]] = icmp ult i4 %x, 4
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 2
> @@ -2397,7 +2398,7 @@ define i1 @lshrult_02_02_exact(i4 %x) {
>
> define i1 @lshrult_02_03_exact(i4 %x) {
> ; CHECK-LABEL: @lshrult_02_03_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp ne i4 %x, -4
> +; CHECK-NEXT:    [[C:%.*]] = icmp ult i4 %x, -4
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 2
> @@ -2524,7 +2525,7 @@ define i1 @lshrult_03_00_exact(i4 %x) {
>
> define i1 @lshrult_03_01_exact(i4 %x) {
> ; CHECK-LABEL: @lshrult_03_01_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp ne i4 %x, -8
> +; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -1
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = lshr exact i4 %x, 3
> @@ -2801,7 +2802,7 @@ define i1 @ashrsgt_01_14_exact(i4 %x) {
>
> define i1 @ashrsgt_01_15_exact(i4 %x) {
> ; CHECK-LABEL: @ashrsgt_01_15_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -1
> +; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -2
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = ashr exact i4 %x, 1
> @@ -2948,7 +2949,7 @@ define i1 @ashrsgt_02_14_exact(i4 %x) {
>
> define i1 @ashrsgt_02_15_exact(i4 %x) {
> ; CHECK-LABEL: @ashrsgt_02_15_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -1
> +; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -4
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = ashr exact i4 %x, 2
> @@ -3093,7 +3094,7 @@ define i1 @ashrsgt_03_14_exact(i4 %x) {
>
> define i1 @ashrsgt_03_15_exact(i4 %x) {
> ; CHECK-LABEL: @ashrsgt_03_15_exact(
> -; CHECK-NEXT:    [[C:%.*]] = icmp sgt i4 %x, -1
> +; CHECK-NEXT:    [[C:%.*]] = icmp ne i4 %x, -8
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %s = ashr exact i4 %x, 3
>
> Modified: llvm/trunk/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll Mon Jul  2 23:23:57 2018
> @@ -2,7 +2,8 @@
> ; RUN: opt -instcombine -S < %s | FileCheck %s
>
> ; Test that presence of range does not cause unprofitable transforms with bit
> -; arithmetics, and instcombine behaves exactly the same as without the range.
> +; arithmetics. InstCombine needs to be smart about dealing with range-annotated
> +; values.
>
> define i1 @without_range(i32* %A) {
> ; CHECK-LABEL: @without_range(
> @@ -19,8 +20,7 @@ define i1 @without_range(i32* %A) {
> define i1 @with_range(i32* %A) {
> ; CHECK-LABEL: @with_range(
> ; CHECK-NEXT:    [[A_VAL:%.*]] = load i32, i32* [[A:%.*]], align 8, !range !0
> -; CHECK-NEXT:    [[B_MASK:%.*]] = and i32 [[A_VAL]], 2147483646
> -; CHECK-NEXT:    [[C:%.*]] = icmp eq i32 [[B_MASK]], 0
> +; CHECK-NEXT:    [[C:%.*]] = icmp ult i32 [[A_VAL]], 2
> ; CHECK-NEXT:    ret i1 [[C]]
> ;
>   %A.val = load i32, i32* %A, align 8, !range !0
>
> Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/masked_load_store.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/masked_load_store.ll?rev=336172&r1=336171&r2=336172&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/LoopVectorize/X86/masked_load_store.ll (original)
> +++ llvm/trunk/test/Transforms/LoopVectorize/X86/masked_load_store.ll Mon Jul  2 23:23:57 2018
> @@ -2052,8 +2052,7 @@ define void @foo4(double* %A, double* %B
> ; AVX512-NEXT:    [[PROL_ITER_CMP:%.*]] = icmp eq i64 [[PROL_ITER_SUB]], 0
> ; AVX512-NEXT:    br i1 [[PROL_ITER_CMP]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL]], !llvm.loop !50
> ; AVX512:       for.body.prol.loopexit:
> -; AVX512-NEXT:    [[DOTMASK:%.*]] = and i64 [[TMP24]], 9984
> -; AVX512-NEXT:    [[TMP28:%.*]] = icmp eq i64 [[DOTMASK]], 0
> +; AVX512-NEXT:    [[TMP28:%.*]] = icmp ult i64 [[TMP24]], 48
> ; AVX512-NEXT:    br i1 [[TMP28]], label [[FOR_END:%.*]], label [[FOR_BODY:%.*]]
> ; AVX512:       for.body:
> ; AVX512-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_3:%.*]], [[FOR_INC_3:%.*]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL_LOOPEXIT]] ]
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180704/a0b44de2/attachment-0001.html>