[llvm] r315992 - Improve clamp recognition in ValueTracking.

Thu Oct 19 07:50:45 PDT 2017

Hi Mikael, 
thank you very much for this analysis.

Sanjay, I have two options:
1. Before my patch matchMinMax was matching patterns only with canonical predicate.
That is why the easy way to fix the problem found by Mikael is:

diff --git a/lib/Analysis/ValueTracking.cpp b/lib/Analysis/ValueTracking.cpp
index 4f5d69d77aa..d9ced4da7e9 100644
--- a/lib/Analysis/ValueTracking.cpp
+++ b/lib/Analysis/ValueTracking.cpp
@@ -4085,6 +4085,9 @@ static SelectPatternResult matchMinMax(CmpInst::Predicate Pred,
                                        Value *&LHS, Value *&RHS) {
   assert(!ICmpInst::isEquality(Pred) && "Expected not equality predicate only!");

+  if (!isCanonicalPredicate(Pred))
+    return false;
+
   // First, check if select has inverse order of what we will check below:
   if (CmpRHS == FalseVal) {
     std::swap(TrueVal, FalseVal);

2. We can revert this commit, I will prepare new one and upload it for review.

How do you think, which option is better?

Best regards, 
Artur Gainullin

-----Original Message-----
From: Mikael Holmén [mailto:mikael.holmen at ericsson.com] 
Sent: Thursday, October 19, 2017 2:30 PM
To: Bozhenov, Nikolai <nikolai.bozhenov at intel.com>; llvm-commits at lists.llvm.org; Gainullin, Artur <artur.gainullin at intel.com>; Sanjay Patel <spatel at rotateright.com>
Subject: Re: [llvm] r315992 - Improve clamp recognition in ValueTracking.

Hi Artur, Nikolai, Sanjay,

I think this patch miscompiles the following example:

If I do

llc -O0 -march=nvptx -o - foo.ll

on

@sur1 = global i16 1234

define i16 @main() {
   %_tmp21.i = icmp sle i16 0, 0
   %_tmp22.i = select i1 %_tmp21.i, i16 0, i16 32767
   store i16 %_tmp22.i, i16* @sur1
   ret i16 0
}

I get

         mov.u32         %r1, sur1;
         cvta.global.u32         %r2, %r1;
         mov.u16         %rs1, 32767;
         st.u16  [%r2], %rs1;
         mov.u32         %r3, 0;
         st.param.b32    [func_retval0+0], %r3;
         ret;

with the patch.

Without it I get

         mov.u32         %r1, sur1;
         cvta.global.u32         %r2, %r1;
         mov.u16         %rs1, 0;
         st.u16  [%r2], %rs1;
         mov.u32         %r3, 0;
         st.param.b32    [func_retval0+0], %r3;
         ret;

Notice the 32767 vs 0 difference.

I'm not very good at nvptx, but if looking at the -debug printouts from ISel we see

Initial selection DAG: BB#0 'main:'
SelectionDAG has 13 nodes:
   t4: i16 = Constant<0>
   t5: ch = setle
   t6: i1 = Constant<-1>
         t0: ch = EntryToken
         t3: i32 = llvm.nvvm.ptr.global.to.gen TargetConstant:i32<3309>,
GlobalAddress:i32<i16 addrspace(1)* @sur1> 0
       t10: ch = store<ST2[%cvta]> t0, Constant:i16<32767>, t3, undef:i32
     t11: ch = NVPTXISD::StoreRetval<ST4[<unknown>](align=1)> t10, Constant:i32<0>, Constant:i32<0>
   t12: ch = NVPTXISD::RET_FLAG t11

with the patch and

Initial selection DAG: BB#0 'main:'
SelectionDAG has 13 nodes:
   t5: ch = setle
   t6: i1 = Constant<-1>
   t7: i16 = Constant<32767>
         t0: ch = EntryToken
         t3: i32 = llvm.nvvm.ptr.global.to.gen TargetConstant:i32<3309>,
GlobalAddress:i32<i16 addrspace(1)* @sur1> 0
       t10: ch = store<ST2[%cvta]> t0, Constant:i16<0>, t3, undef:i32
     t11: ch = NVPTXISD::StoreRetval<ST4[<unknown>](align=1)> t10, Constant:i32<0>, Constant:i32<0>
   t12: ch = NVPTXISD::RET_FLAG t11

without. Here it's pretty clear that the patch has done something wrong, deciding we should always store 32767 to @sur1 instead of the expected 0.

It seems to be this code in matchMinMax that triggers:

   // An unsigned min/max can be written with a signed compare.
   const APInt *C2;
   if ((CmpLHS == TrueVal && match(FalseVal, m_APInt(C2))) ||
       (CmpLHS == FalseVal && match(TrueVal, m_APInt(C2)))) {
     // Is the sign bit set?
     // (X <s 0) ? X : MAXVAL ==> (X >u MAXVAL) ? X : MAXVAL ==> UMAX
     // (X <s 0) ? MAXVAL : X ==> (X >u MAXVAL) ? MAXVAL : X ==> UMIN
     if ((Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE) && *C1 == 0 &&
         C2->isMaxSignedValue())
       return {CmpLHS == TrueVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};

where "Pred == CmpInst::ICMP_SLE" was added in the patch...

Originally found this on my out-of-tree target where the resulting code really did the wrong thing noticed at runtime, and then I've reduced and reproduced on nvptx.

Regards,
Mikael

On 10/17/2017 01:50 PM, Nikolai Bozhenov via llvm-commits wrote:
> Author: n.bozhenov
> Date: Tue Oct 17 04:50:48 2017
> New Revision: 315992
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=315992&view=rev
> Log:
> Improve clamp recognition in ValueTracking.
> 
> Summary:
> ValueTracking was recognizing not all variations of clamp. Swapping of 
> true value and false value of select was added to fix this problem. 
> This change breaks the canonical form of cmp inside the matchMinMax 
> function, that is why additional checks for compare predicates is 
> needed. Added corresponding test cases.
> 
> Reviewers: spatel
> 
> Subscribers: llvm-commits
> 
> Differential Revision: https://reviews.llvm.org/D38531
> 
> Patch by: Artur Gainullin <artur.gainullin at intel.com>
> 
> Modified:
>      llvm/trunk/lib/Analysis/ValueTracking.cpp
>      llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll
> 
> Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
> URL: 
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTrack
> ing.cpp?rev=315992&r1=315991&r2=315992&view=diff
> ======================================================================
> ========
> --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
> +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Tue Oct 17 04:50:48 2017
> @@ -4083,6 +4083,14 @@ static SelectPatternResult matchMinMax(C
>                                          Value *CmpLHS, Value *CmpRHS,
>                                          Value *TrueVal, Value *FalseVal,
>                                          Value *&LHS, Value *&RHS) {
> +  assert(!ICmpInst::isEquality(Pred) && "Expected not equality 
> + predicate only!");
> +
> +  // First, check if select has inverse order of what we will check below:
> +  if (CmpRHS == FalseVal) {
> +    std::swap(TrueVal, FalseVal);
> +    Pred = CmpInst::getInversePredicate(Pred);
> +  }
> +
>     // Assume success. If there's no match, callers should not use these anyway.
>     LHS = TrueVal;
>     RHS = FalseVal;
> @@ -4095,26 +4103,30 @@ static SelectPatternResult matchMinMax(C
>   
>       // (X <s C1) ? C1 : SMIN(X, C2) ==> SMAX(SMIN(X, C2), C1)
>       if (match(FalseVal, m_SMin(m_Specific(CmpLHS), m_APInt(C2))) &&
> -        C1->slt(*C2) && Pred == CmpInst::ICMP_SLT)
> +        C1->slt(*C2) &&
> +        (Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE))
>         return {SPF_SMAX, SPNB_NA, false};
>   
>       // (X >s C1) ? C1 : SMAX(X, C2) ==> SMIN(SMAX(X, C2), C1)
>       if (match(FalseVal, m_SMax(m_Specific(CmpLHS), m_APInt(C2))) &&
> -        C1->sgt(*C2) && Pred == CmpInst::ICMP_SGT)
> +        C1->sgt(*C2) &&
> +        (Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE))
>         return {SPF_SMIN, SPNB_NA, false};
>   
>       // (X <u C1) ? C1 : UMIN(X, C2) ==> UMAX(UMIN(X, C2), C1)
>       if (match(FalseVal, m_UMin(m_Specific(CmpLHS), m_APInt(C2))) &&
> -        C1->ult(*C2) && Pred == CmpInst::ICMP_ULT)
> +        C1->ult(*C2) &&
> +        (Pred == CmpInst::ICMP_ULT || Pred == CmpInst::ICMP_ULE))
>         return {SPF_UMAX, SPNB_NA, false};
>   
>       // (X >u C1) ? C1 : UMAX(X, C2) ==> UMIN(UMAX(X, C2), C1)
>       if (match(FalseVal, m_UMax(m_Specific(CmpLHS), m_APInt(C2))) &&
> -        C1->ugt(*C2) && Pred == CmpInst::ICMP_UGT)
> +        C1->ugt(*C2) &&
> +        (Pred == CmpInst::ICMP_UGT || Pred == CmpInst::ICMP_UGE))
>         return {SPF_UMIN, SPNB_NA, false};
>     }
>   
> -  if (Pred != CmpInst::ICMP_SGT && Pred != CmpInst::ICMP_SLT)
> +  if (!CmpInst::isSigned(Pred))
>       return {SPF_UNKNOWN, SPNB_NA, false};
>   
>     // Z = X -nsw Y
> @@ -4122,14 +4134,18 @@ static SelectPatternResult matchMinMax(C
>     // (X <s Y) ? 0 : Z ==> (Z <s 0) ? 0 : Z ==> SMAX(Z, 0)
>     if (match(TrueVal, m_Zero()) &&
>         match(FalseVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS))))
> -    return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
> +    return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMIN
> +                                                                     : SPF_SMAX,
> +            SPNB_NA, false};
>   
>     // Z = X -nsw Y
>     // (X >s Y) ? Z : 0 ==> (Z >s 0) ? Z : 0 ==> SMAX(Z, 0)
>     // (X <s Y) ? Z : 0 ==> (Z <s 0) ? Z : 0 ==> SMIN(Z, 0)
>     if (match(FalseVal, m_Zero()) &&
>         match(TrueVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS))))
> -    return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
> +    return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMAX
> +                                                                     : SPF_SMIN,
> +            SPNB_NA, false};
>   
>     if (!match(CmpRHS, m_APInt(C1)))
>       return {SPF_UNKNOWN, SPNB_NA, false}; @@ -4141,14 +4157,15 @@ 
> static SelectPatternResult matchMinMax(C
>       // Is the sign bit set?
>       // (X <s 0) ? X : MAXVAL ==> (X >u MAXVAL) ? X : MAXVAL ==> UMAX
>       // (X <s 0) ? MAXVAL : X ==> (X >u MAXVAL) ? MAXVAL : X ==> UMIN
> -    if (Pred == CmpInst::ICMP_SLT && *C1 == 0 && C2->isMaxSignedValue())
> +    if ((Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE) && *C1 == 0 &&
> +        C2->isMaxSignedValue())
>         return {CmpLHS == TrueVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, 
> false};
>   
>       // Is the sign bit clear?
>       // (X >s -1) ? MINVAL : X ==> (X <u MINVAL) ? MINVAL : X ==> UMAX
>       // (X >s -1) ? X : MINVAL ==> (X <u MINVAL) ? X : MINVAL ==> UMIN
> -    if (Pred == CmpInst::ICMP_SGT && C1->isAllOnesValue() &&
> -        C2->isMinSignedValue())
> +    if ((Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) &&
> +        C1->isAllOnesValue() && C2->isMinSignedValue())
>         return {CmpLHS == FalseVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
>     }
>   
> @@ -4157,13 +4174,17 @@ static SelectPatternResult matchMinMax(C
>     // (X <s C) ? ~X : ~C ==> (~X >s ~C) ? ~X : ~C ==> SMAX(~X, ~C)
>     if (match(TrueVal, m_Not(m_Specific(CmpLHS))) &&
>         match(FalseVal, m_APInt(C2)) && ~(*C1) == *C2)
> -    return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
> +    return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMIN
> +                                                                     : SPF_SMAX,
> +            SPNB_NA, false};
>   
>     // (X >s C) ? ~C : ~X ==> (~X <s ~C) ? ~C : ~X ==> SMAX(~C, ~X)
>     // (X <s C) ? ~C : ~X ==> (~X >s ~C) ? ~C : ~X ==> SMIN(~C, ~X)
>     if (match(FalseVal, m_Not(m_Specific(CmpLHS))) &&
>         match(TrueVal, m_APInt(C2)) && ~(*C1) == *C2)
> -    return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
> +    return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMAX
> +                                                                     : SPF_SMIN,
> +            SPNB_NA, false};
>   
>     return {SPF_UNKNOWN, SPNB_NA, false};
>   }
> 
> Modified: llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll
> URL: 
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCom
> bine/minmax-fold.ll?rev=315992&r1=315991&r2=315992&view=diff
> ======================================================================
> ========
> --- llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll Tue Oct 17 
> +++ 04:50:48 2017
> @@ -404,8 +404,8 @@ define i32 @clamp_signed3(i32 %x) {
>   ; CHECK-LABEL: @clamp_signed3(
>   ; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i32 [[X:%.*]], 255
>   ; CHECK-NEXT:    [[MIN:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 255
> -; CHECK-NEXT:    [[CMP1:%.*]] = icmp sgt i32 [[X]], 15
> -; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP1]], i32 [[MIN]], i32 15
> +; CHECK-NEXT:    [[TMP1:%.*]] = icmp sgt i32 [[MIN]], 15
> +; CHECK-NEXT:    [[R:%.*]] = select i1 [[TMP1]], i32 [[MIN]], i32 15
>   ; CHECK-NEXT:    ret i32 [[R]]
>   ;
>     %cmp2 = icmp slt i32 %x, 255
> @@ -421,8 +421,8 @@ define i32 @clamp_signed4(i32 %x) {
>   ; CHECK-LABEL: @clamp_signed4(
>   ; CHECK-NEXT:    [[CMP2:%.*]] = icmp sgt i32 [[X:%.*]], 15
>   ; CHECK-NEXT:    [[MAX:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 15
> -; CHECK-NEXT:    [[CMP1:%.*]] = icmp slt i32 [[X]], 255
> -; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP1]], i32 [[MAX]], i32 255
> +; CHECK-NEXT:    [[TMP1:%.*]] = icmp slt i32 [[MAX]], 255
> +; CHECK-NEXT:    [[R:%.*]] = select i1 [[TMP1]], i32 [[MAX]], i32 255
>   ; CHECK-NEXT:    ret i32 [[R]]
>   ;
>     %cmp2 = icmp sgt i32 %x, 15
> @@ -472,8 +472,8 @@ define i32 @clamp_unsigned3(i32 %x) {
>   ; CHECK-LABEL: @clamp_unsigned3(
>   ; CHECK-NEXT:    [[CMP2:%.*]] = icmp ult i32 [[X:%.*]], 255
>   ; CHECK-NEXT:    [[MIN:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 255
> -; CHECK-NEXT:    [[CMP1:%.*]] = icmp ugt i32 [[X]], 15
> -; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP1]], i32 [[MIN]], i32 15
> +; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i32 [[MIN]], 15
> +; CHECK-NEXT:    [[R:%.*]] = select i1 [[TMP1]], i32 [[MIN]], i32 15
>   ; CHECK-NEXT:    ret i32 [[R]]
>   ;
>     %cmp2 = icmp ult i32 %x, 255
> @@ -489,8 +489,8 @@ define i32 @clamp_unsigned4(i32 %x) {
>   ; CHECK-LABEL: @clamp_unsigned4(
>   ; CHECK-NEXT:    [[CMP2:%.*]] = icmp ugt i32 [[X:%.*]], 15
>   ; CHECK-NEXT:    [[MAX:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 15
> -; CHECK-NEXT:    [[CMP1:%.*]] = icmp ult i32 [[X]], 255
> -; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP1]], i32 [[MAX]], i32 255
> +; CHECK-NEXT:    [[TMP1:%.*]] = icmp ult i32 [[MAX]], 255
> +; CHECK-NEXT:    [[R:%.*]] = select i1 [[TMP1]], i32 [[MAX]], i32 255
>   ; CHECK-NEXT:    ret i32 [[R]]
>   ;
>     %cmp2 = icmp ugt i32 %x, 15
> @@ -523,8 +523,8 @@ define i32 @clamp_check_for_no_infinite_
>   ; CHECK-LABEL: @clamp_check_for_no_infinite_loop2(
>   ; CHECK-NEXT:    [[CMP1:%.*]] = icmp sgt i32 [[I:%.*]], -255
>   ; CHECK-NEXT:    [[SEL1:%.*]] = select i1 [[CMP1]], i32 [[I]], i32 -255
> -; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i32 [[I]], 0
> -; CHECK-NEXT:    [[RES:%.*]] = select i1 [[CMP2]], i32 [[SEL1]], i32 0
> +; CHECK-NEXT:    [[TMP1:%.*]] = icmp slt i32 [[SEL1]], 0
> +; CHECK-NEXT:    [[RES:%.*]] = select i1 [[TMP1]], i32 [[SEL1]], i32 0
>   ; CHECK-NEXT:    ret i32 [[RES]]
>   ;
>     %cmp1 = icmp sgt i32 %i, -255
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>