[llvm] r315992 - Improve clamp recognition in ValueTracking.
Mikael Holmén via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 19 05:29:50 PDT 2017
Hi Artur, Nikolai, Sanjay,
I think this patch miscompiles the following example:
If I do
llc -O0 -march=nvptx -o - foo.ll
on
@sur1 = global i16 1234
define i16 @main() {
%_tmp21.i = icmp sle i16 0, 0
%_tmp22.i = select i1 %_tmp21.i, i16 0, i16 32767
store i16 %_tmp22.i, i16* @sur1
ret i16 0
}
I get
mov.u32 %r1, sur1;
cvta.global.u32 %r2, %r1;
mov.u16 %rs1, 32767;
st.u16 [%r2], %rs1;
mov.u32 %r3, 0;
st.param.b32 [func_retval0+0], %r3;
ret;
with the patch.
Without it I get
mov.u32 %r1, sur1;
cvta.global.u32 %r2, %r1;
mov.u16 %rs1, 0;
st.u16 [%r2], %rs1;
mov.u32 %r3, 0;
st.param.b32 [func_retval0+0], %r3;
ret;
Notice the 32767 vs 0 difference.
I'm not very good at nvptx, but if looking at the -debug printouts from
ISel we see
Initial selection DAG: BB#0 'main:'
SelectionDAG has 13 nodes:
t4: i16 = Constant<0>
t5: ch = setle
t6: i1 = Constant<-1>
t0: ch = EntryToken
t3: i32 = llvm.nvvm.ptr.global.to.gen TargetConstant:i32<3309>,
GlobalAddress:i32<i16 addrspace(1)* @sur1> 0
t10: ch = store<ST2[%cvta]> t0, Constant:i16<32767>, t3, undef:i32
t11: ch = NVPTXISD::StoreRetval<ST4[<unknown>](align=1)> t10,
Constant:i32<0>, Constant:i32<0>
t12: ch = NVPTXISD::RET_FLAG t11
with the patch and
Initial selection DAG: BB#0 'main:'
SelectionDAG has 13 nodes:
t5: ch = setle
t6: i1 = Constant<-1>
t7: i16 = Constant<32767>
t0: ch = EntryToken
t3: i32 = llvm.nvvm.ptr.global.to.gen TargetConstant:i32<3309>,
GlobalAddress:i32<i16 addrspace(1)* @sur1> 0
t10: ch = store<ST2[%cvta]> t0, Constant:i16<0>, t3, undef:i32
t11: ch = NVPTXISD::StoreRetval<ST4[<unknown>](align=1)> t10,
Constant:i32<0>, Constant:i32<0>
t12: ch = NVPTXISD::RET_FLAG t11
without. Here it's pretty clear that the patch has done something wrong,
deciding we should always store 32767 to @sur1 instead of the expected 0.
It seems to be this code in matchMinMax that triggers:
// An unsigned min/max can be written with a signed compare.
const APInt *C2;
if ((CmpLHS == TrueVal && match(FalseVal, m_APInt(C2))) ||
(CmpLHS == FalseVal && match(TrueVal, m_APInt(C2)))) {
// Is the sign bit set?
// (X <s 0) ? X : MAXVAL ==> (X >u MAXVAL) ? X : MAXVAL ==> UMAX
// (X <s 0) ? MAXVAL : X ==> (X >u MAXVAL) ? MAXVAL : X ==> UMIN
if ((Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE) && *C1
== 0 &&
C2->isMaxSignedValue())
return {CmpLHS == TrueVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
where "Pred == CmpInst::ICMP_SLE" was added in the patch...
Originally found this on my out-of-tree target where the resulting code
really did the wrong thing noticed at runtime, and then I've reduced and
reproduced on nvptx.
Regards,
Mikael
On 10/17/2017 01:50 PM, Nikolai Bozhenov via llvm-commits wrote:
> Author: n.bozhenov
> Date: Tue Oct 17 04:50:48 2017
> New Revision: 315992
>
> URL: http://llvm.org/viewvc/llvm-project?rev=315992&view=rev
> Log:
> Improve clamp recognition in ValueTracking.
>
> Summary:
> ValueTracking was recognizing not all variations of clamp. Swapping of
> true value and false value of select was added to fix this problem. This
> change breaks the canonical form of cmp inside the matchMinMax function,
> that is why additional checks for compare predicates is needed. Added
> corresponding test cases.
>
> Reviewers: spatel
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D38531
>
> Patch by: Artur Gainullin <artur.gainullin at intel.com>
>
> Modified:
> llvm/trunk/lib/Analysis/ValueTracking.cpp
> llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll
>
> Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=315992&r1=315991&r2=315992&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
> +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Tue Oct 17 04:50:48 2017
> @@ -4083,6 +4083,14 @@ static SelectPatternResult matchMinMax(C
> Value *CmpLHS, Value *CmpRHS,
> Value *TrueVal, Value *FalseVal,
> Value *&LHS, Value *&RHS) {
> + assert(!ICmpInst::isEquality(Pred) && "Expected not equality predicate only!");
> +
> + // First, check if select has inverse order of what we will check below:
> + if (CmpRHS == FalseVal) {
> + std::swap(TrueVal, FalseVal);
> + Pred = CmpInst::getInversePredicate(Pred);
> + }
> +
> // Assume success. If there's no match, callers should not use these anyway.
> LHS = TrueVal;
> RHS = FalseVal;
> @@ -4095,26 +4103,30 @@ static SelectPatternResult matchMinMax(C
>
> // (X <s C1) ? C1 : SMIN(X, C2) ==> SMAX(SMIN(X, C2), C1)
> if (match(FalseVal, m_SMin(m_Specific(CmpLHS), m_APInt(C2))) &&
> - C1->slt(*C2) && Pred == CmpInst::ICMP_SLT)
> + C1->slt(*C2) &&
> + (Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE))
> return {SPF_SMAX, SPNB_NA, false};
>
> // (X >s C1) ? C1 : SMAX(X, C2) ==> SMIN(SMAX(X, C2), C1)
> if (match(FalseVal, m_SMax(m_Specific(CmpLHS), m_APInt(C2))) &&
> - C1->sgt(*C2) && Pred == CmpInst::ICMP_SGT)
> + C1->sgt(*C2) &&
> + (Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE))
> return {SPF_SMIN, SPNB_NA, false};
>
> // (X <u C1) ? C1 : UMIN(X, C2) ==> UMAX(UMIN(X, C2), C1)
> if (match(FalseVal, m_UMin(m_Specific(CmpLHS), m_APInt(C2))) &&
> - C1->ult(*C2) && Pred == CmpInst::ICMP_ULT)
> + C1->ult(*C2) &&
> + (Pred == CmpInst::ICMP_ULT || Pred == CmpInst::ICMP_ULE))
> return {SPF_UMAX, SPNB_NA, false};
>
> // (X >u C1) ? C1 : UMAX(X, C2) ==> UMIN(UMAX(X, C2), C1)
> if (match(FalseVal, m_UMax(m_Specific(CmpLHS), m_APInt(C2))) &&
> - C1->ugt(*C2) && Pred == CmpInst::ICMP_UGT)
> + C1->ugt(*C2) &&
> + (Pred == CmpInst::ICMP_UGT || Pred == CmpInst::ICMP_UGE))
> return {SPF_UMIN, SPNB_NA, false};
> }
>
> - if (Pred != CmpInst::ICMP_SGT && Pred != CmpInst::ICMP_SLT)
> + if (!CmpInst::isSigned(Pred))
> return {SPF_UNKNOWN, SPNB_NA, false};
>
> // Z = X -nsw Y
> @@ -4122,14 +4134,18 @@ static SelectPatternResult matchMinMax(C
> // (X <s Y) ? 0 : Z ==> (Z <s 0) ? 0 : Z ==> SMAX(Z, 0)
> if (match(TrueVal, m_Zero()) &&
> match(FalseVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS))))
> - return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
> + return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMIN
> + : SPF_SMAX,
> + SPNB_NA, false};
>
> // Z = X -nsw Y
> // (X >s Y) ? Z : 0 ==> (Z >s 0) ? Z : 0 ==> SMAX(Z, 0)
> // (X <s Y) ? Z : 0 ==> (Z <s 0) ? Z : 0 ==> SMIN(Z, 0)
> if (match(FalseVal, m_Zero()) &&
> match(TrueVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS))))
> - return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
> + return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMAX
> + : SPF_SMIN,
> + SPNB_NA, false};
>
> if (!match(CmpRHS, m_APInt(C1)))
> return {SPF_UNKNOWN, SPNB_NA, false};
> @@ -4141,14 +4157,15 @@ static SelectPatternResult matchMinMax(C
> // Is the sign bit set?
> // (X <s 0) ? X : MAXVAL ==> (X >u MAXVAL) ? X : MAXVAL ==> UMAX
> // (X <s 0) ? MAXVAL : X ==> (X >u MAXVAL) ? MAXVAL : X ==> UMIN
> - if (Pred == CmpInst::ICMP_SLT && *C1 == 0 && C2->isMaxSignedValue())
> + if ((Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_SLE) && *C1 == 0 &&
> + C2->isMaxSignedValue())
> return {CmpLHS == TrueVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
>
> // Is the sign bit clear?
> // (X >s -1) ? MINVAL : X ==> (X <u MINVAL) ? MINVAL : X ==> UMAX
> // (X >s -1) ? X : MINVAL ==> (X <u MINVAL) ? X : MINVAL ==> UMIN
> - if (Pred == CmpInst::ICMP_SGT && C1->isAllOnesValue() &&
> - C2->isMinSignedValue())
> + if ((Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) &&
> + C1->isAllOnesValue() && C2->isMinSignedValue())
> return {CmpLHS == FalseVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
> }
>
> @@ -4157,13 +4174,17 @@ static SelectPatternResult matchMinMax(C
> // (X <s C) ? ~X : ~C ==> (~X >s ~C) ? ~X : ~C ==> SMAX(~X, ~C)
> if (match(TrueVal, m_Not(m_Specific(CmpLHS))) &&
> match(FalseVal, m_APInt(C2)) && ~(*C1) == *C2)
> - return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
> + return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMIN
> + : SPF_SMAX,
> + SPNB_NA, false};
>
> // (X >s C) ? ~C : ~X ==> (~X <s ~C) ? ~C : ~X ==> SMAX(~C, ~X)
> // (X <s C) ? ~C : ~X ==> (~X >s ~C) ? ~C : ~X ==> SMIN(~C, ~X)
> if (match(FalseVal, m_Not(m_Specific(CmpLHS))) &&
> match(TrueVal, m_APInt(C2)) && ~(*C1) == *C2)
> - return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
> + return {(Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE) ? SPF_SMAX
> + : SPF_SMIN,
> + SPNB_NA, false};
>
> return {SPF_UNKNOWN, SPNB_NA, false};
> }
>
> Modified: llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll?rev=315992&r1=315991&r2=315992&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/minmax-fold.ll Tue Oct 17 04:50:48 2017
> @@ -404,8 +404,8 @@ define i32 @clamp_signed3(i32 %x) {
> ; CHECK-LABEL: @clamp_signed3(
> ; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i32 [[X:%.*]], 255
> ; CHECK-NEXT: [[MIN:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 255
> -; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[X]], 15
> -; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP1]], i32 [[MIN]], i32 15
> +; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[MIN]], 15
> +; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP1]], i32 [[MIN]], i32 15
> ; CHECK-NEXT: ret i32 [[R]]
> ;
> %cmp2 = icmp slt i32 %x, 255
> @@ -421,8 +421,8 @@ define i32 @clamp_signed4(i32 %x) {
> ; CHECK-LABEL: @clamp_signed4(
> ; CHECK-NEXT: [[CMP2:%.*]] = icmp sgt i32 [[X:%.*]], 15
> ; CHECK-NEXT: [[MAX:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 15
> -; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i32 [[X]], 255
> -; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP1]], i32 [[MAX]], i32 255
> +; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[MAX]], 255
> +; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP1]], i32 [[MAX]], i32 255
> ; CHECK-NEXT: ret i32 [[R]]
> ;
> %cmp2 = icmp sgt i32 %x, 15
> @@ -472,8 +472,8 @@ define i32 @clamp_unsigned3(i32 %x) {
> ; CHECK-LABEL: @clamp_unsigned3(
> ; CHECK-NEXT: [[CMP2:%.*]] = icmp ult i32 [[X:%.*]], 255
> ; CHECK-NEXT: [[MIN:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 255
> -; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i32 [[X]], 15
> -; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP1]], i32 [[MIN]], i32 15
> +; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[MIN]], 15
> +; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP1]], i32 [[MIN]], i32 15
> ; CHECK-NEXT: ret i32 [[R]]
> ;
> %cmp2 = icmp ult i32 %x, 255
> @@ -489,8 +489,8 @@ define i32 @clamp_unsigned4(i32 %x) {
> ; CHECK-LABEL: @clamp_unsigned4(
> ; CHECK-NEXT: [[CMP2:%.*]] = icmp ugt i32 [[X:%.*]], 15
> ; CHECK-NEXT: [[MAX:%.*]] = select i1 [[CMP2]], i32 [[X]], i32 15
> -; CHECK-NEXT: [[CMP1:%.*]] = icmp ult i32 [[X]], 255
> -; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP1]], i32 [[MAX]], i32 255
> +; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[MAX]], 255
> +; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP1]], i32 [[MAX]], i32 255
> ; CHECK-NEXT: ret i32 [[R]]
> ;
> %cmp2 = icmp ugt i32 %x, 15
> @@ -523,8 +523,8 @@ define i32 @clamp_check_for_no_infinite_
> ; CHECK-LABEL: @clamp_check_for_no_infinite_loop2(
> ; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[I:%.*]], -255
> ; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CMP1]], i32 [[I]], i32 -255
> -; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i32 [[I]], 0
> -; CHECK-NEXT: [[RES:%.*]] = select i1 [[CMP2]], i32 [[SEL1]], i32 0
> +; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SEL1]], 0
> +; CHECK-NEXT: [[RES:%.*]] = select i1 [[TMP1]], i32 [[SEL1]], i32 0
> ; CHECK-NEXT: ret i32 [[RES]]
> ;
> %cmp1 = icmp sgt i32 %i, -255
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
More information about the llvm-commits
mailing list