[llvm] r324941 - [DAG] make binops with undef operands consistent with IR

Tue Feb 13 06:59:27 PST 2018

https://reviews.llvm.org/rL325010

On Tue, Feb 13, 2018 at 7:32 AM, Sanjay Patel <spatel at rotateright.com>
wrote:

> Thanks, Mikael.
>
> Yes, we can't always propagate the existing undef operand in the DAG; the
> sizes may not match. Seems like a small fix. Taking a look now.
>
> On Tue, Feb 13, 2018 at 2:56 AM, Mikael Holmén <mikael.holmen at ericsson.com
> > wrote:
>
>> Hi Sanjay,
>>
>> I ran into problems with this patch.
>>
>> The problem is that an i32:
>>
>>   %_tmp1905.i = shl i32 7, undef
>>
>> is changed into an i8:
>>
>> Creating new node: t3: i8 = undef
>>
>> so the folding of
>>
>> op(arg1, undef) -> undef
>>
>> actually changed the type of the result of the shl from i32 to i8.
>>
>> The result is then used in a setcc
>>
>> Creating new node: t5: i1 = setcc undef:i32, undef:i8, seteq:ch
>>
>> and then an assert triggers about the operands in setcc having different
>> sizes.
>>
>> llc: ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:7764: bool
>> isTruncateOf(llvm::SelectionDAG &, llvm::SDValue, llvm::SDValue &,
>> llvm::KnownBits &): Assertion `Op0.getValueType() == Op1.getValueType()'
>> failed.
>>
>> Reproduce with:
>> llc shl_undef.ll -o /dev/null -mtriple=x86_64-linux
>>
>> Regards,
>> Mikael
>>
>>
>> On 02/12/2018 10:37 PM, Sanjay Patel via llvm-commits wrote:
>>
>>> Author: spatel
>>> Date: Mon Feb 12 13:37:27 2018
>>> New Revision: 324941
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=324941&view=rev
>>> Log:
>>> [DAG] make binops with undef operands consistent with IR
>>>
>>> This started by noticing that scalar and vector types were producing
>>> different results with div ops in PR36305:
>>> https://bugs.llvm.org/show_bug.cgi?id=36305
>>>
>>> ...but the problem is bigger. I couldn't keep it straight without a
>>> table, so I'm attaching that as a PDF to
>>> the review. The x86 tests in undef-ops.ll correspond to that table.
>>>
>>> Green means that instsimplify and the DAG agree on the result for all
>>> types.
>>> Red means the DAG was returning undef when IR was not.
>>> Yellow means the DAG was returning a non-undef result when IR returned
>>> undef.
>>>
>>> This patch assumes that we're currently doing the right thing in IR.
>>>
>>> Note: I couldn't find any problems with lowering vector constants as the
>>> code comments were warning,
>>> but those comments were written long ago in rL36413 .
>>>
>>> Differential Revision: https://reviews.llvm.org/D43141
>>>
>>> Modified:
>>>      llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>>>      llvm/trunk/test/CodeGen/Hexagon/tail-dup-subreg-map.ll
>>>      llvm/trunk/test/CodeGen/X86/pr13577.ll
>>>      llvm/trunk/test/CodeGen/X86/pr33960.ll
>>>      llvm/trunk/test/CodeGen/X86/undef-ops.ll
>>>
>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/S
>>> electionDAG/SelectionDAG.cpp?rev=324941&r1=324940&r2=324941&view=diff
>>> ============================================================
>>> ==================
>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Feb 12
>>> 13:37:27 2018
>>> @@ -4667,19 +4667,15 @@ SDValue SelectionDAG::getNode(unsigned O
>>>         case ISD::FSUB:
>>>         case ISD::FDIV:
>>>         case ISD::FREM:
>>> -      case ISD::SRA:
>>>           return N1;     // fold op(undef, arg2) -> undef
>>>         case ISD::UDIV:
>>>         case ISD::SDIV:
>>>         case ISD::UREM:
>>>         case ISD::SREM:
>>> +      case ISD::SRA:
>>>         case ISD::SRL:
>>>         case ISD::SHL:
>>> -        if (!VT.isVector())
>>> -          return getConstant(0, DL, VT);    // fold op(undef, arg2) -> 0
>>> -        // For vectors, we can't easily build an all zero vector, just
>>> return
>>> -        // the LHS.
>>> -        return N2;
>>> +        return getConstant(0, DL, VT);    // fold op(undef, arg2) -> 0
>>>         }
>>>       }
>>>     }
>>> @@ -4701,6 +4697,9 @@ SDValue SelectionDAG::getNode(unsigned O
>>>       case ISD::SDIV:
>>>       case ISD::UREM:
>>>       case ISD::SREM:
>>> +    case ISD::SRA:
>>> +    case ISD::SRL:
>>> +    case ISD::SHL:
>>>         return N2;       // fold op(arg1, undef) -> undef
>>>       case ISD::FADD:
>>>       case ISD::FSUB:
>>> @@ -4712,21 +4711,9 @@ SDValue SelectionDAG::getNode(unsigned O
>>>         break;
>>>       case ISD::MUL:
>>>       case ISD::AND:
>>> -    case ISD::SRL:
>>> -    case ISD::SHL:
>>> -      if (!VT.isVector())
>>> -        return getConstant(0, DL, VT);  // fold op(arg1, undef) -> 0
>>> -      // For vectors, we can't easily build an all zero vector, just
>>> return
>>> -      // the LHS.
>>> -      return N1;
>>> +      return getConstant(0, DL, VT);  // fold op(arg1, undef) -> 0
>>>       case ISD::OR:
>>> -      if (!VT.isVector())
>>> -        return getConstant(APInt::getAllOnesValue(VT.getSizeInBits()),
>>> DL, VT);
>>> -      // For vectors, we can't easily build an all one vector, just
>>> return
>>> -      // the LHS.
>>> -      return N1;
>>> -    case ISD::SRA:
>>> -      return N1;
>>> +      return getAllOnesConstant(DL, VT);
>>>       }
>>>     }
>>>
>>> Modified: llvm/trunk/test/CodeGen/Hexagon/tail-dup-subreg-map.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>>> Hexagon/tail-dup-subreg-map.ll?rev=324941&r1=324940&r2=324941&view=diff
>>> ============================================================
>>> ==================
>>> --- llvm/trunk/test/CodeGen/Hexagon/tail-dup-subreg-map.ll (original)
>>> +++ llvm/trunk/test/CodeGen/Hexagon/tail-dup-subreg-map.ll Mon Feb 12
>>> 13:37:27 2018
>>> @@ -5,7 +5,7 @@
>>>   ; subregisters were dropped by the tail duplicator, resulting in
>>> invalid
>>>   ; COPY instructions being generated.
>>>   -; CHECK: = extractu(r{{[0-9]+}},#15,#17)
>>> +; CHECK: = asl(r{{[0-9]+}}:{{[0-9]+}},#15)
>>>     target triple = "hexagon"
>>>   @@ -36,20 +36,20 @@ if.then5.i:
>>>     br label %if.end.i
>>>     if.else.i:                                        ; preds =
>>> %if.then.i
>>> -  %shl12.i = shl i64 %0, undef
>>> +  %shl12.i = shl i64 %0, 7
>>>     br label %if.end.i
>>>     if.end.i:                                         ; preds =
>>> %if.else.i, %if.then5.i
>>>     %aSig0.0 = phi i64 [ undef, %if.then5.i ], [ %shl12.i, %if.else.i ]
>>>     %storemerge43.i = phi i64 [ %shl.i21, %if.then5.i ], [ 0, %if.else.i
>>> ]
>>> -  %sub15.i = sub nsw i32 -63, undef
>>> +  %sub15.i = sub nsw i32 -63, 8
>>>     br label %if.end13
>>>     if.else16.i:                                      ; preds = %if.then7
>>>     br label %if.end13
>>>     if.else:                                          ; preds = %entry
>>> -  %or12 = or i64 undef, 281474976710656
>>> +  %or12 = or i64 9, 281474976710656
>>>     br label %if.end13
>>>     if.end13:                                         ; preds =
>>> %if.else, %if.else16.i, %if.end.i
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/pr13577.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>>> X86/pr13577.ll?rev=324941&r1=324940&r2=324941&view=diff
>>> ============================================================
>>> ==================
>>> --- llvm/trunk/test/CodeGen/X86/pr13577.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/pr13577.ll Mon Feb 12 13:37:27 2018
>>> @@ -31,7 +31,6 @@ define float @pr26070() {
>>>   ; CHECK:       ## %bb.0:
>>>   ; CHECK-NEXT:    movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
>>>   ; CHECK-NEXT:    shufps {{.*#+}} xmm0 = xmm0[0,0,0,0]
>>> -; CHECK-NEXT:    orps {{.*}}(%rip), %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %c = call float @copysignf(float 1.0, float undef) readnone
>>>     ret float %c
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/pr33960.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>>> X86/pr33960.ll?rev=324941&r1=324940&r2=324941&view=diff
>>> ============================================================
>>> ==================
>>> --- llvm/trunk/test/CodeGen/X86/pr33960.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/pr33960.ll Mon Feb 12 13:37:27 2018
>>> @@ -7,12 +7,12 @@
>>>   define void @PR33960() {
>>>   ; X86-LABEL: PR33960:
>>>   ; X86:       # %bb.0: # %entry
>>> -; X86-NEXT:    movl $0, b
>>> +; X86-NEXT:    movl $-1, b
>>>   ; X86-NEXT:    retl
>>>   ;
>>>   ; X64-LABEL: PR33960:
>>>   ; X64:       # %bb.0: # %entry
>>> -; X64-NEXT:    movl $0, {{.*}}(%rip)
>>> +; X64-NEXT:    movl $-1, {{.*}}(%rip)
>>>   ; X64-NEXT:    retq
>>>   entry:
>>>     %tmp = insertelement <4 x i32> <i32 undef, i32 -7, i32 -3, i32
>>> undef>, i32 -2, i32 3
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/undef-ops.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>>> X86/undef-ops.ll?rev=324941&r1=324940&r2=324941&view=diff
>>> ============================================================
>>> ==================
>>> --- llvm/trunk/test/CodeGen/X86/undef-ops.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/undef-ops.ll Mon Feb 12 13:37:27 2018
>>> @@ -77,6 +77,7 @@ define i32 @mul_undef_rhs(i32 %x) {
>>>   define <4 x i32> @mul_undef_rhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: mul_undef_rhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = mul <4 x i32> %x, undef
>>>     ret <4 x i32> %r
>>> @@ -94,6 +95,7 @@ define i32 @mul_undef_lhs(i32 %x) {
>>>   define <4 x i32> @mul_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: mul_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = mul <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -127,6 +129,7 @@ define i32 @sdiv_undef_lhs(i32 %x) {
>>>   define <4 x i32> @sdiv_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: sdiv_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = sdiv <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -160,6 +163,7 @@ define i32 @udiv_undef_lhs(i32 %x) {
>>>   define <4 x i32> @udiv_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: udiv_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = udiv <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -193,6 +197,7 @@ define i32 @srem_undef_lhs(i32 %x) {
>>>   define <4 x i32> @srem_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: srem_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = srem <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -226,6 +231,7 @@ define i32 @urem_undef_lhs(i32 %x) {
>>>   define <4 x i32> @urem_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: urem_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = urem <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -234,7 +240,6 @@ define <4 x i32> @urem_undef_lhs_vec(<4
>>>   define i32 @ashr_undef_rhs(i32 %x) {
>>>   ; CHECK-LABEL: ashr_undef_rhs:
>>>   ; CHECK:       # %bb.0:
>>> -; CHECK-NEXT:    movl %edi, %eax
>>>   ; CHECK-NEXT:    retq
>>>     %r = ashr i32 %x, undef
>>>     ret i32 %r
>>> @@ -251,6 +256,7 @@ define <4 x i32> @ashr_undef_rhs_vec(<4
>>>   define i32 @ashr_undef_lhs(i32 %x) {
>>>   ; CHECK-LABEL: ashr_undef_lhs:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorl %eax, %eax
>>>   ; CHECK-NEXT:    retq
>>>     %r = ashr i32 undef, %x
>>>     ret i32 %r
>>> @@ -259,6 +265,7 @@ define i32 @ashr_undef_lhs(i32 %x) {
>>>   define <4 x i32> @ashr_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: ashr_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = ashr <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -267,7 +274,6 @@ define <4 x i32> @ashr_undef_lhs_vec(<4
>>>   define i32 @lshr_undef_rhs(i32 %x) {
>>>   ; CHECK-LABEL: lshr_undef_rhs:
>>>   ; CHECK:       # %bb.0:
>>> -; CHECK-NEXT:    xorl %eax, %eax
>>>   ; CHECK-NEXT:    retq
>>>     %r = lshr i32 %x, undef
>>>     ret i32 %r
>>> @@ -293,6 +299,7 @@ define i32 @lshr_undef_lhs(i32 %x) {
>>>   define <4 x i32> @lshr_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: lshr_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = lshr <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -301,7 +308,6 @@ define <4 x i32> @lshr_undef_lhs_vec(<4
>>>   define i32 @shl_undef_rhs(i32 %x) {
>>>   ; CHECK-LABEL: shl_undef_rhs:
>>>   ; CHECK:       # %bb.0:
>>> -; CHECK-NEXT:    xorl %eax, %eax
>>>   ; CHECK-NEXT:    retq
>>>     %r = shl i32 %x, undef
>>>     ret i32 %r
>>> @@ -327,6 +333,7 @@ define i32 @shl_undef_lhs(i32 %x) {
>>>   define <4 x i32> @shl_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: shl_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = shl <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -344,6 +351,7 @@ define i32 @and_undef_rhs(i32 %x) {
>>>   define <4 x i32> @and_undef_rhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: and_undef_rhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = and <4 x i32> %x, undef
>>>     ret <4 x i32> %r
>>> @@ -361,6 +369,7 @@ define i32 @and_undef_lhs(i32 %x) {
>>>   define <4 x i32> @and_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: and_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    xorps %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = and <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>> @@ -378,6 +387,7 @@ define i32 @or_undef_rhs(i32 %x) {
>>>   define <4 x i32> @or_undef_rhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: or_undef_rhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    pcmpeqd %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = or <4 x i32> %x, undef
>>>     ret <4 x i32> %r
>>> @@ -395,6 +405,7 @@ define i32 @or_undef_lhs(i32 %x) {
>>>   define <4 x i32> @or_undef_lhs_vec(<4 x i32> %x) {
>>>   ; CHECK-LABEL: or_undef_lhs_vec:
>>>   ; CHECK:       # %bb.0:
>>> +; CHECK-NEXT:    pcmpeqd %xmm0, %xmm0
>>>   ; CHECK-NEXT:    retq
>>>     %r = or <4 x i32> undef, %x
>>>     ret <4 x i32> %r
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180213/536c7c42/attachment.html>