[llvm-branch-commits] [llvm] 0f99a73 - [X86] Teach combineVectorShiftImm to constant fold undef elements to 0 not undef.

Tom Stellard via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Tue Jun 16 12:30:40 PDT 2020


Author: Craig Topper
Date: 2020-06-16T12:30:13-07:00
New Revision: 0f99a730e3bf9e4aa29d2d6c407394022527e409

URL: https://github.com/llvm/llvm-project/commit/0f99a730e3bf9e4aa29d2d6c407394022527e409
DIFF: https://github.com/llvm/llvm-project/commit/0f99a730e3bf9e4aa29d2d6c407394022527e409.diff

LOG: [X86] Teach combineVectorShiftImm to constant fold undef elements to 0 not undef.

Shifts are supposed to always shift in zeros or sign bits regardless of their inputs. It's possible the input value may have been replaced with undef by SimplifyDemandedBits, but the shifted-in zeros are still demanded.
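
The folding rule described above can be illustrated outside of LLVM with a minimal standalone sketch. Everything in it is an assumption made for illustration: `foldShiftImm`, `ShiftKind`, the use of `std::vector<uint32_t>` for element bits, and `std::vector<bool>` for the undef mask are hypothetical stand-ins, not the `APInt`/`SelectionDAG` types used in `combineVectorShiftImm`. It also assumes 32-bit elements and a shift amount below 32.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for X86ISD::VSHLI / VSRLI / VSRAI.
enum class ShiftKind { Shl, LShr, AShr };

// Sketch of the fold: undef lanes become 0 instead of staying undef,
// defined lanes are shifted by the immediate. Assumes amt < 32.
std::vector<uint32_t> foldShiftImm(std::vector<uint32_t> elts,
                                   std::vector<bool> &undef,
                                   ShiftKind kind, unsigned amt) {
  for (std::size_t i = 0; i < elts.size(); ++i) {
    if (undef[i]) {
      // The lane's input was undef, but the result must still be a
      // well-defined constant; fold it to 0.
      elts[i] = 0;
    } else if (kind == ShiftKind::Shl) {
      elts[i] <<= amt;
    } else if (kind == ShiftKind::AShr) {
      // Portable arithmetic shift: replicate the sign bit into the
      // vacated high bits.
      uint32_t signFill = (elts[i] & 0x80000000u) ? ~(~0u >> amt) : 0u;
      elts[i] = (elts[i] >> amt) | signFill;
    } else {
      elts[i] >>= amt;
    }
  }
  // Every lane now holds a concrete value, so clear the undef mask.
  undef.assign(undef.size(), false);
  return elts;
}
```

Under these assumptions, calling `foldShiftImm({0xDEADBEEF, 3}, undef, ShiftKind::Shl, 4)` with the first lane marked undef yields `{0, 48}` and leaves no lane marked undef, which mirrors the `<u,u,3,0>` to `[0,0,3,0]` change in the updated test expectations.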

This issue was reported to me by ispc against the 10.0 release. Unfortunately, their failing test does not fail on trunk, apparently because the shl is now optimized out earlier and never becomes VSHLI.

ispc bug https://github.com/ispc/ispc/issues/1771

Differential Revision: https://reviews.llvm.org/D81212

(cherry picked from commit 7c9a89fed8f5d53d61fe3a61a2581a7c28b1b6d2)

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86ISelLowering.cpp
    llvm/test/CodeGen/X86/vec_shift5.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 60eefbc677da..e360177687b1 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -39699,14 +39699,22 @@ static SDValue combineVectorShiftImm(SDNode *N, SelectionDAG &DAG,
       getTargetConstantBitsFromNode(N0, NumBitsPerElt, UndefElts, EltBits)) {
     assert(EltBits.size() == VT.getVectorNumElements() &&
            "Unexpected shift value type");
-    for (APInt &Elt : EltBits) {
-      if (X86ISD::VSHLI == Opcode)
+    // Undef elements need to fold to 0. It's possible SimplifyDemandedBits
+    // created an undef input due to no input bits being demanded, but user
+    // still expects 0 in other bits.
+    for (unsigned i = 0, e = EltBits.size(); i != e; ++i) {
+      APInt &Elt = EltBits[i];
+      if (UndefElts[i])
+        Elt = 0;
+      else if (X86ISD::VSHLI == Opcode)
         Elt <<= ShiftVal;
       else if (X86ISD::VSRAI == Opcode)
         Elt.ashrInPlace(ShiftVal);
       else
         Elt.lshrInPlace(ShiftVal);
     }
+    // Reset undef elements since they were zeroed above.
+    UndefElts = 0;
     return getConstVector(EltBits, UndefElts, VT.getSimpleVT(), DAG, SDLoc(N));
   }
 

diff  --git a/llvm/test/CodeGen/X86/vec_shift5.ll b/llvm/test/CodeGen/X86/vec_shift5.ll
index 873de4b08349..5c84d7c748f0 100644
--- a/llvm/test/CodeGen/X86/vec_shift5.ll
+++ b/llvm/test/CodeGen/X86/vec_shift5.ll
@@ -149,7 +149,7 @@ define <4 x i32> @test10() {
 define <2 x i64> @test11() {
 ; X32-LABEL: test11:
 ; X32:       # %bb.0:
-; X32-NEXT:    movaps {{.*#+}} xmm0 = <u,u,3,0>
+; X32-NEXT:    movaps {{.*#+}} xmm0 = [0,0,3,0]
 ; X32-NEXT:    retl
 ;
 ; X64-LABEL: test11:
@@ -219,7 +219,7 @@ define <4 x i32> @test15() {
 define <2 x i64> @test16() {
 ; X32-LABEL: test16:
 ; X32:       # %bb.0:
-; X32-NEXT:    movaps {{.*#+}} xmm0 = <u,u,248,0>
+; X32-NEXT:    movaps {{.*#+}} xmm0 = [0,0,248,0]
 ; X32-NEXT:    retl
 ;
 ; X64-LABEL: test16:


        

