[llvm] r291120 - [X86] Optimize vector shifts with variable but uniform shift amounts
Rackover, Zvi via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 10 07:03:16 PST 2017
Hi, thanks for reporting and sorry about the inconvenience.
Can you please confirm that r291535 by Craig fixes the problem?
Thanks, Zvi
-----Original Message-----
From: Yung, Douglas [mailto:douglas.yung at sony.com]
Sent: Tuesday, January 10, 2017 02:44
To: Rackover, Zvi <zvi.rackover at intel.com>
Cc: llvm-commits at lists.llvm.org
Subject: RE: [llvm] r291120 - [X86] Optimize vector shifts with variable but uniform shift amounts
Hi Zvi,
This commit caused one of our internal tests to start failing with the following error:
fatal error: error in backend: Cannot select: t57: v2i64 = zero_extend_vector_inreg t61
t61: v4i32 = bitcast t60
t60: v2i64,ch = load<LD16[%3](tbaa=<0x2504c38>)(dereferenceable)> t24, FrameIndex:i64<2>, undef:i64
t9: i64 = FrameIndex<2>
t2: i64 = undef
In function: _Z3foov
I have put a repro and details in PR31593. Can you please take a look at it?
Douglas Yung
> -----Original Message-----
> From: llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] On
> Behalf Of Zvi Rackover via llvm-commits
> Sent: Thursday, January 05, 2017 7:12
> To: llvm-commits at lists.llvm.org
> Subject: [llvm] r291120 - [X86] Optimize vector shifts with variable
> but uniform shift amounts
>
> Author: zvi
> Date: Thu Jan 5 09:11:43 2017
> New Revision: 291120
>
> URL: http://llvm.org/viewvc/llvm-project?rev=291120&view=rev
> Log:
> [X86] Optimize vector shifts with variable but uniform shift amounts
>
> Summary:
> For instructions such as PSLLW/PSLLD/PSLLQ a variable shift amount may
> be passed in an XMM register.
> The lower 64-bits of the register are evaluated to determine the shift
> amount.
> This patch improves the construction of the vector containing the
> shift amount.
>
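As an illustration of why the upper bits of the count matter (this sketch is not part of the commit): PSLLW with a register count reads the whole low 64 bits of that register, and a count above 15 zeroes every 16-bit lane. The scalar model below assumes exactly that behaviour. The names psllwModel, lowQword and zextLowWordsToQwords are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight 16-bit lanes standing in for one XMM register (lane 0 is lowest).
using V8i16 = std::array<std::uint16_t, 8>;

// Low 64 bits of a register, assembled from its four lowest 16-bit lanes.
std::uint64_t lowQword(const V8i16 &reg) {
  return static_cast<std::uint64_t>(reg[0]) |
         (static_cast<std::uint64_t>(reg[1]) << 16) |
         (static_cast<std::uint64_t>(reg[2]) << 32) |
         (static_cast<std::uint64_t>(reg[3]) << 48);
}

// Scalar model of PSLLW with a register count: every lane is shifted by the
// same amount, taken from the low 64 bits of `count`. Amounts above 15 give
// an all-zero result.
V8i16 psllwModel(const V8i16 &value, const V8i16 &count) {
  V8i16 result{};
  const std::uint64_t amount = lowQword(count);
  if (amount > 15)
    return result;
  for (std::size_t i = 0; i < value.size(); ++i)
    result[i] = static_cast<std::uint16_t>(value[i] << amount);
  return result;
}

// Model of a word-to-quadword zero extension in register: lanes 0 and 1 are
// each widened to 64 bits, so the low quadword holds only lane 0.
V8i16 zextLowWordsToQwords(const V8i16 &reg) {
  V8i16 out{};
  out[0] = reg[0];
  out[4] = reg[1];
  return out;
}
```

Passing a splatted amount such as {3, 3, 3, ...} straight to psllwModel zeroes the result, because the low quadword is 0x0003000300030003. Running it through zextLowWordsToQwords first yields the intended shift by 3.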
> Reviewers: craig.topper, delena, RKSimon
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D28353
>
> Modified:
> llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> llvm/trunk/test/CodeGen/X86/lower-vec-shift-2.ll
> llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-ashr-256.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-ashr-512.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-lshr-256.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-lshr-512.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-shl-256.ll
> llvm/trunk/test/CodeGen/X86/vector-shift-shl-512.ll
> llvm/trunk/test/CodeGen/X86/vshift-4.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu Jan 5 09:11:43 2017
> @@ -18306,27 +18306,33 @@ static SDValue getTargetVShiftNode(unsig
> case X86ISD::VSRAI: Opc = X86ISD::VSRA; break;
> }
>
> + // Need to build a vector containing shift amount.
> + // SSE/AVX packed shifts only use the lower 64-bit of the shift count.
> + // +=================+============+=======================================+
> + // | ShAmt is        | HasSSE4.1? | Construct ShAmt vector as             |
> + // +=================+============+=======================================+
> + // | i64             | Yes, No    | Use ShAmt as lowest elt               |
> + // | i32             | Yes        | zero-extend in-reg                    |
> + // | (i32 zext(i16)) | Yes        | zero-extend in-reg                    |
> + // | i16/i32         | No         | v4i32 build_vector(ShAmt, 0, ud, ud)) |
> + // +=================+============+=======================================+
> const X86Subtarget &Subtarget =
> static_cast<const X86Subtarget &>(DAG.getSubtarget());
> - if (Subtarget.hasSSE41() && ShAmt.getOpcode() == ISD::ZERO_EXTEND &&
> - ShAmt.getOperand(0).getSimpleValueType() == MVT::i16) {
> - // Let the shuffle legalizer expand this shift amount node.
> + if (SVT == MVT::i64)
> + ShAmt = DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(ShAmt), MVT::v2i64, ShAmt);
> + else if (Subtarget.hasSSE41() && ShAmt.getOpcode() == ISD::ZERO_EXTEND &&
> + ShAmt.getOperand(0).getSimpleValueType() == MVT::i16) {
> SDValue Op0 = ShAmt.getOperand(0);
> Op0 = DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(Op0), MVT::v8i16, Op0);
> - ShAmt = getShuffleVectorZeroOrUndef(Op0, 0, true, Subtarget, DAG);
> + ShAmt = DAG.getZeroExtendVectorInReg(Op0, SDLoc(Op0), MVT::v2i64);
> + } else if (Subtarget.hasSSE41() &&
> + ShAmt.getOpcode() == ISD::EXTRACT_VECTOR_ELT) {
> + ShAmt = DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(ShAmt), MVT::v4i32, ShAmt);
> + ShAmt = DAG.getZeroExtendVectorInReg(ShAmt, SDLoc(ShAmt), MVT::v2i64);
> } else {
> - // Need to build a vector containing shift amount.
> - // SSE/AVX packed shifts only use the lower 64-bit of the shift count.
> - SmallVector<SDValue, 4> ShOps;
> - ShOps.push_back(ShAmt);
> - if (SVT == MVT::i32) {
> - ShOps.push_back(DAG.getConstant(0, dl, SVT));
> - ShOps.push_back(DAG.getUNDEF(SVT));
> - }
> - ShOps.push_back(DAG.getUNDEF(SVT));
> -
> - MVT BVT = SVT == MVT::i32 ? MVT::v4i32 : MVT::v2i64;
> - ShAmt = DAG.getBuildVector(BVT, dl, ShOps);
> + SmallVector<SDValue, 4> ShOps = {ShAmt, DAG.getConstant(0, dl, SVT),
> + DAG.getUNDEF(SVT), DAG.getUNDEF(SVT)};
> + ShAmt = DAG.getBuildVector(MVT::v4i32, dl, ShOps);
> }
>
> // The return type has to be a 128-bit type with the same element
>
> Modified: llvm/trunk/test/CodeGen/X86/lower-vec-shift-2.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lower-vec-shift-2.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/lower-vec-shift-2.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/lower-vec-shift-2.ll Thu Jan 5 09:11:43 2017
> @@ -12,8 +12,7 @@ define <8 x i16> @test1(<8 x i16> %A, <8
> ;
> ; AVX-LABEL: test1:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
> @@ -32,8 +31,7 @@ define <4 x i32> @test2(<4 x i32> %A, <4
> ;
> ; AVX-LABEL: test2:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
> @@ -68,8 +66,7 @@ define <8 x i16> @test4(<8 x i16> %A, <8
> ;
> ; AVX-LABEL: test4:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
> @@ -88,8 +85,7 @@ define <4 x i32> @test5(<4 x i32> %A, <4
> ;
> ; AVX-LABEL: test5:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
> @@ -124,8 +120,7 @@ define <8 x i16> @test7(<8 x i16> %A, <8
> ;
> ; AVX-LABEL: test7:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
> @@ -144,8 +139,7 @@ define <4 x i32> @test8(<4 x i32> %A, <4
> ;
> ; AVX-LABEL: test8:
> ; AVX: # BB#0: # %entry
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> entry:
>
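To see why the new vpmovzxdq checks can stand in for the old vpxor + vpblendw pair as far as the shift count is concerned (an illustration, not taken from the commit): both leave the low 64 bits of xmm1 equal to the zero-extended lowest 32-bit element, and those low 64 bits are all the shift instruction reads. The helper names below are hypothetical and model the lane layouts printed in the CHECK lines above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes standing in for one XMM register (lane 0 is lowest).
using V4i32 = std::array<std::uint32_t, 4>;

// Old lowering: keep dword 0 of the count and take every other lane from a
// zeroed register (vpxor followed by vpblendw xmm1[0,1],xmm2[2..7]).
V4i32 blendLowDwordWithZero(const V4i32 &count) {
  V4i32 out{};
  out[0] = count[0];
  return out;
}

// New lowering: vpmovzxdq, i.e. xmm1[0],zero,xmm1[1],zero.
V4i32 pmovzxdqModel(const V4i32 &count) {
  V4i32 out{};
  out[0] = count[0];
  out[2] = count[1];
  return out;
}

// The only part of the count register a packed shift looks at.
std::uint64_t lowQwordOf(const V4i32 &reg) {
  return static_cast<std::uint64_t>(reg[0]) |
         (static_cast<std::uint64_t>(reg[1]) << 32);
}
```

The two models differ only in the upper quadword (lane 2), which the shift ignores, so the single vpmovzxdq is enough.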
> Modified: llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll Thu Jan 5 09:11:43 2017
> @@ -87,14 +87,12 @@ define <2 x i64> @var_rotate_v2i64(<2 x
> ; X32-SSE-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
> ; X32-SSE-NEXT: movdqa %xmm0, %xmm4
> ; X32-SSE-NEXT: psllq %xmm3, %xmm4
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: movdqa %xmm0, %xmm3
> ; X32-SSE-NEXT: psllq %xmm1, %xmm3
> ; X32-SSE-NEXT: movsd {{.*#+}} xmm4 = xmm3[0],xmm4[1]
> ; X32-SSE-NEXT: pshufd {{.*#+}} xmm3 = xmm2[2,3,0,1]
> ; X32-SSE-NEXT: movdqa %xmm0, %xmm1
> ; X32-SSE-NEXT: psrlq %xmm3, %xmm1
> -; X32-SSE-NEXT: movq {{.*#+}} xmm2 = xmm2[0],zero
> ; X32-SSE-NEXT: psrlq %xmm2, %xmm0
> ; X32-SSE-NEXT: movsd {{.*#+}} xmm1 = xmm0[0],xmm1[1]
> ; X32-SSE-NEXT: orpd %xmm4, %xmm1
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll Thu Jan 5 09:11:43 2017
> @@ -90,20 +90,19 @@ define <2 x i64> @var_shift_v2i64(<2 x i
> ;
> ; X32-SSE-LABEL: var_shift_v2i64:
> ; X32-SSE: # BB#0:
> -; X32-SSE-NEXT: pshufd {{.*#+}} xmm2 = xmm1[2,3,0,1]
> -; X32-SSE-NEXT: movdqa {{.*#+}} xmm3 = [0,2147483648,0,2147483648]
> -; X32-SSE-NEXT: movdqa %xmm3, %xmm4
> -; X32-SSE-NEXT: psrlq %xmm2, %xmm4
> -; X32-SSE-NEXT: movq {{.*#+}} xmm5 = xmm1[0],zero
> -; X32-SSE-NEXT: psrlq %xmm5, %xmm3
> -; X32-SSE-NEXT: movsd {{.*#+}} xmm4 = xmm3[0],xmm4[1]
> -; X32-SSE-NEXT: movdqa %xmm0, %xmm1
> -; X32-SSE-NEXT: psrlq %xmm2, %xmm1
> -; X32-SSE-NEXT: psrlq %xmm5, %xmm0
> -; X32-SSE-NEXT: movsd {{.*#+}} xmm1 = xmm0[0],xmm1[1]
> -; X32-SSE-NEXT: xorpd %xmm4, %xmm1
> -; X32-SSE-NEXT: psubq %xmm4, %xmm1
> -; X32-SSE-NEXT: movdqa %xmm1, %xmm0
> +; X32-SSE-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
> +; X32-SSE-NEXT: movdqa {{.*#+}} xmm2 = [0,2147483648,0,2147483648]
> +; X32-SSE-NEXT: movdqa %xmm2, %xmm4
> +; X32-SSE-NEXT: psrlq %xmm3, %xmm4
> +; X32-SSE-NEXT: psrlq %xmm1, %xmm2
> +; X32-SSE-NEXT: movsd {{.*#+}} xmm4 = xmm2[0],xmm4[1]
> +; X32-SSE-NEXT: movdqa %xmm0, %xmm2
> +; X32-SSE-NEXT: psrlq %xmm3, %xmm2
> +; X32-SSE-NEXT: psrlq %xmm1, %xmm0
> +; X32-SSE-NEXT: movsd {{.*#+}} xmm2 = xmm0[0],xmm2[1]
> +; X32-SSE-NEXT: xorpd %xmm4, %xmm2
> +; X32-SSE-NEXT: psubq %xmm4, %xmm2
> +; X32-SSE-NEXT: movdqa %xmm2, %xmm0
> ; X32-SSE-NEXT: retl
> %shift = ashr <2 x i64> %a, %b
> ret <2 x i64> %shift
> @@ -637,7 +636,6 @@ define <2 x i64> @splatvar_shift_v2i64(<
> ;
> ; X32-SSE-LABEL: splatvar_shift_v2i64:
> ; X32-SSE: # BB#0:
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: movdqa {{.*#+}} xmm2 = [0,2147483648,0,2147483648]
> ; X32-SSE-NEXT: psrlq %xmm1, %xmm2
> ; X32-SSE-NEXT: psrlq %xmm1, %xmm0
> @@ -659,29 +657,25 @@ define <4 x i32> @splatvar_shift_v4i32(<
> ;
> ; SSE41-LABEL: splatvar_shift_v4i32:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> -; SSE41-NEXT: psrad %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> +; SSE41-NEXT: psrad %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v4i32:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v4i32:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOP-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v4i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
> @@ -706,29 +700,25 @@ define <8 x i16> @splatvar_shift_v8i16(<
> ;
> ; SSE41-LABEL: splatvar_shift_v8i16:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> -; SSE41-NEXT: psraw %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> +; SSE41-NEXT: psraw %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v8i16:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v8i16:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOP-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-ashr-256.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-ashr-256.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-ashr-256.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-ashr-256.ll Thu Jan 5 09:11:43 2017
> @@ -426,9 +426,8 @@ define <4 x i64> @splatvar_shift_v4i64(<
> define <8 x i32> @splatvar_shift_v8i32(<8 x i32> %a, <8 x i32> %b) nounwind {
> ; AVX1-LABEL: splatvar_shift_v8i32:
> ; AVX1: # BB#0:
> -; AVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; AVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX1-NEXT: vpsrad %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -436,16 +435,14 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; AVX2-LABEL: splatvar_shift_v8i32:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX2-NEXT: vpsrad %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v8i32:
> ; XOPAVX1: # BB#0:
> -; XOPAVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; XOPAVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX1-NEXT: vpsrad %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpsrad %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -453,15 +450,13 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v8i32:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOPAVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX2-NEXT: vpsrad %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpsrad %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <8 x i32> %b, <8 x i32> undef, <8 x i32> zeroinitializer
> @@ -473,8 +468,7 @@ define <16 x i16> @splatvar_shift_v16i16
> ; AVX1-LABEL: splatvar_shift_v16i16:
> ; AVX1: # BB#0:
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; AVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX1-NEXT: vmovd %eax, %xmm1
> +; AVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX1-NEXT: vpsraw %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -482,16 +476,14 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; AVX2-LABEL: splatvar_shift_v16i16:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX2-NEXT: vmovd %eax, %xmm1
> +; AVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX2-NEXT: vpsraw %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v16i16:
> ; XOPAVX1: # BB#0:
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; XOPAVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX1-NEXT: vmovd %eax, %xmm1
> +; XOPAVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX1-NEXT: vpsraw %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpsraw %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -499,15 +491,13 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v16i16:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX2-NEXT: vmovd %eax, %xmm1
> +; XOPAVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX2-NEXT: vpsraw %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v16i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512-NEXT: vmovd %eax, %xmm1
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsraw %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <16 x i16> %b, <16 x i16> undef, <16 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-ashr-512.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-ashr-512.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-ashr-512.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-ashr-512.ll Thu Jan 5 09:11:43 2017
> @@ -525,8 +525,7 @@ define <8 x i64> @splatvar_shift_v8i64(<
> define <16 x i32> @splatvar_shift_v16i32(<16 x i32> %a, <16 x i32> %b) nounwind {
> ; ALL-LABEL: splatvar_shift_v16i32:
> ; ALL: ## BB#0:
> -; ALL-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; ALL-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; ALL-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; ALL-NEXT: vpsrad %xmm1, %zmm0, %zmm0
> ; ALL-NEXT: retq
> %splat = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
> @@ -537,16 +536,14 @@ define <16 x i32> @splatvar_shift_v16i32
> define <32 x i16> @splatvar_shift_v32i16(<32 x i16> %a, <32 x i16> %b) nounwind {
> ; AVX512DQ-LABEL: splatvar_shift_v32i16:
> ; AVX512DQ: ## BB#0:
> -; AVX512DQ-NEXT: vpextrw $0, %xmm2, %eax
> -; AVX512DQ-NEXT: vmovd %eax, %xmm2
> +; AVX512DQ-NEXT: vpmovzxwq {{.*#+}} xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero
> ; AVX512DQ-NEXT: vpsraw %xmm2, %ymm0, %ymm0
> ; AVX512DQ-NEXT: vpsraw %xmm2, %ymm1, %ymm1
> ; AVX512DQ-NEXT: retq
> ;
> ; AVX512BW-LABEL: splatvar_shift_v32i16:
> ; AVX512BW: ## BB#0:
> -; AVX512BW-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512BW-NEXT: vmovd %eax, %xmm1
> +; AVX512BW-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512BW-NEXT: vpsraw %xmm1, %zmm0, %zmm0
> ; AVX512BW-NEXT: retq
> %splat = shufflevector <32 x i16> %b, <32 x i16> undef, <32 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll Thu Jan 5 09:11:43 2017
> @@ -69,7 +69,6 @@ define <2 x i64> @var_shift_v2i64(<2 x i
> ; X32-SSE-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
> ; X32-SSE-NEXT: movdqa %xmm0, %xmm2
> ; X32-SSE-NEXT: psrlq %xmm3, %xmm2
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: psrlq %xmm1, %xmm0
> ; X32-SSE-NEXT: movsd {{.*#+}} xmm2 = xmm0[0],xmm2[1]
> ; X32-SSE-NEXT: movapd %xmm2, %xmm0
> @@ -493,7 +492,6 @@ define <2 x i64> @splatvar_shift_v2i64(<
> ;
> ; X32-SSE-LABEL: splatvar_shift_v2i64:
> ; X32-SSE: # BB#0:
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: psrlq %xmm1, %xmm0
> ; X32-SSE-NEXT: retl
> %splat = shufflevector <2 x i64> %b, <2 x i64> undef, <2 x i32> zeroinitializer
> @@ -511,29 +509,25 @@ define <4 x i32> @splatvar_shift_v4i32(<
> ;
> ; SSE41-LABEL: splatvar_shift_v4i32:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> -; SSE41-NEXT: psrld %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> +; SSE41-NEXT: psrld %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v4i32:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v4i32:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOP-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v4i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
> @@ -558,29 +552,25 @@ define <8 x i16> @splatvar_shift_v8i16(<
> ;
> ; SSE41-LABEL: splatvar_shift_v8i16:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> -; SSE41-NEXT: psrlw %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> +; SSE41-NEXT: psrlw %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v8i16:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v8i16:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOP-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-lshr-256.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-lshr-256.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-lshr-256.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-lshr-256.ll Thu Jan 5 09:11:43 2017
> @@ -337,9 +337,8 @@ define <4 x i64> @splatvar_shift_v4i64(<
> define <8 x i32> @splatvar_shift_v8i32(<8 x i32> %a, <8 x i32> %b) nounwind {
> ; AVX1-LABEL: splatvar_shift_v8i32:
> ; AVX1: # BB#0:
> -; AVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; AVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX1-NEXT: vpsrld %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -347,16 +346,14 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; AVX2-LABEL: splatvar_shift_v8i32:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX2-NEXT: vpsrld %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v8i32:
> ; XOPAVX1: # BB#0:
> -; XOPAVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; XOPAVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX1-NEXT: vpsrld %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpsrld %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -364,15 +361,13 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v8i32:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOPAVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX2-NEXT: vpsrld %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpsrld %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <8 x i32> %b, <8 x i32> undef, <8 x i32> zeroinitializer
> @@ -384,8 +379,7 @@ define <16 x i16> @splatvar_shift_v16i16
> ; AVX1-LABEL: splatvar_shift_v16i16:
> ; AVX1: # BB#0:
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; AVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX1-NEXT: vmovd %eax, %xmm1
> +; AVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX1-NEXT: vpsrlw %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -393,16 +387,14 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; AVX2-LABEL: splatvar_shift_v16i16:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX2-NEXT: vmovd %eax, %xmm1
> +; AVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX2-NEXT: vpsrlw %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v16i16:
> ; XOPAVX1: # BB#0:
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; XOPAVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX1-NEXT: vmovd %eax, %xmm1
> +; XOPAVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX1-NEXT: vpsrlw %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpsrlw %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -410,15 +402,13 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v16i16:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX2-NEXT: vmovd %eax, %xmm1
> +; XOPAVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX2-NEXT: vpsrlw %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v16i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512-NEXT: vmovd %eax, %xmm1
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsrlw %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <16 x i16> %b, <16 x i16> undef, <16 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-lshr-512.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-lshr-512.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-lshr-512.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-lshr-512.ll Thu Jan 5 09:11:43 2017
> @@ -505,8 +505,7 @@ define <8 x i64> @splatvar_shift_v8i64(<
> define <16 x i32> @splatvar_shift_v16i32(<16 x i32> %a, <16 x i32> %b) nounwind {
> ; ALL-LABEL: splatvar_shift_v16i32:
> ; ALL: ## BB#0:
> -; ALL-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; ALL-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; ALL-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; ALL-NEXT: vpsrld %xmm1, %zmm0, %zmm0
> ; ALL-NEXT: retq
> %splat = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
> @@ -517,16 +516,14 @@ define <16 x i32> @splatvar_shift_v16i32
> define <32 x i16> @splatvar_shift_v32i16(<32 x i16> %a, <32 x i16> %b) nounwind {
> ; AVX512DQ-LABEL: splatvar_shift_v32i16:
> ; AVX512DQ: ## BB#0:
> -; AVX512DQ-NEXT: vpextrw $0, %xmm2, %eax
> -; AVX512DQ-NEXT: vmovd %eax, %xmm2
> +; AVX512DQ-NEXT: vpmovzxwq {{.*#+}} xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero
> ; AVX512DQ-NEXT: vpsrlw %xmm2, %ymm0, %ymm0
> ; AVX512DQ-NEXT: vpsrlw %xmm2, %ymm1, %ymm1
> ; AVX512DQ-NEXT: retq
> ;
> ; AVX512BW-LABEL: splatvar_shift_v32i16:
> ; AVX512BW: ## BB#0:
> -; AVX512BW-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512BW-NEXT: vmovd %eax, %xmm1
> +; AVX512BW-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512BW-NEXT: vpsrlw %xmm1, %zmm0, %zmm0
> ; AVX512BW-NEXT: retq
> %splat = shufflevector <32 x i16> %b, <32 x i16> undef, <32 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll Thu Jan 5 09:11:43 2017
> @@ -67,7 +67,6 @@ define <2 x i64> @var_shift_v2i64(<2 x i
> ; X32-SSE-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
> ; X32-SSE-NEXT: movdqa %xmm0, %xmm2
> ; X32-SSE-NEXT: psllq %xmm3, %xmm2
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: psllq %xmm1, %xmm0
> ; X32-SSE-NEXT: movsd {{.*#+}} xmm2 = xmm0[0],xmm2[1]
> ; X32-SSE-NEXT: movapd %xmm2, %xmm0
> @@ -441,7 +440,6 @@ define <2 x i64> @splatvar_shift_v2i64(<
> ;
> ; X32-SSE-LABEL: splatvar_shift_v2i64:
> ; X32-SSE: # BB#0:
> -; X32-SSE-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-SSE-NEXT: psllq %xmm1, %xmm0
> ; X32-SSE-NEXT: retl
> %splat = shufflevector <2 x i64> %b, <2 x i64> undef, <2 x i32> zeroinitializer
> @@ -459,29 +457,25 @@ define <4 x i32> @splatvar_shift_v4i32(<
> ;
> ; SSE41-LABEL: splatvar_shift_v4i32:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> -; SSE41-NEXT: pslld %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> +; SSE41-NEXT: pslld %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v4i32:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v4i32:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOP-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v4i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
> @@ -506,29 +500,25 @@ define <8 x i16> @splatvar_shift_v8i16(<
> ;
> ; SSE41-LABEL: splatvar_shift_v8i16:
> ; SSE41: # BB#0:
> -; SSE41-NEXT: pxor %xmm2, %xmm2
> -; SSE41-NEXT: pblendw {{.*#+}} xmm2 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> -; SSE41-NEXT: psllw %xmm2, %xmm0
> +; SSE41-NEXT: pmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> +; SSE41-NEXT: psllw %xmm1, %xmm0
> ; SSE41-NEXT: retq
> ;
> ; AVX-LABEL: splatvar_shift_v8i16:
> ; AVX: # BB#0:
> -; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; AVX-NEXT: retq
> ;
> ; XOP-LABEL: splatvar_shift_v8i16:
> ; XOP: # BB#0:
> -; XOP-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOP-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; XOP-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOP-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; XOP-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3,4,5,6,7]
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; AVX512-NEXT: retq
> ;
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-shl-256.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-shl-256.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-shl-256.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-shl-256.ll Thu Jan 5 09:11:43 2017
> @@ -301,9 +301,8 @@ define <4 x i64> @splatvar_shift_v4i64(<
> define <8 x i32> @splatvar_shift_v8i32(<8 x i32> %a, <8 x i32> %b) nounwind {
> ; AVX1-LABEL: splatvar_shift_v8i32:
> ; AVX1: # BB#0:
> -; AVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; AVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX1-NEXT: vpslld %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -311,16 +310,14 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; AVX2-LABEL: splatvar_shift_v8i32:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; AVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX2-NEXT: vpslld %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v8i32:
> ; XOPAVX1: # BB#0:
> -; XOPAVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX1-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> +; XOPAVX1-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX1-NEXT: vpslld %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpslld %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -328,15 +325,13 @@ define <8 x i32> @splatvar_shift_v8i32(<
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v8i32:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpxor %xmm2, %xmm2, %xmm2
> -; XOPAVX2-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
> +; XOPAVX2-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; XOPAVX2-NEXT: vpslld %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v8i32:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; AVX512-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; AVX512-NEXT: vpslld %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <8 x i32> %b, <8 x i32> undef, <8 x i32> zeroinitializer
> @@ -348,8 +343,7 @@ define <16 x i16> @splatvar_shift_v16i16
> ; AVX1-LABEL: splatvar_shift_v16i16:
> ; AVX1: # BB#0:
> ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; AVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX1-NEXT: vmovd %eax, %xmm1
> +; AVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX1-NEXT: vpsllw %xmm1, %xmm2, %xmm2
> ; AVX1-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -357,16 +351,14 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; AVX2-LABEL: splatvar_shift_v16i16:
> ; AVX2: # BB#0:
> -; AVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX2-NEXT: vmovd %eax, %xmm1
> +; AVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX2-NEXT: vpsllw %xmm1, %ymm0, %ymm0
> ; AVX2-NEXT: retq
> ;
> ; XOPAVX1-LABEL: splatvar_shift_v16i16:
> ; XOPAVX1: # BB#0:
> ; XOPAVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
> -; XOPAVX1-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX1-NEXT: vmovd %eax, %xmm1
> +; XOPAVX1-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX1-NEXT: vpsllw %xmm1, %xmm2, %xmm2
> ; XOPAVX1-NEXT: vpsllw %xmm1, %xmm0, %xmm0
> ; XOPAVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
> @@ -374,15 +366,13 @@ define <16 x i16> @splatvar_shift_v16i16
> ;
> ; XOPAVX2-LABEL: splatvar_shift_v16i16:
> ; XOPAVX2: # BB#0:
> -; XOPAVX2-NEXT: vpextrw $0, %xmm1, %eax
> -; XOPAVX2-NEXT: vmovd %eax, %xmm1
> +; XOPAVX2-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; XOPAVX2-NEXT: vpsllw %xmm1, %ymm0, %ymm0
> ; XOPAVX2-NEXT: retq
> ;
> ; AVX512-LABEL: splatvar_shift_v16i16:
> ; AVX512: ## BB#0:
> -; AVX512-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512-NEXT: vmovd %eax, %xmm1
> +; AVX512-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512-NEXT: vpsllw %xmm1, %ymm0, %ymm0
> ; AVX512-NEXT: retq
> %splat = shufflevector <16 x i16> %b, <16 x i16> undef, <16 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vector-shift-shl-512.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shift-shl-512.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vector-shift-shl-512.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vector-shift-shl-512.ll Thu Jan 5 09:11:43 2017
> @@ -502,8 +502,7 @@ define <8 x i64> @splatvar_shift_v8i64(<
> define <16 x i32> @splatvar_shift_v16i32(<16 x i32> %a, <16 x i32> %b) nounwind {
> ; ALL-LABEL: splatvar_shift_v16i32:
> ; ALL: ## BB#0:
> -; ALL-NEXT: vxorps %xmm2, %xmm2, %xmm2
> -; ALL-NEXT: vmovss {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
> +; ALL-NEXT: vpmovzxdq {{.*#+}} xmm1 = xmm1[0],zero,xmm1[1],zero
> ; ALL-NEXT: vpslld %xmm1, %zmm0, %zmm0
> ; ALL-NEXT: retq
> %splat = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
> @@ -514,16 +513,14 @@ define <16 x i32> @splatvar_shift_v16i32
> define <32 x i16> @splatvar_shift_v32i16(<32 x i16> %a, <32 x i16> %b) nounwind {
> ; AVX512DQ-LABEL: splatvar_shift_v32i16:
> ; AVX512DQ: ## BB#0:
> -; AVX512DQ-NEXT: vpextrw $0, %xmm2, %eax
> -; AVX512DQ-NEXT: vmovd %eax, %xmm2
> +; AVX512DQ-NEXT: vpmovzxwq {{.*#+}} xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero
> ; AVX512DQ-NEXT: vpsllw %xmm2, %ymm0, %ymm0
> ; AVX512DQ-NEXT: vpsllw %xmm2, %ymm1, %ymm1
> ; AVX512DQ-NEXT: retq
> ;
> ; AVX512BW-LABEL: splatvar_shift_v32i16:
> ; AVX512BW: ## BB#0:
> -; AVX512BW-NEXT: vpextrw $0, %xmm1, %eax
> -; AVX512BW-NEXT: vmovd %eax, %xmm1
> +; AVX512BW-NEXT: vpmovzxwq {{.*#+}} xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero
> ; AVX512BW-NEXT: vpsllw %xmm1, %zmm0, %zmm0
> ; AVX512BW-NEXT: retq
> %splat = shufflevector <32 x i16> %b, <32 x i16> undef, <32 x i32> zeroinitializer
>
> Modified: llvm/trunk/test/CodeGen/X86/vshift-4.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vshift-4.ll?rev=291120&r1=291119&r2=291120&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/vshift-4.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/vshift-4.ll Thu Jan 5 09:11:43 2017
> @@ -9,7 +9,6 @@ define void @shift1a(<2 x i64> %val, <2
> ; X32-LABEL: shift1a:
> ; X32: # BB#0: # %entry
> ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
> -; X32-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-NEXT: psllq %xmm1, %xmm0
> ; X32-NEXT: movdqa %xmm0, (%eax)
> ; X32-NEXT: retl
> @@ -34,7 +33,6 @@ define void @shift1b(<2 x i64> %val, <2
> ; X32-NEXT: pshufd {{.*#+}} xmm2 = xmm1[2,3,0,1]
> ; X32-NEXT: movdqa %xmm0, %xmm3
> ; X32-NEXT: psllq %xmm2, %xmm3
> -; X32-NEXT: movq {{.*#+}} xmm1 = xmm1[0],zero
> ; X32-NEXT: psllq %xmm1, %xmm0
> ; X32-NEXT: movsd {{.*#+}} xmm3 = xmm0[0],xmm3[1]
> ; X32-NEXT: movapd %xmm3, (%eax)
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits