[PATCH] [X86][SSE] Vectorized i64 uniform constant SRA shifts
Elena Demikhovsky
elena.demikhovsky at intel.com
Sun May 10 06:50:58 PDT 2015
REPOSITORY
rL LLVM
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16552
@@ -16524,2 +16551,3 @@
// Special case in 32-bit mode, where i64 is expanded into high and low parts.
- if (!Subtarget->is64Bit() && VT == MVT::v2i64 &&
+ if (!Subtarget->is64Bit() && VT == MVT::v2i64 &&
+ (Op.getOpcode() != ISD::SRA || Subtarget->hasAVX512()) &&
----------------
I think that I fixed a bug here and removed AVX512. Could you please check?
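For reference, a minimal sketch of how the guarded condition reads with the new clause (the remaining checks on the shift amount and the body are elided; the rationale comment is mine, not from the patch):

  // On 32-bit targets a scalar i64 shift amount is expanded into two i32
  // halves, so this splat path is only safe for SRA when a true 64-bit
  // arithmetic vector shift exists: SSE2/AVX have no PSRAQ, and VPSRAQ
  // only arrives with AVX512.
  if (!Subtarget->is64Bit() && VT == MVT::v2i64 &&
      (Op.getOpcode() != ISD::SRA || Subtarget->hasAVX512())
      /* && ...remaining checks on the shift amount... */) {
    // ...splat the i64 shift amount from its two i32 halves...
  }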
================
Comment at: test/CodeGen/X86/vector-sext.ll:111
@@ -122,17 +110,3 @@
; SSE41: # BB#0: # %entry
-; SSE41-NEXT: pmovzxdq %xmm0, %xmm1
-; SSE41-NEXT: pextrq $1, %xmm1, %rax
-; SSE41-NEXT: cltq
-; SSE41-NEXT: movd %rax, %xmm3
-; SSE41-NEXT: movd %xmm1, %rax
-; SSE41-NEXT: cltq
-; SSE41-NEXT: movd %rax, %xmm2
-; SSE41-NEXT: punpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0]
-; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
-; SSE41-NEXT: pextrq $1, %xmm0, %rax
-; SSE41-NEXT: cltq
-; SSE41-NEXT: movd %rax, %xmm3
-; SSE41-NEXT: movd %xmm0, %rax
-; SSE41-NEXT: cltq
-; SSE41-NEXT: movd %rax, %xmm1
-; SSE41-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm3[0]
+; SSE41-NEXT: pmovzxdq {{.*#+}} xmm2 = xmm0[0],zero,xmm0[1],zero
+; SSE41-NEXT: psllq $32, %xmm2
----------------
I see this code as two pmovsxdq instructions and one shuffle between them. For Windows, where the vector argument is passed by pointer in %rcx, it looks like:
pmovsxdq (%rcx), %xmm0
pmovsxdq 8(%rcx), %xmm1
retq
For Linux your parameter is already in an xmm register, so you need one shuffle with <2, 3, undef, undef> to bring the high half down.
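A minimal sketch of that Linux-side sequence, written with SSE4.1 intrinsics rather than the DAG nodes the backend would build (the function and variable names here are illustrative, not from the patch):

  #include <smmintrin.h>

  // Sign-extend v4i32 -> v4i64: pmovsxdq on the low half, one pshufd to
  // move lanes <2, 3> into the low positions, then pmovsxdq again.
  static void sext_v4i32_to_v4i64(__m128i v, __m128i &lo, __m128i &hi) {
    lo = _mm_cvtepi32_epi64(v);                                  // pmovsxdq
    __m128i top = _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 2, 3, 2)); // <2,3,2,3>
    hi = _mm_cvtepi32_epi64(top);                                // pmovsxdq
  }

That matches the two-pmovsxdq-plus-one-shuffle shape described above.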
http://reviews.llvm.org/D9645