[PATCH] [X86][SSE] Vectorized i64 uniform constant SRA shifts

Elena Demikhovsky elena.demikhovsky at intel.com
Sun May 10 06:50:58 PDT 2015


REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16552
@@ -16524,2 +16551,3 @@
   // Special case in 32-bit mode, where i64 is expanded into high and low parts.
-  if (!Subtarget->is64Bit() && VT == MVT::v2i64  &&
+  if (!Subtarget->is64Bit() && VT == MVT::v2i64 &&
+      (Op.getOpcode() != ISD::SRA || Subtarget->hasAVX512()) &&
----------------
I think I fixed a bug here and removed the AVX512 special case. Could you please check?
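
For background (my illustration, not part of the patch): pre-AVX512 SSE/AVX has no 64-bit arithmetic right shift (no psraq), which is why SRA needs separate treatment in this condition. A uniform-constant i64 SRA can be emulated with a logical shift plus a sign-bit fixup; a minimal C++ intrinsics sketch of that identity (the helper name sra_epi64_const is mine):

    #include <emmintrin.h>

    // Emulate a per-lane arithmetic right shift by a uniform constant c
    // (0 <= c <= 63) on v2i64: sra(x, c) == ((x >>u c) ^ m) - m, where
    // m = 1 << (63 - c) marks where the sign bit lands after the shift.
    static inline __m128i sra_epi64_const(__m128i x, int c) {
      __m128i m = _mm_set1_epi64x(1ULL << (63 - c));
      __m128i r = _mm_srl_epi64(x, _mm_cvtsi32_si128(c)); // psrlq (logical)
      return _mm_sub_epi64(_mm_xor_si128(r, m), m);       // sign fixup
    }

Subtracting m borrows into all bits above position 63 - c exactly when the original sign bit was set, which reproduces the sign extension.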

================
Comment at: test/CodeGen/X86/vector-sext.ll:111
@@ -122,17 +110,3 @@
 ; SSE41:       # BB#0: # %entry
-; SSE41-NEXT:    pmovzxdq %xmm0, %xmm1
-; SSE41-NEXT:    pextrq $1, %xmm1, %rax
-; SSE41-NEXT:    cltq
-; SSE41-NEXT:    movd %rax, %xmm3
-; SSE41-NEXT:    movd %xmm1, %rax
-; SSE41-NEXT:    cltq
-; SSE41-NEXT:    movd %rax, %xmm2
-; SSE41-NEXT:    punpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0]
-; SSE41-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
-; SSE41-NEXT:    pextrq $1, %xmm0, %rax
-; SSE41-NEXT:    cltq
-; SSE41-NEXT:    movd %rax, %xmm3
-; SSE41-NEXT:    movd %xmm0, %rax
-; SSE41-NEXT:    cltq
-; SSE41-NEXT:    movd %rax, %xmm1
-; SSE41-NEXT:    punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm3[0]
+; SSE41-NEXT:    pmovzxdq {{.*#+}} xmm2 = xmm0[0],zero,xmm0[1],zero
+; SSE41-NEXT:    psllq $32, %xmm2
----------------
I would expect this code to be two pmovsxdq instructions with one shuffle between them. On Windows, it looks like:
        pmovsxdq        (%rcx), %xmm0
        pmovsxdq        8(%rcx), %xmm1
        retq

On Linux, your parameter is in an xmm register, so you need one shuffle with <2, 3, undef, undef>.
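
Roughly (my sketch, illustrative only; register allocation will differ), something like:

        pmovsxdq %xmm0, %xmm2           # sign-extend elements 0,1
        pshufd $0xee, %xmm0, %xmm1      # xmm1 = xmm0[2,3,2,3]
        pmovsxdq %xmm1, %xmm1           # sign-extend elements 2,3
        movdqa %xmm2, %xmm0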

http://reviews.llvm.org/D9645
