[PATCH] D11439: [X86][SSE] Vectorize i64 ASHR operations
Quentin Colombet
qcolombet at apple.com
Tue Jul 28 09:35:23 PDT 2015
qcolombet added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:17457
@@ +17456,3 @@
+ // M = SIGN_BIT u>> A
+ // R s>> a === ((R u>> A) ^ M) - M
+ if ((VT == MVT::v2i64 || (VT == MVT::v4i64 && Subtarget->hasInt256())) &&
----------------
It wasn’t immediately clear to me that s>> and u>> referred to signed and unsigned shifts.
Use lshr and ashr instead, as in LLVM IR (or the SD node name variants if you prefer).
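As a side note for other readers: the identity can be sanity-checked with a small scalar sketch in C++ (the function name and test values below are mine, not from the patch):

  #include <cassert>
  #include <cstdint>

  // Emulate a 64-bit arithmetic shift right using only logical shifts,
  // following the identity from the comment above:
  //   M = SIGN_BIT lshr A
  //   R ashr A == ((R lshr A) ^ M) - M
  uint64_t AshrViaLshr(uint64_t R, unsigned A) {
    uint64_t M = (UINT64_C(1) << 63) >> A; // sign bit, logically shifted down
    return ((R >> A) ^ M) - M;             // xor/subtract sign-extends the result
  }

  int main() {
    // Negative input: -16 ashr 4 == -1.
    assert(AshrViaLshr(UINT64_C(0xFFFFFFFFFFFFFFF0), 4) ==
           UINT64_C(0xFFFFFFFFFFFFFFFF));
    // Positive input behaves like a plain logical shift.
    assert(AshrViaLshr(UINT64_C(0x7FFFFFFFFFFFFFFF), 7) ==
           (UINT64_C(0x7FFFFFFFFFFFFFFF) >> 7));
  }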
================
Comment at: test/CodeGen/X86/vector-shift-ashr-128.ll:27
@@ -27,1 +26,3 @@
+; SSE2-NEXT: xorpd %xmm4, %xmm2
+; SSE2-NEXT: psubq %xmm4, %xmm2
; SSE2-NEXT: movdqa %xmm2, %xmm0
----------------
Is this sequence actually better?
I assume the GPR-to-vector and vector-to-GPR copies are expensive enough that it is.
Just double-checking.
================
Comment at: test/CodeGen/X86/vector-shift-ashr-128.ll:44
@@ -41,1 +43,3 @@
+; SSE41-NEXT: pxor %xmm2, %xmm0
+; SSE41-NEXT: psubq %xmm2, %xmm0
; SSE41-NEXT: retq
----------------
Same question here (this time I assume the win is avoiding pextr).
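For context, the functions in vector-shift-ashr-128.ll boil down to per-element variable shifts along these lines (a minimal sketch; the actual test may use different function names):

  define <2 x i64> @var_shift_v2i64(<2 x i64> %a, <2 x i64> %b) {
    %shift = ashr <2 x i64> %a, %b
    ret <2 x i64> %shift
  }

so the question is just which lowering of this single ashr is cheaper on each subtarget.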
Repository:
rL LLVM
http://reviews.llvm.org/D11439