[PATCH] [X86][SSE] Avoid scalarization of v2i64 vector shifts

Wed Mar 18 09:14:41 PDT 2015

Hi Simon,

Please see my comments below.


REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16194
@@ +16193,3 @@
+  // per-lane and then shuffle the partial results back together.
+  if (VT == MVT::v2i64) {
+    // Splat the shift amounts so the scalar shifts above will catch it.
----------------
This would generate worse code if `Op.getOpcode() == ISD::SRA`.
Please check that the opcode is not `ISD::SRA` before taking this path; otherwise, you would end up scalarizing two shifts instead of one.
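
To illustrate the concern, here is a minimal standalone sketch of the per-lane split with the suggested opcode guard. It is a model only, not the actual lowering code: `ShiftOpcode`, `V2i64`, `uniformShift`, `shouldSplitV2i64Shift` and `splitShift` are hypothetical names that do not exist in LLVM. It assumes (as I understand SSE2) that a 64-bit logical shift with one count for both lanes is available, while a 64-bit arithmetic shift is not.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the ISD shift opcodes; not the real LLVM enum.
enum class ShiftOpcode { SHL, SRL, SRA };

// Plain model of a v2i64 value: two 64-bit lanes.
using V2i64 = std::array<std::uint64_t, 2>;

// Models a vector shift that applies one amount to both lanes, i.e. the
// splatted case that a single psllq/psrlq can handle. SRA is rejected
// because this sketch assumes no 64-bit arithmetic vector shift exists.
V2i64 uniformShift(ShiftOpcode Opc, V2i64 V, unsigned Amt) {
  assert(Amt < 64 && "shift amount out of range for a 64-bit lane");
  assert(Opc != ShiftOpcode::SRA && "no uniform 64-bit arithmetic shift modeled");
  for (auto &Lane : V)
    Lane = (Opc == ShiftOpcode::SHL) ? (Lane << Amt) : (Lane >> Amt);
  return V;
}

// The guard suggested in the review: only split logical shifts.
bool shouldSplitV2i64Shift(ShiftOpcode Opc) {
  return Opc != ShiftOpcode::SRA;
}

// Per-lane split: shift the whole vector once per lane amount, then
// "shuffle" lane 0 of the first result together with lane 1 of the second.
V2i64 splitShift(ShiftOpcode Opc, V2i64 V, std::array<unsigned, 2> Amt) {
  assert(shouldSplitV2i64Shift(Opc) && "SRA must not take the split path");
  V2i64 R0 = uniformShift(Opc, V, Amt[0]);
  V2i64 R1 = uniformShift(Opc, V, Amt[1]);
  return {R0[0], R1[1]};
}
```

With the shift amounts 8 and 1, a call such as `splitShift(ShiftOpcode::SRL, {256, 6}, {8, 1})` performs two uniform shifts and one recombination. For `ShiftOpcode::SRA` the guard returns false, so the caller would fall through to whatever lowering existed before the patch.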

================
Comment at: test/CodeGen/X86/x86-shifts.ll:122-123
@@ -121,4 +121,4 @@
 ; CHECK: shr2_nosplat
-; CHECK-NOT:  psrlq
-; CHECK-NOT:  psrlq
-; CHECK:      ret
+; CHECK: psrlq
+; CHECK: psrlq
+; CHECK: ret
----------------
Could you please add a check for the shift count?
Something like:
  CHECK-DAG: psrlq $8
  CHECK-DAG: psrlq $1

It would be nice to also have checks for the two extra `punpcklqdq` shuffles that would be generated by your patch.

http://reviews.llvm.org/D8416
