[PATCH] [X86][SSE] Avoid scalarization of v2i64 vector shifts
Simon Pilgrim
llvm-dev at redking.me.uk
Wed Mar 18 09:42:33 PDT 2015
Thanks Andrea, I'll upload a new version of the patch later today.
REPOSITORY
rL LLVM
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16194
@@ +16193,3 @@
+ // per-lane and then shuffle the partial results back together.
+ if (VT == MVT::v2i64) {
+ // Splat the shift amounts so the scalar shifts above will catch it.
----------------
andreadb wrote:
> This would generate worse code if `Op.getOpcode() == ISD::SRA`.
> You should check that the opcode is not ISD::SRA. Otherwise, you would end up scalarizing two shifts.
Yes - i64 SRA isn't currently supported (and doesn't go through LowerShift at all at the moment), but I will add the check. Initial tests indicate that SRA would be faster for constant shifts (and on all AVX2 implementations) - but, as I said, I'll deal with SRA properly in a future patch.
================
Comment at: test/CodeGen/X86/x86-shifts.ll:122-123
@@ -121,4 +121,4 @@
; CHECK: shr2_nosplat
-; CHECK-NOT: psrlq
-; CHECK-NOT: psrlq
-; CHECK: ret
+; CHECK: psrlq
+; CHECK: psrlq
+; CHECK: ret
----------------
andreadb wrote:
> Could you please add a check for the shift count?
> Something like
> CHECK-DAG: psrlq $8
> CHECK-DAG: psrlq $1
>
> It would be nice to also have checks for the two extra punpcklqdq shuffles that would be generated by your patch.
Yes - I'll add more complete CHECK lines for all my changes.
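For reference, the tightened test along the lines Andrea suggests might look like the following FileCheck sketch (the exact directives are my guess at the updated test, not the committed version; CHECK-DAG is used because the two psrlq instructions may be emitted in either order):

```llvm
; CHECK-LABEL: shr2_nosplat
; CHECK-DAG: psrlq $8
; CHECK-DAG: psrlq $1
; CHECK: punpcklqdq
; CHECK: ret
```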
http://reviews.llvm.org/D8416