[PATCH] [X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2

Sun Feb 15 04:14:29 PST 2015

Thanks Craig, I've fixed the minor issues - I'll put up a new patch shortly.

Chandler has made some changes to computeZeroableShuffleElements() that mean the AVX1 versions now see the zeroable lanes and use vpslldq/vpsrldq (the xmm versions), so that looks like it's sorted too.

REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7837
@@ +7836,3 @@
+    V = DAG.getNode(Op, DL, ShiftVT, V,
+                    DAG.getConstant(ByteShift * 8, MVT::i8));
+    return DAG.getNode(ISD::BITCAST, DL, VT, V);
----------------
craig.topper wrote:
> Can we avoid the multiply by 8 and just use the Shift value directly? Of course the patterns would also need to be fixed to just use the immediate directly instead of BYTE_imm.
Yup, we will be able to remove this multiply by 8 soon. Once this patch is in, we can remove the AVX2 builtins for vpslldq/vpsrldq (similar to what happened in rL228481 for the SSE2 versions). At that point all the logic will be internal and we can replace the 'bit shift' immediate with a 'byte shift' version.
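
For anyone following along, here is a rough standalone sketch of the semantics being discussed. It is not LLVM code: Vec256, byteShiftLeftPerLane, bitImmFromByteShift and byteShiftFromBitImm are hypothetical names made up for illustration. It assumes byte index 0 is the least significant byte, and models the ymm form of vpslldq as an independent byte shift within each 128-bit lane, with the current 'bit shift' immediate being the byte count times 8.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 256-bit register as 32 bytes (index 0 = lowest byte).
using Vec256 = std::array<std::uint8_t, 32>;

// Emulates a vpslldq-style shift: each 128-bit lane is shifted left by
// ByteShift bytes on its own, zeros are shifted in, and nothing crosses
// the lane boundary. A shift of more than 15 bytes clears the whole vector.
Vec256 byteShiftLeftPerLane(const Vec256 &In, unsigned ByteShift) {
  Vec256 Out{}; // value-initialized, so every byte starts as zero
  if (ByteShift > 15)
    return Out;
  for (std::size_t Lane = 0; Lane < 2; ++Lane)
    for (std::size_t I = ByteShift; I < 16; ++I)
      Out[Lane * 16 + I] = In[Lane * 16 + I - ByteShift];
  return Out;
}

// The 'bit shift' immediate currently passed around: byte count times 8.
inline unsigned bitImmFromByteShift(unsigned ByteShift) {
  return ByteShift * 8;
}

// The inverse conversion, which the 'byte shift' form would make unnecessary.
inline unsigned byteShiftFromBitImm(unsigned BitImm) {
  assert(BitImm % 8 == 0 && "bit immediate must be a whole number of bytes");
  return BitImm / 8;
}
```

With a 4-byte shift, bytes 0-3 of each lane become zero and the bit immediate is 32; once the immediate is a byte count everywhere, both conversion helpers go away.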

http://reviews.llvm.org/D7596
