[PATCH] D19228: [X86][AVX2] Prefer VPERMQ/VPERMPD over VPERM2I128/VPERM2F128 for unary shuffles
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 18 11:08:14 PDT 2016
spatel added a comment.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10584
@@ -10581,3 +10583,3 @@
- // If either input operand is a zero vector, use VPERM2X128 because its mask
- // allows us to replace the zero input with an implicit zero.
+ // With AVX2 we should use VPERMQ/VPERMPD to allow memory folding.
+ if (Subtarget.hasAVX2() && isSingleInputShuffleMask(Mask) && !IsV1Zero)
----------------
I may have missed it, but the only advantage shown in the test changes is that we get to use an instruction with a single input operand. Could you add a test that shows the load-folding win?
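For reference, here is a minimal scalar sketch of why the swap is legal when the mask is single-input and no lane is zeroed. It is not code from the patch: the functions vpermq, vperm2x128 and unaryLaneImmToPermqImm are hypothetical names that only model the documented immediate encodings of the two instructions on four 64-bit elements.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of a 256-bit vector as four 64-bit elements.
using V4 = std::array<std::uint64_t, 4>;

// Model of VPERMQ/VPERMPD (immediate form): each 2-bit field of imm
// selects one of the four source qwords for the matching result slot.
V4 vpermq(const V4 &src, std::uint8_t imm) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = src[(imm >> (2 * i)) & 3];
  return r;
}

// Model of VPERM2I128/VPERM2F128: bits [1:0] and [5:4] select a 128-bit
// half (0/1 from a, 2/3 from b); bits 3 and 7 zero the low/high lane.
V4 vperm2x128(const V4 &a, const V4 &b, std::uint8_t imm) {
  V4 r{};
  for (int lane = 0; lane < 2; ++lane) {
    unsigned ctl = (imm >> (4 * lane)) & 0xF;
    if (ctl & 8)
      continue; // lane stays zero
    const V4 &s = (ctl & 2) ? b : a;
    unsigned half = ctl & 1;
    r[2 * lane] = s[2 * half];
    r[2 * lane + 1] = s[2 * half + 1];
  }
  return r;
}

// Hypothetical helper: rewrite a unary, non-zeroing lane-select immediate
// (only bits 0 and 4 meaningful) as the equivalent VPERMQ immediate.
std::uint8_t unaryLaneImmToPermqImm(std::uint8_t imm) {
  unsigned lo = imm & 1u;
  unsigned hi = (imm >> 4) & 1u;
  unsigned out = (2 * lo) | ((2 * lo + 1) << 2) |
                 ((2 * hi) << 4) | ((2 * hi + 1) << 6);
  return static_cast<std::uint8_t>(out);
}
```

For example, the lane swap encoded as 0x01 maps to the VPERMQ immediate 0x4E. The zeroing bits have no VPERMQ counterpart, which is consistent with the !IsV1Zero guard in the hunk above.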
Repository:
rL LLVM
http://reviews.llvm.org/D19228