[PATCH] D21148: [X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permute
Ahmed Bougacha via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 22 15:12:05 PDT 2016
ab added a comment.
Code LGTM, but I'm not sure I have a clear picture; questions inline.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:24546-24547
@@ -24541,3 +24545,4 @@
unsigned &Shuffle, MVT &ShuffleVT) {
- bool FloatDomain = SrcVT.isFloatingPoint();
+ bool FloatDomain = SrcVT.isFloatingPoint() ||
+ (!Subtarget.hasAVX2() && SrcVT.is256BitVector());
----------------
This looks like independent goodness; maybe extract it into a separate patch?
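To spell out what the changed condition computes, here is a minimal standalone sketch. The `VecTypeInfo` and `SubtargetInfo` structs and the `useFloatDomain` name are hypothetical stand-ins for the `MVT` and `Subtarget` queries in the hunk, not LLVM's API. The presumed motivation is that AVX1 has no 256-bit integer shuffles, so 256-bit vectors have to be treated as floating point unless AVX2 is available.

```cpp
#include <cassert>

// Hypothetical stand-ins for the MVT / Subtarget queries used in the hunk.
struct VecTypeInfo {
  bool IsFloatingPoint;  // mirrors SrcVT.isFloatingPoint()
  unsigned SizeInBits;   // 256 mirrors SrcVT.is256BitVector()
};
struct SubtargetInfo {
  bool HasAVX2;          // mirrors Subtarget.hasAVX2()
};

// Same boolean shape as the new FloatDomain expression in the patch.
inline bool useFloatDomain(const VecTypeInfo &VT, const SubtargetInfo &ST) {
  assert(VT.SizeInBits != 0 && "expected a valid vector type");
  return VT.IsFloatingPoint || (!ST.HasAVX2 && VT.SizeInBits == 256);
}
```

Under this sketch, a 256-bit integer vector lands in the float domain on an AVX1-only target and stays in the integer domain once AVX2 is present.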
================
Comment at: test/CodeGen/X86/vector-shuffle-128-v2.ll:162
@@ +161,3 @@
+; AVX: # BB#0:
+; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,1]
+; AVX-NEXT: retq
----------------
I'm surprised by this and the other test changes; isn't the combine meant for shuffle chains? (It does look better for folding, though; just trying to understand.)
================
Comment at: test/CodeGen/X86/vector-shuffle-128-v4.ll:230
@@ +229,3 @@
+; AVX: # BB#0:
+; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,0,1,1]
+; AVX-NEXT: retq
----------------
In particular, this looks slightly more expensive according to Agner Fog's Intel instruction tables (for the folded variants).
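For readers matching the masks in the CHECK lines above against the instruction encoding, here is a minimal sketch of how such a mask packs into the 8-bit immediate. The helper names `encodeImm4x2` and `encodeImm2x1` are hypothetical, not functions from the patch; the sketch assumes the standard layouts: two bits per destination element for PSHUFD/VPERMILPS, and one bit per destination element for the 128-bit form of VPERMILPD.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM's API): pack a 4-element mask into the imm8
// used by PSHUFD / VPERMILPS, two bits per destination element.
inline uint8_t encodeImm4x2(const std::vector<int> &Mask) {
  assert(Mask.size() == 4 && "expected a 4-element mask");
  unsigned Imm = 0;
  for (unsigned i = 0; i != 4; ++i) {
    assert(Mask[i] >= 0 && Mask[i] < 4 && "mask index out of range");
    Imm |= static_cast<unsigned>(Mask[i]) << (2 * i);
  }
  return static_cast<uint8_t>(Imm);
}

// Hypothetical helper: pack a 2-element mask into the imm8 used by the
// 128-bit form of VPERMILPD, one bit per destination element.
inline uint8_t encodeImm2x1(const std::vector<int> &Mask) {
  assert(Mask.size() == 2 && "expected a 2-element mask");
  unsigned Imm = 0;
  for (unsigned i = 0; i != 2; ++i) {
    assert(Mask[i] >= 0 && Mask[i] < 2 && "mask index out of range");
    Imm |= static_cast<unsigned>(Mask[i]) << i;
  }
  return static_cast<uint8_t>(Imm);
}
```

With these helpers, the `xmm0[0,0,1,1]` mask corresponds to `encodeImm4x2({0, 0, 1, 1})`, which is `0x50`, and the `xmm0[1,1]` mask corresponds to `encodeImm2x1({1, 1})`, which is `0x3`.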
Repository:
rL LLVM
http://reviews.llvm.org/D21148