[PATCH] D23897: [SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes

Tue Aug 30 16:20:46 PDT 2016

spatel added a comment.

In https://reviews.llvm.org/D23897#529226, @mkuper wrote:

> This version no longer has the AVX2 regressions, but I don't really know if it's the right way to approach it.
>  Any thoughts?


This seems fine to me, but I'm not sure what the alternatives are, so I'll let someone else comment on that.

I've just pointed out some nits in the inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26342-26343
@@ -26341,1 +26341,4 @@
 
+// We are looking for a shuffle where both sources are vectors with a width
+// of half the output's width that were concatenated with undef. 
+// AVX2 has VPERMD/Q, so if we can express this a single-source shuffle, that's
----------------
Hard to parse that sentence. Is this accurate/better?
"We are looking for a shuffle where both sources are concatenated with undef and have a width that is half of the output's width."

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26344
@@ +26343,3 @@
+// of half the output's width that were concatenated with undef. 
+// AVX2 has VPERMD/Q, so if we can express this a single-source shuffle, that's
+// preferable.
----------------
spatel wrote:
> This function is specifically for AVX2, so I think the check that is currently outside of the function:
>   if (Subtarget.hasInt256())
> 
> should be inserted at the start in here. Also, it should be "hasAVX2()" rather than "hasInt256()" because this is distinguishing AVX2-ness, not 256-bit-ness.
this a -> this as a

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26344-26348
@@ +26343,7 @@
+// of half the output's width that were concatenated with undef. 
+// AVX2 has VPERMD/Q, so if we can express this a single-source shuffle, that's
+// preferable.
+static SDValue combineShuffleOfConcatUndef(SDNode *N, SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+
+  // We only care about shuffles of 128/256-bit vectors of i32/i64.
----------------
This function is specifically for AVX2, so I think the check that is currently outside of the function:
  if (Subtarget.hasInt256())

should be inserted at the start in here. Also, it should be "hasAVX2()" rather than "hasInt256()" because this is distinguishing AVX2-ness, not 256-bit-ness.


https://reviews.llvm.org/D23897