[llvm] r299047 - [DAGCombine] A shuffle of a splat is always the splat itself

Rackover, Zvi via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 3 10:55:24 PDT 2017


Hi Andrea,

Thank you very much for pointing out this bug. I reverted this commit in r299368 and will give it another try with the correct transformation.
I have a patch with a similar simplification in InstSimplify under review, D31527<https://reviews.llvm.org/D31527>. Will make sure it is not suffering from the same issue and at the least add a negative test to cover the problem you showed here.

Thanks, Zvi

From: Andrea Di Biagio [mailto:andrea.dibiagio at gmail.com]
Sent: Monday, April 03, 2017 18:55
To: Rackover, Zvi <zvi.rackover at intel.com>
Cc: llvm-commits <llvm-commits at lists.llvm.org>
Subject: Re: [llvm] r299047 - [DAGCombine] A shuffle of a splat is always the splat itself

Hi Zvi (and Sanjay),
There is a problem with this patch.
Checking for `isSplat()` is not enough, we also need to check if the splat mask contains undef indices. Otherwise, we risk to incorrectly propagate undef to users of the folded splat.
Consider the following example:



t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32

 t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32

A shuffle with interleaved `undef` indices is often introduced by the type legalizer when promoting from an illegal vector type to another types with same number of elements but wider element type.

In this case, the lower half of t49 is defined, while the upper half of t49 is undef.
This patch incorrectly simplifies that shuffle as follows:

t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32

This is problematic in the following case:

t51: i64 = extract_vector_elt t50, Constant:i64<0>
 t50: v2i64 = bitcast t49
  t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32

   t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32



This patch makes the upper half of t51 undefined. However, in the original code, the entire t51 was defined.
Here is another interesting case:

t51: i32 = extract_vector_elt t49, Constant:i64<0>
t50: i32 = extract_vector_elt t49, Constant:i64<1>
 t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32

  t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32



Before this patch, both t50 and t51 were defined, and the extracted value was the same.
With this patch, we now have this:

t51: i32 = extract_vector_elt t58, Constant:i64<0>
t50: i32 = extract_vector_elt t58, Constant:i64<1>
 t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32

Here, t50 is undef and can be folded away.

Here is a small reproducible:

// ===========================================================================
extern "C" int printf(const char*, ...);

using v2 = unsigned int __attribute__((__vector_size__(8)));
using v4 = unsigned int __attribute__((__vector_size__(16)));

int main() {
  v4 alpha = (v4){1, 2, 3, 4};
  v2 bravo = __builtin_shufflevector(alpha, alpha, 2, 2);
  printf("<%x %x>\n", bravo[0], bravo[1]);
}
// ===========================================================================

At runtime, the executable built at -O0 (-target x86_64-unknown-linux-gnu) prints out: <3 4>

At  -O2, it would print <3 3> (this is correct).


-Andrea

On Thu, Mar 30, 2017 at 2:42 AM, Zvi Rackover via llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>> wrote:
Author: zvi
Date: Wed Mar 29 20:42:57 2017
New Revision: 299047

URL: http://llvm.org/viewvc/llvm-project?rev=299047&view=rev
Log:
[DAGCombine]  A shuffle of a splat is always the splat itself

Summary:
Add a simplification:
shuffle (splat-shuffle), undef, M --> splat-shuffle

Fixes pr32449

Patch by Sanjay Patel

Reviewers: eli.friedman, RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31426

Modified:
    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
    llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=299047&r1=299046&r2=299047&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed Mar 29 20:42:57 2017
@@ -14643,6 +14643,12 @@ SDValue DAGCombiner::visitVECTOR_SHUFFLE
       return DAG.getVectorShuffle(VT, SDLoc(N), N0, N1, NewMask);
   }

+  // A shuffle of a splat is always the splat itself:
+  // shuffle (splat-shuffle), undef, M --> splat-shuffle
+  if (auto *N0Shuf = dyn_cast<ShuffleVectorSDNode>(N0))
+    if (N1.isUndef() && N0Shuf->isSplat())
+      return N0;
+
   // If it is a splat, check if the argument vector is another splat or a
   // build_vector.
   if (SVN->isSplat() && SVN->getSplatIndex() < (int)NumElts) {

Modified: llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll?rev=299047&r1=299046&r2=299047&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll (original)
+++ llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll Wed Mar 29 20:42:57 2017
@@ -5,9 +5,8 @@
 define <2 x double> @foo2(<2 x double> %v, <2 x double> *%p) nounwind {
 ; AVX2-LABEL: foo2:
 ; AVX2:       # BB#0:
-; AVX2-NEXT:    vpermilpd {{.*#+}} xmm1 = xmm0[1,1]
-; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm1[1,0]
-; AVX2-NEXT:    vmovapd %xmm1, (%rdi)
+; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,1]
+; AVX2-NEXT:    vmovapd %xmm0, (%rdi)
 ; AVX2-NEXT:    retq
   %res = shufflevector <2 x double> %v, <2 x double> undef, <2 x i32> <i32 1, i32 1>
   %res1 = shufflevector<2 x double> %res, <2 x double> undef, <2 x i32> <i32 1, i32 undef>
@@ -18,9 +17,8 @@ define <2 x double> @foo2(<2 x double> %
 define <4 x double> @foo4(<4 x double> %v, <4 x double> *%p) nounwind {
 ; AVX2-LABEL: foo4:
 ; AVX2:       # BB#0:
-; AVX2-NEXT:    vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
-; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm1[2,0,2,3]
-; AVX2-NEXT:    vmovapd %ymm1, (%rdi)
+; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
+; AVX2-NEXT:    vmovapd %ymm0, (%rdi)
 ; AVX2-NEXT:    retq
   %res = shufflevector <4 x double> %v, <4 x double> undef, <4 x i32> <i32 2, i32 2, i32 2, i32 2>
   %res1 = shufflevector<4 x double> %res, <4 x double> undef, <4 x i32> <i32 2, i32 0, i32 undef, i32 undef>
@@ -32,10 +30,8 @@ define <8 x float> @foo8(<8 x float> %v,
 ; AVX2-LABEL: foo8:
 ; AVX2:       # BB#0:
 ; AVX2-NEXT:    vmovshdup {{.*#+}} ymm0 = ymm0[1,1,3,3,5,5,7,7]
-; AVX2-NEXT:    vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
-; AVX2-NEXT:    vmovaps {{.*#+}} ymm0 = <2,0,u,u,5,1,3,7>
-; AVX2-NEXT:    vpermps %ymm1, %ymm0, %ymm0
-; AVX2-NEXT:    vmovapd %ymm1, (%rdi)
+; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
+; AVX2-NEXT:    vmovapd %ymm0, (%rdi)
 ; AVX2-NEXT:    retq
   %res = shufflevector <8 x float> %v, <8 x float> undef, <8 x i32> <i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5>
   %res1 = shufflevector<8 x float> %res, <8 x float> undef, <8 x i32> <i32 2, i32 0, i32 undef, i32 undef, i32 5, i32 1, i32 3, i32 7>


_______________________________________________
llvm-commits mailing list
llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170403/8f5e93a3/attachment.html>


More information about the llvm-commits mailing list