[llvm] r299047 - [DAGCombine] A shuffle of a splat is always the splat itself
Rackover, Zvi via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 3 10:55:24 PDT 2017
Hi Andrea,
Thank you very much for pointing out this bug. I reverted this commit in r299368 and will give it another try with the correct transformation.
I have a patch with a similar simplification in InstSimplify under review, D31527<https://reviews.llvm.org/D31527>. Will make sure it is not suffering from the same issue and at the least add a negative test to cover the problem you showed here.
Thanks, Zvi
From: Andrea Di Biagio [mailto:andrea.dibiagio at gmail.com]
Sent: Monday, April 03, 2017 18:55
To: Rackover, Zvi <zvi.rackover at intel.com>
Cc: llvm-commits <llvm-commits at lists.llvm.org>
Subject: Re: [llvm] r299047 - [DAGCombine] A shuffle of a splat is always the splat itself
Hi Zvi (and Sanjay),
There is a problem with this patch.
Checking for `isSplat()` is not enough, we also need to check if the splat mask contains undef indices. Otherwise, we risk to incorrectly propagate undef to users of the folded splat.
Consider the following example:
t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
A shuffle with interleaved `undef` indices is often introduced by the type legalizer when promoting from an illegal vector type to another types with same number of elements but wider element type.
In this case, the lower half of t49 is defined, while the upper half of t49 is undef.
This patch incorrectly simplifies that shuffle as follows:
t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
This is problematic in the following case:
t51: i64 = extract_vector_elt t50, Constant:i64<0>
t50: v2i64 = bitcast t49
t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
This patch makes the upper half of t51 undefined. However, in the original code, the entire t51 was defined.
Here is another interesting case:
t51: i32 = extract_vector_elt t49, Constant:i64<0>
t50: i32 = extract_vector_elt t49, Constant:i64<1>
t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
Before this patch, both t50 and t51 were defined, and the extracted value was the same.
With this patch, we now have this:
t51: i32 = extract_vector_elt t58, Constant:i64<0>
t50: i32 = extract_vector_elt t58, Constant:i64<1>
t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
Here, t50 is undef and can be folded away.
Here is a small reproducible:
// ===========================================================================
extern "C" int printf(const char*, ...);
using v2 = unsigned int __attribute__((__vector_size__(8)));
using v4 = unsigned int __attribute__((__vector_size__(16)));
int main() {
v4 alpha = (v4){1, 2, 3, 4};
v2 bravo = __builtin_shufflevector(alpha, alpha, 2, 2);
printf("<%x %x>\n", bravo[0], bravo[1]);
}
// ===========================================================================
At runtime, the executable built at -O0 (-target x86_64-unknown-linux-gnu) prints out: <3 4>
At -O2, it would print <3 3> (this is correct).
-Andrea
On Thu, Mar 30, 2017 at 2:42 AM, Zvi Rackover via llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>> wrote:
Author: zvi
Date: Wed Mar 29 20:42:57 2017
New Revision: 299047
URL: http://llvm.org/viewvc/llvm-project?rev=299047&view=rev
Log:
[DAGCombine] A shuffle of a splat is always the splat itself
Summary:
Add a simplification:
shuffle (splat-shuffle), undef, M --> splat-shuffle
Fixes pr32449
Patch by Sanjay Patel
Reviewers: eli.friedman, RKSimon, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31426
Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=299047&r1=299046&r2=299047&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed Mar 29 20:42:57 2017
@@ -14643,6 +14643,12 @@ SDValue DAGCombiner::visitVECTOR_SHUFFLE
return DAG.getVectorShuffle(VT, SDLoc(N), N0, N1, NewMask);
}
+ // A shuffle of a splat is always the splat itself:
+ // shuffle (splat-shuffle), undef, M --> splat-shuffle
+ if (auto *N0Shuf = dyn_cast<ShuffleVectorSDNode>(N0))
+ if (N1.isUndef() && N0Shuf->isSplat())
+ return N0;
+
// If it is a splat, check if the argument vector is another splat or a
// build_vector.
if (SVN->isSplat() && SVN->getSplatIndex() < (int)NumElts) {
Modified: llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll?rev=299047&r1=299046&r2=299047&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll (original)
+++ llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll Wed Mar 29 20:42:57 2017
@@ -5,9 +5,8 @@
define <2 x double> @foo2(<2 x double> %v, <2 x double> *%p) nounwind {
; AVX2-LABEL: foo2:
; AVX2: # BB#0:
-; AVX2-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,1]
-; AVX2-NEXT: vpermilpd {{.*#+}} xmm0 = xmm1[1,0]
-; AVX2-NEXT: vmovapd %xmm1, (%rdi)
+; AVX2-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,1]
+; AVX2-NEXT: vmovapd %xmm0, (%rdi)
; AVX2-NEXT: retq
%res = shufflevector <2 x double> %v, <2 x double> undef, <2 x i32> <i32 1, i32 1>
%res1 = shufflevector<2 x double> %res, <2 x double> undef, <2 x i32> <i32 1, i32 undef>
@@ -18,9 +17,8 @@ define <2 x double> @foo2(<2 x double> %
define <4 x double> @foo4(<4 x double> %v, <4 x double> *%p) nounwind {
; AVX2-LABEL: foo4:
; AVX2: # BB#0:
-; AVX2-NEXT: vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
-; AVX2-NEXT: vpermpd {{.*#+}} ymm0 = ymm1[2,0,2,3]
-; AVX2-NEXT: vmovapd %ymm1, (%rdi)
+; AVX2-NEXT: vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
+; AVX2-NEXT: vmovapd %ymm0, (%rdi)
; AVX2-NEXT: retq
%res = shufflevector <4 x double> %v, <4 x double> undef, <4 x i32> <i32 2, i32 2, i32 2, i32 2>
%res1 = shufflevector<4 x double> %res, <4 x double> undef, <4 x i32> <i32 2, i32 0, i32 undef, i32 undef>
@@ -32,10 +30,8 @@ define <8 x float> @foo8(<8 x float> %v,
; AVX2-LABEL: foo8:
; AVX2: # BB#0:
; AVX2-NEXT: vmovshdup {{.*#+}} ymm0 = ymm0[1,1,3,3,5,5,7,7]
-; AVX2-NEXT: vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
-; AVX2-NEXT: vmovaps {{.*#+}} ymm0 = <2,0,u,u,5,1,3,7>
-; AVX2-NEXT: vpermps %ymm1, %ymm0, %ymm0
-; AVX2-NEXT: vmovapd %ymm1, (%rdi)
+; AVX2-NEXT: vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
+; AVX2-NEXT: vmovapd %ymm0, (%rdi)
; AVX2-NEXT: retq
%res = shufflevector <8 x float> %v, <8 x float> undef, <8 x i32> <i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5>
%res1 = shufflevector<8 x float> %res, <8 x float> undef, <8 x i32> <i32 2, i32 0, i32 undef, i32 undef, i32 5, i32 1, i32 3, i32 7>
_______________________________________________
llvm-commits mailing list
llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170403/8f5e93a3/attachment.html>
More information about the llvm-commits
mailing list