[llvm] r299047 - [DAGCombine] A shuffle of a splat is always the splat itself

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 3 08:54:54 PDT 2017


Hi Zvi (and Sanjay),

There is a problem with this patch.
Checking for `isSplat()` is not enough, we also need to check if the splat
mask contains undef indices. Otherwise, we risk to incorrectly propagate
undef to users of the folded splat.

Consider the following example:

t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
 t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
A shuffle with interleaved `undef` indices is often introduced by the
type legalizer when promoting from an illegal vector type to another
types with same number of elements but wider element type.

In this case, the lower half of t49 is defined, while the upper half
of t49 is undef.
This patch incorrectly simplifies that shuffle as follows:

t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
This is problematic in the following case:

t51: i64 = extract_vector_elt t50, Constant:i64<0>
 t50: v2i64 = bitcast t49
  t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
   t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32

This patch makes the upper half of t51 undefined. However, in the
original code, the entire t51 was defined.
Here is another interesting case:

t51: i32 = extract_vector_elt t49, Constant:i64<0>
t50: i32 = extract_vector_elt t49, Constant:i64<1>
 t49: v4i32 = vector_shuffle<0,2,u,u> t58, undef:v4i32
  t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32


Before this patch, both t50 and t51 were defined, and the extracted
value was the same.
With this patch, we now have this:

t51: i32 = extract_vector_elt t58, Constant:i64<0>
t50: i32 = extract_vector_elt t58, Constant:i64<1>
 t58: v4i32 = vector_shuffle<2,u,2,u> t13, undef:v4i32
Here, t50 is undef and can be folded away.

Here is a small reproducible:

// ===========================================================================
extern "C" int printf(const char*, ...);

using v2 = unsigned int __attribute__((__vector_size__(8)));
using v4 = unsigned int __attribute__((__vector_size__(16)));

int main() {
  v4 alpha = (v4){1, 2, 3, 4};
  v2 bravo = __builtin_shufflevector(alpha, alpha, 2, 2);
  printf("<%x %x>\n", bravo[0], bravo[1]);
}
// ===========================================================================

At runtime, the executable built at -O0 (-target
x86_64-unknown-linux-gnu) prints out: <3 4>

At  -O2, it would print <3 3> (this is correct).


-Andrea


On Thu, Mar 30, 2017 at 2:42 AM, Zvi Rackover via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Author: zvi
> Date: Wed Mar 29 20:42:57 2017
> New Revision: 299047
>
> URL: http://llvm.org/viewvc/llvm-project?rev=299047&view=rev
> Log:
> [DAGCombine]  A shuffle of a splat is always the splat itself
>
> Summary:
> Add a simplification:
> shuffle (splat-shuffle), undef, M --> splat-shuffle
>
> Fixes pr32449
>
> Patch by Sanjay Patel
>
> Reviewers: eli.friedman, RKSimon, spatel
>
> Reviewed By: spatel
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D31426
>
> Modified:
>     llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>     llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/
> CodeGen/SelectionDAG/DAGCombiner.cpp?rev=299047&r1=
> 299046&r2=299047&view=diff
> ============================================================
> ==================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed Mar 29
> 20:42:57 2017
> @@ -14643,6 +14643,12 @@ SDValue DAGCombiner::visitVECTOR_SHUFFLE
>        return DAG.getVectorShuffle(VT, SDLoc(N), N0, N1, NewMask);
>    }
>
> +  // A shuffle of a splat is always the splat itself:
> +  // shuffle (splat-shuffle), undef, M --> splat-shuffle
> +  if (auto *N0Shuf = dyn_cast<ShuffleVectorSDNode>(N0))
> +    if (N1.isUndef() && N0Shuf->isSplat())
> +      return N0;
> +
>    // If it is a splat, check if the argument vector is another splat or a
>    // build_vector.
>    if (SVN->isSplat() && SVN->getSplatIndex() < (int)NumElts) {
>
> Modified: llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> CodeGen/X86/shuffle-of-splat-multiuses.ll?rev=299047&r1=
> 299046&r2=299047&view=diff
> ============================================================
> ==================
> --- llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/shuffle-of-splat-multiuses.ll Wed Mar 29
> 20:42:57 2017
> @@ -5,9 +5,8 @@
>  define <2 x double> @foo2(<2 x double> %v, <2 x double> *%p) nounwind {
>  ; AVX2-LABEL: foo2:
>  ; AVX2:       # BB#0:
> -; AVX2-NEXT:    vpermilpd {{.*#+}} xmm1 = xmm0[1,1]
> -; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm1[1,0]
> -; AVX2-NEXT:    vmovapd %xmm1, (%rdi)
> +; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,1]
> +; AVX2-NEXT:    vmovapd %xmm0, (%rdi)
>  ; AVX2-NEXT:    retq
>    %res = shufflevector <2 x double> %v, <2 x double> undef, <2 x i32>
> <i32 1, i32 1>
>    %res1 = shufflevector<2 x double> %res, <2 x double> undef, <2 x i32>
> <i32 1, i32 undef>
> @@ -18,9 +17,8 @@ define <2 x double> @foo2(<2 x double> %
>  define <4 x double> @foo4(<4 x double> %v, <4 x double> *%p) nounwind {
>  ; AVX2-LABEL: foo4:
>  ; AVX2:       # BB#0:
> -; AVX2-NEXT:    vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
> -; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm1[2,0,2,3]
> -; AVX2-NEXT:    vmovapd %ymm1, (%rdi)
> +; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
> +; AVX2-NEXT:    vmovapd %ymm0, (%rdi)
>  ; AVX2-NEXT:    retq
>    %res = shufflevector <4 x double> %v, <4 x double> undef, <4 x i32>
> <i32 2, i32 2, i32 2, i32 2>
>    %res1 = shufflevector<4 x double> %res, <4 x double> undef, <4 x i32>
> <i32 2, i32 0, i32 undef, i32 undef>
> @@ -32,10 +30,8 @@ define <8 x float> @foo8(<8 x float> %v,
>  ; AVX2-LABEL: foo8:
>  ; AVX2:       # BB#0:
>  ; AVX2-NEXT:    vmovshdup {{.*#+}} ymm0 = ymm0[1,1,3,3,5,5,7,7]
> -; AVX2-NEXT:    vpermpd {{.*#+}} ymm1 = ymm0[2,2,2,2]
> -; AVX2-NEXT:    vmovaps {{.*#+}} ymm0 = <2,0,u,u,5,1,3,7>
> -; AVX2-NEXT:    vpermps %ymm1, %ymm0, %ymm0
> -; AVX2-NEXT:    vmovapd %ymm1, (%rdi)
> +; AVX2-NEXT:    vpermpd {{.*#+}} ymm0 = ymm0[2,2,2,2]
> +; AVX2-NEXT:    vmovapd %ymm0, (%rdi)
>  ; AVX2-NEXT:    retq
>    %res = shufflevector <8 x float> %v, <8 x float> undef, <8 x i32> <i32
> 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5>
>    %res1 = shufflevector<8 x float> %res, <8 x float> undef, <8 x i32>
> <i32 2, i32 0, i32 undef, i32 undef, i32 5, i32 1, i32 3, i32 7>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170403/f504098f/attachment.html>


More information about the llvm-commits mailing list