[PATCH] D75348: [DAGCombiner] recognize shuffle (shuffle X, Mask0), Mask --> splat X

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 28 08:51:55 PST 2020


spatel marked an inline comment as done.
spatel added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:19298
+  }
+  assert(all_of(NewMask, [](int M) { return M == -1; }) ||
+         getSplatIndex(NewMask) != -1 && "Expected a splat mask");
----------------
RKSimon wrote:
> Check for isShuffleMaskLegal() ?
That's the conservatively correct approach, but I'm not sure if we want that limit. We know that the shuffle that we are creating is the same type that we started with, and it's a splat, so we treat that as a special-case that any target should handle efficiently:
https://llvm.org/docs/CodeGenerator.html#selectiondag-legalize-phase

I don't have a real-world example of this, but it's easy to hack an x86 test to show an improvement without the check:

```
define void @disguised_splat(<8 x i32> %x, <8 x i32>* %p0, <8 x i32>* %p1) {
  %shuf0 = shufflevector <8 x i32> %x, <8 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 0, i32 0, i32 3, i32 3, i32 4>
  store <8 x i32> %shuf0, <8 x i32>* %p0
  %shuf1 = shufflevector <8 x i32> %shuf0, <8 x i32> undef, <8 x i32> <i32 4, i32 3, i32 4, i32 3, i32 4, i32 4, i32 3, i32 3>
  store <8 x i32> %shuf1, <8 x i32>* %p1
  ret void
}

```
With mask legality check, we'll fail to detect the splat before the op gets split:

```
$ llc -o - shuf.ll
	shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
	movaps	%xmm0, %xmm2
	shufps	$44, %xmm1, %xmm2       ## xmm2 = xmm2[0,3],xmm1[2,0]
	pshufd	$36, %xmm0, %xmm1       ## xmm1 = xmm0[0,1,2,0]
	movdqa	%xmm1, (%rdi)
	movaps	%xmm2, 16(%rdi)
	pshufd	$0, %xmm0, %xmm2        ## xmm2 = xmm0[0,0,0,0]
	shufps	$240, %xmm1, %xmm0      ## xmm0 = xmm0[0,0],xmm1[3,3]
	movdqa	%xmm2, (%rsi)
	movaps	%xmm0, 16(%rsi)

```
Without mask legality check, we can simplify before splitting:

```
$ llc -o - shuf.ll 
	shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
	pshufd	$36, %xmm0, %xmm2       ## xmm2 = xmm0[0,1,2,0]
	pshufd	$0, %xmm0, %xmm3        ## xmm3 = xmm0[0,0,0,0]
	shufps	$44, %xmm1, %xmm0       ## xmm0 = xmm0[0,3],xmm1[2,0]
	movdqa	%xmm2, (%rdi)
	movaps	%xmm0, 16(%rdi)
	movdqa	%xmm3, (%rsi)
	movdqa	%xmm3, 16(%rsi)

```

The safe thing would be to add the mask legality predicate here, then remove it as a follow-up (after adding a test like above to show a diff)...then wait to see if there are any complaints. :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75348/new/

https://reviews.llvm.org/D75348





More information about the llvm-commits mailing list