[PATCH] D15096: [InstCombine] transform more extract/insert pairs into shuffles (PR2109)

Sanjay Patel via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 30 15:52:46 PST 2015


spatel created this revision.
spatel added reviewers: t.p.northover, hfinkel, RKSimon.
spatel added a subscriber: llvm-commits.
Herald added a subscriber: aemerson.

This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in.
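
As a rough illustration of the widening step (not the code in the patch), the mask for such a shuffle keeps every lane of the short vector and pads the rest with undef. The helper name `widening_mask` and the use of `None` to stand for undef are assumptions made for this sketch only.

```python
UNDEF = None  # stand-in for an undef mask element (assumption of this sketch)


def widening_mask(narrow_len, wide_len):
    """Build a shuffle mask that widens a narrow vector to wide_len lanes.

    The first narrow_len lanes select the original elements in order;
    the remaining lanes are undef.
    """
    if narrow_len > wide_len:
        raise ValueError("the source vector must not be wider than the target")
    return list(range(narrow_len)) + [UNDEF] * (wide_len - narrow_len)


print(widening_mask(2, 4))  # [0, 1, None, None]
```

For a <2 x float> source and a <4 x float> destination this gives the mask <0, 1, undef, undef>.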

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:
  %1 = bitcast <2 x i32>* %P to <2 x float>*
  %ld1 = load <2 x float>, <2 x float>* %1, align 8
  %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
  %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  ret <4 x float> %i2
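
To sanity-check the two shuffles above, here is a small model of shufflevector semantics. It assumes the original chain inserted elements 0 and 1 of the loaded vector into lanes 2 and 3 of %A, which is what the final mask <0, 1, 4, 5> implies. The function names and the `None`-for-undef convention are hypothetical; this is a sketch, not LLVM code.

```python
UNDEF = None  # stand-in for an undef lane (assumption of this sketch)


def shufflevector(v1, v2, mask):
    """Model shufflevector: mask indices address the concatenation v1 ++ v2."""
    assert len(v1) == len(v2), "both inputs must have the same length"
    both = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else both[m] for m in mask]


def insert_extract_chain(a, ld):
    """Assumed original pattern: ld[0] goes to lane 2 of a, ld[1] to lane 3."""
    out = list(a)
    out[2] = ld[0]  # extractelement 0, insertelement 2
    out[3] = ld[1]  # extractelement 1, insertelement 3
    return out


def two_shuffles(a, ld):
    """The transformed form: widen ld with undef, then blend it with a."""
    widened = shufflevector(ld, [UNDEF] * len(ld), [0, 1, UNDEF, UNDEF])
    return shufflevector(a, widened, [0, 1, 4, 5])


a = [10.0, 11.0, 12.0, 13.0]
ld = [1.0, 2.0]
print(two_shuffles(a, ld) == insert_extract_chain(a, ld))
```

The undef lanes of the widened vector are never selected by the second mask, so they cannot leak into the result.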

And x86 SSE output improves from:
  movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
  movdqa	%xmm1, %xmm2
  shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
  shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
  shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
  shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
  shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
  retq

To the almost optimal:
  movhpd	(%rdi), %xmm0

Note: There's a tension in the existing transform around generating arbitrary shufflevector masks. We avoid that elsewhere in InstCombine because we're afraid that codegen can't handle strange masks, but it looks like we're OK with producing them here. I purposely chose odd insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples.

Note 2: The 2x shufflevector mask limitation is not in the IR Language Reference shufflevector instruction definition, but it is encoded in ShuffleVectorInst::isValidOperands().
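
One reading of Note 2, stated here as an assumption, is that each defined mask index must be smaller than twice the input vector length, because the indices address the two same-typed inputs laid end to end. The check below is a sketch of that rule only; `is_valid_mask` is a hypothetical name, and it does not reproduce the other checks a real validity test would perform.

```python
UNDEF = None  # stand-in for an undef mask element (assumption of this sketch)


def is_valid_mask(input_len, mask):
    """Return True if every defined index is in [0, 2 * input_len)."""
    return all(m is UNDEF or 0 <= m < 2 * input_len for m in mask)


print(is_valid_mask(4, [0, 1, 4, 5]))          # 4-lane inputs
print(is_valid_mask(2, [0, 1, UNDEF, UNDEF]))  # widening a 2-lane input
print(is_valid_mask(2, [0, 1, 4, 5]))          # index 4 is out of range for 2-lane inputs
```

Under this reading, the mask length is independent of the input length, which is what lets a 2-lane vector be widened to 4 lanes by a single shuffle.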

http://reviews.llvm.org/D15096

Files:
  lib/Transforms/InstCombine/InstCombineVectorOps.cpp
  test/Transforms/InstCombine/insert-extract-shuffle.ll
