[PATCH] D76623: [VectorCombine] try to form a better extractelement
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 23 09:48:40 PDT 2020
spatel created this revision.
spatel added reviewers: lebedev.ri, RKSimon, jgorbe.
Herald added subscribers: hiraditya, mcrosier.
Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms.
Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement.
But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet).
In the motivating case from PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
The combination of subsequent instcombine and codegen transforms gets us this improvement:
vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3]
vhaddps %xmm1, %xmm1, %xmm4
vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3]
vaddps %xmm0, %xmm2, %xmm0
vaddps %xmm1, %xmm3, %xmm1
vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3]
vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2]
-->
vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3]
vhaddps %xmm1, %xmm1, %xmm1
vaddps %xmm0, %xmm2, %xmm0
vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3]
https://reviews.llvm.org/D76623
Files:
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
llvm/test/Transforms/VectorCombine/X86/extract-binop.ll
llvm/test/Transforms/VectorCombine/X86/extract-cmp.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D76623.252069.patch
Type: text/x-patch
Size: 7346 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200323/e2330da1/attachment.bin>
More information about the llvm-commits
mailing list