[PATCH] D23897: [SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 26 17:46:40 PDT 2016
mkuper updated this revision to Diff 69452.
mkuper added a comment.
Fixed bug, and added test cases.
And, fortunately (thanks again, Simon!), while adding the tests I discovered that the patch regresses AVX2 codegen in some cases (even though it improves AVX codegen for the same tests). So this can't really be committed as is.
The difference comes down to which pattern shuffle lowering likes more:
t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %vreg0
t4: v4i32,ch = CopyFromReg t0, Register:v4i32 %vreg1
t8: v8i32 = concat_vectors t2, t4
t10: v8i32 = vector_shuffle<0,6,3,6,1,7,4,u> t8, undef:v8i32
Or:
t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %vreg0
t8: v8i32 = concat_vectors t2, undef:v4i32
t4: v4i32,ch = CopyFromReg t0, Register:v4i32 %vreg1
t9: v8i32 = concat_vectors t4, undef:v4i32
t10: v8i32 = vector_shuffle<0,10,3,10,1,11,8,u> t8, t9
Before this patch, the builder created a sequence of inserts and extracts that eventually got combined into the first version, which AVX2 prefers because it can lower it efficiently with VPERM. With this patch, we directly create the second version, which AVX handles better but AVX2 chokes on. I could modify the patch to create the first version directly instead, which would solve the original problem in PR29025, but then we would lose the AVX gains for i32 vectors.
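To make the relationship between the two DAGs above concrete, here is a small Python model (not LLVM code) of the two shuffles. It only illustrates that the one-input mask <0,6,3,6,1,7,4,u> and the two-input mask <0,10,3,10,1,11,8,u> select the same lanes. The helper names (shuffle, widen_mask), the UNDEF marker, and the string lane labels are assumptions made up for this sketch; nothing here is taken from the patch.

```python
UNDEF = -1  # stands in for the 'u' element of a shuffle mask (illustrative choice)


def shuffle(mask, a, b):
    """Model a vector_shuffle: index m < len(a) picks a[m], otherwise b[m - len(a)]."""
    n = len(a)
    out = []
    for m in mask:
        if m == UNDEF:
            out.append(None)  # undef result lane
        elif m < n:
            out.append(a[m])
        else:
            out.append(b[m - n])
    return out


def widen_mask(mask, half):
    """Rewrite a one-input mask over concat(x, y) into a two-input mask
    over concat(x, undef) and concat(y, undef)."""
    n = 2 * half
    return [m if m == UNDEF or m < half else m - half + n for m in mask]


# Lane labels for the two v4i32 inputs (t2 and t4 in the dumps).
x = ["x0", "x1", "x2", "x3"]
y = ["y0", "y1", "y2", "y3"]
undef4 = [None] * 4
undef8 = [None] * 8

ONE_INPUT = [0, 6, 3, 6, 1, 7, 4, UNDEF]     # first version
TWO_INPUT = [0, 10, 3, 10, 1, 11, 8, UNDEF]  # second version

first = shuffle(ONE_INPUT, x + y, undef8)            # shuffle(concat(t2, t4), undef)
second = shuffle(TWO_INPUT, x + undef4, y + undef4)  # shuffle(concat(t2, undef), concat(t4, undef))
```

In this model both shuffles produce the same lanes, so the choice between the two forms is purely about which one the target lowers better.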
So we'll need both a builder change (that generates one of the two patterns above) and either a combine that picks the better version based on profitability, or lowering improvements. I'm still rather fuzzy on the details, though.
https://reviews.llvm.org/D23897
Files:
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
test/CodeGen/X86/oddshuffles.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D23897.69452.patch
Type: text/x-patch
Size: 21989 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160827/f507c031/attachment.bin>