[PATCH] D10950: [SLPVectorizer] Try different vectorization factors and set max vector register size based on target
Sanjay Patel
spatel at rotateright.com
Sun Jul 5 21:22:59 PDT 2015
spatel created this revision.
spatel added reviewers: hfinkel, nadav, rengolin, arsenm.
spatel added a subscriber: llvm-commits.
This patch is based on discussion on the llvmdev mailing list:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html
and also solves:
https://llvm.org/bugs/show_bug.cgi?id=17170
As mentioned on the dev list and in the bug report, the new loop over vector register sizes may cause an unacceptable compile-time increase, so it may need to be gated behind a more aggressive optimization setting. If not, this patch could be extended to the other SLP pattern matchers that hardcode the vector register size (see the FIXME comments).
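For readers without the attachment, here is a rough sketch of the idea of looping over register sizes to try several vectorization factors. It is not the code from the patch: the names candidateFactors, pickFactor and isProfitable are hypothetical, the register widths are passed in as plain parameters rather than queried from the target, and the profitability check is a stand-in callback for the real cost model.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from D10950): list the vectorization factors to
// try, starting from the widest register and halving the width each step.
std::vector<unsigned> candidateFactors(unsigned maxVecRegBits,
                                       unsigned minVecRegBits,
                                       unsigned elemBits) {
  assert(elemBits != 0 && "element width must be non-zero");
  std::vector<unsigned> factors;
  for (unsigned regBits = maxVecRegBits;
       regBits >= minVecRegBits && regBits > 0; regBits /= 2) {
    unsigned vf = regBits / elemBits;
    if (vf < 2)
      break; // Fewer than two lanes is not a vector; narrower sizes won't help.
    factors.push_back(vf);
  }
  return factors;
}

// Hypothetical driver: return the first factor that fits the chain and that
// the (assumed) cost callback accepts, or 0 if no factor works.
unsigned pickFactor(std::size_t chainLen, unsigned maxVecRegBits,
                    unsigned minVecRegBits, unsigned elemBits,
                    const std::function<bool(unsigned)> &isProfitable) {
  for (unsigned vf : candidateFactors(maxVecRegBits, minVecRegBits, elemBits)) {
    if (vf > chainLen)
      continue; // Not enough scalars in the chain to fill this many lanes.
    if (isProfitable(vf))
      return vf;
  }
  return 0;
}
```

The compile-time concern is visible in this shape: each extra register size means another pass over the same chain, including another cost query, so the work grows with the number of sizes tried.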
Should the AMDGPU XFAIL test be fixed, or removed if it is no longer valid?
http://reviews.llvm.org/D10950
Files:
lib/Transforms/Vectorize/SLPVectorizer.cpp
test/Transforms/SLPVectorizer/AMDGPU/simplebb.ll
test/Transforms/SLPVectorizer/X86/cse.ll
test/Transforms/SLPVectorizer/X86/gep.ll
test/Transforms/SLPVectorizer/X86/loopinvariant.ll
test/Transforms/SLPVectorizer/X86/pr19657.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D10950.29061.patch
Type: text/x-patch
Size: 8472 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150706/50bc3117/attachment.bin>