[PATCH] D10950: [SLPVectorizer] Try different vectorization factors and set max vector register size based on target

Sun Jul 5 21:22:59 PDT 2015

spatel created this revision.
spatel added reviewers: hfinkel, nadav, rengolin, arsenm.
spatel added a subscriber: llvm-commits.

This patch is based on discussion on the llvmdev mailing list:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html

and also solves:
https://llvm.org/bugs/show_bug.cgi?id=17170

As mentioned on the dev list and in the bug report, the new loop over the vector register size may cause an unacceptable compile-time increase, so it may need to be gated behind a more aggressive optimization setting. If the compile-time cost turns out to be acceptable, this patch could be extended to the other SLP pattern matchers that hardcode the vector register size (see the FIXME comments).
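
To illustrate the idea of looping over the vector register size, here is a minimal standalone sketch. It is not code from the patch and does not use any LLVM API: the function name pickVectorFactor, its parameters, and the IsProfitable callback are all hypothetical stand-ins for the target query and the SLP cost model. It assumes register widths are tried from the target maximum down to a minimum, halving each time.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not from the patch): walk candidate vector register
// widths from MaxVecRegBits down to MinVecRegBits, halving each time, and
// return the first vectorization factor (VF) the caller's predicate accepts.
// Returns 0 if no candidate width is accepted.
unsigned pickVectorFactor(unsigned ElemBits, unsigned MaxVecRegBits,
                          unsigned MinVecRegBits,
                          const std::function<bool(unsigned)> &IsProfitable) {
  assert(ElemBits > 0 && MinVecRegBits > 0 && "widths must be nonzero");
  for (unsigned RegBits = MaxVecRegBits; RegBits >= MinVecRegBits;
       RegBits /= 2) {
    unsigned VF = RegBits / ElemBits; // lanes that fit in this register width
    if (VF < 2)
      break; // fewer than two lanes is not a vector
    // Each iteration re-runs the (assumed expensive) profitability check;
    // this repeated work is the source of the compile-time concern.
    if (IsProfitable(VF))
      return VF;
  }
  return 0;
}
```

With 32-bit elements, a 256-bit maximum and a 128-bit minimum, a predicate that rejects 8 lanes but accepts 4 is called twice before the function settles on a factor of 4. A hardcoded 128-bit width would call it once, which is the trade-off described above.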

Should the AMDGPU XFAIL test be fixed, or should it be removed if it is no longer valid?

http://reviews.llvm.org/D10950

Files:
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/AMDGPU/simplebb.ll
  test/Transforms/SLPVectorizer/X86/cse.ll
  test/Transforms/SLPVectorizer/X86/gep.ll
  test/Transforms/SLPVectorizer/X86/loopinvariant.ll
  test/Transforms/SLPVectorizer/X86/pr19657.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D10950.29061.patch
Type: text/x-patch
Size: 8472 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150706/50bc3117/attachment.bin>