<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div>Hi,</div><div><br></div>This patch improves the scheduling in the SLPVectorizer. Previously, the SLPVectorizer could only move the instructions of a bundle down. The new scheduling algorithm is more general, which means that (in theory) it should always schedule better than the old approach.<div><div><br></div><div>The change fixes &lt;<a href="rdar://problem/13677020">rdar://problem/13677020</a>&gt; [SLP vectorizer] - SLP Vectorizer needs to be able to start trees at multiple user reductions.</div><div>Another example is <a href="http://llvm.org/bugs/show_bug.cgi?id=19657">Bug 19657</a>, where the new algorithm removes the need for heuristics to fix it.</div><div><br></div><div>Compile time and execution time look OK, measured with the test suite on x86.</div><div><br></div><div><div>Compile time: approximately the same as before. The new algorithm fixes a compile-time problem in the (synthetic) test SingleSource/UnitTest/Vector/constpool, which now compiles much faster.</div></div><div><br></div><div>Execution time: no significant changes. A few benchmarks show improvements when compiled with -O3 -flto.</div><div><br></div><div>Statistics: with the new scheduling algorithm, the SLPVectorizer generates about 9% more LLVM vector instructions. Note that this is a static count, not the number of _executed_ instructions, so a vector instruction that is not inside a critical loop will not show up in execution time.</div><div><br></div></div><div>As this patch is not trivial, I plan to commit it after the 3.5 branch has been created.</div><div><br></div><div>Please let me know your comments (Arnold and Nadav have already looked at the patch).</div><div><br></div><div>Thanks,</div><div>Erik</div><div><br></div><div></div></body></html>