[llvm] r182976 - Reapply with r182909 with a fix to the calculation of the new indices for

Mon Jun 3 16:33:15 PDT 2013

Hi Nick, 

Thanks for working on this.  I am glad you got it in, and like you said, I think that it can improve the vector code generation. 

> So I've updated it to 5 as you asked, but two points. LLVM has made a trade-off in many places (loop unrolling and the inliner's cost analysis for instance) that we spend more compile time optimizing code which uses vectors. Second, in order for this 2^n to ever happen, the function needs to have 2^n instructions in it. Reduced to a mere chain of five instructions, I hope the optimization still fires on real-world code, such as your backwards-iterating example where the loop vectorizer emits redundant shuffles. I don't want to lose the point of this optimization.

Consider this code:

   A
 /    \
 \    /
   B
 /    \
 \    /
   C
 /    \
 \    /
   D
 /    \
 \    /
   E

Assume that A through E are all binary operators, and that each one uses the operator above it as both its LHS and its RHS.  If you scan recursively starting from E, you will scan D twice, C 4 times, B 8 times, and so on: the work doubles at every level.
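
To make the doubling concrete, here is a small Python sketch of the diamond chain above. It is only an illustration, not the code from the patch: BinOp, build_chain, scan, and count_visits are hypothetical names, and depth_limit here counts the nodes visited along one path, which may not match how the patch counts its limit.

```python
from collections import Counter


class BinOp:
    """A hypothetical binary operator node; lhs and rhs are its operands."""

    def __init__(self, name, lhs=None, rhs=None):
        self.name = name
        self.lhs = lhs
        self.rhs = rhs


def build_chain(names):
    """Build the chain from the diagram: each node uses the previous
    node as both its LHS and its RHS. Returns the last node."""
    prev = None
    for name in names:
        prev = BinOp(name, prev, prev)
    return prev


def scan(node, depth_limit, visits):
    """Naive recursive scan with no memoization, cut off by depth_limit."""
    if node is None or depth_limit == 0:
        return
    visits[node.name] += 1
    scan(node.lhs, depth_limit - 1, visits)
    scan(node.rhs, depth_limit - 1, visits)


def count_visits(names, depth_limit):
    """Scan from the last node of the chain and count visits per node."""
    visits = Counter()
    scan(build_chain(names), depth_limit, visits)
    return visits


if __name__ == "__main__":
    for limit in (3, 5):
        visits = count_visits("ABCDE", limit)
        print(limit, dict(visits), "total:", sum(visits.values()))
```

With a limit of n on a chain at least n nodes long, the scan makes 2^n - 1 visits in total, so the limit is what bounds the worst case, not the size of the function.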

If you think that 5 is too conservative then you can increase it, assuming that this is a fatal case. But I still care about the worst-case scenario. 

Thanks,
Nadav