[llvm] r182976 - Reapply with r182909 with a fix to the calculation of the new indices for
Nadav Rotem
nrotem at apple.com
Mon Jun 3 16:33:15 PDT 2013
Hi Nick,
Thanks for working on this. I am glad you got it in and, as you said, I think it can improve vector code generation.
> So I've updated it to 5 as you asked, but two points. LLVM has made a trade-off in many places (loop unrolling and the inliner's cost analysis for instance) that we spend more compile time optimizing code which uses vectors. Second, in order for this 2^n to ever happen, the function needs to have 2^n instructions in it. Reduced to a mere chain of five instructions, I hope the optimization still fires on real-world code, such as your backwards-iterating example where the loop vectorizer emits redundant shuffles. I don't want to lose the point of this optimization.
Consider this code:
     A
    / \
    \ /
     B
    / \
    \ /
     C
    / \
    \ /
     D
    / \
    \ /
     E
Assume that A through E are all binary operators and that, in each one, both the LHS and the RHS use the same value (the node above it). If you start scanning recursively from E, you will scan D twice, C 4 times, B 8 times, and A 16 times.
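To make the growth concrete, here is a minimal sketch that counts the visits of a naive recursive scan over such a chain. It is not the code from the patch: `Node`, `buildChain`, and `countVisits` are hypothetical names made up for illustration, and the way the depth limit is applied (the starting node counts as depth 1) is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a binary operator; not an LLVM class.
struct Node {
  const Node *LHS = nullptr;
  const Node *RHS = nullptr;
};

// Builds a chain of Length nodes in which every node after the first uses
// the previous node as both its LHS and its RHS. The last element plays the
// role of E in the diagram, the first element the role of A.
std::vector<Node> buildChain(unsigned Length) {
  assert(Length > 0 && "chain needs at least one node");
  std::vector<Node> Chain(Length); // sized up front, so pointers stay valid
  for (unsigned I = 1; I < Length; ++I) {
    Chain[I].LHS = &Chain[I - 1];
    Chain[I].RHS = &Chain[I - 1];
  }
  return Chain;
}

// Counts node visits made by a naive recursive scan with no memoization.
// DepthLimit is the assumed recursion cap: the starting node is depth 1,
// and nothing deeper than DepthLimit is visited.
std::uint64_t countVisits(const Node *N, unsigned DepthLimit) {
  if (!N || DepthLimit == 0)
    return 0;
  return 1 + countVisits(N->LHS, DepthLimit - 1) +
         countVisits(N->RHS, DepthLimit - 1);
}
```

For the five-node chain above, `countVisits` starting at E returns 1 + 2 + 4 + 8 + 16 = 31. On a 20-node chain, a limit of 5 still gives 31 visits, while a limit of 20 gives 2^20 - 1 = 1048575, which is the worst case the limit is meant to rule out.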
If you think that 5 is too conservative then you can increase it, assuming that this is a fatal case. But I still care about the worst-case scenario.
Thanks,
Nadav