[llvm-commits] [llvm] r171798 - in /llvm/trunk: lib/Transforms/Vectorize/LoopVectorize.cpp test/Transforms/LoopVectorize/X86/unroll-small-loops.ll

Mon Jan 7 22:35:18 PST 2013

On Jan 7, 2013, at 9:29 PM, Chris Lattner <clattner at apple.com> wrote:

> If we don't need a scalar cleanup loop (e.g. because the vectorization factor of a loop is known to evenly divide the constant tripcount), isn't it always beneficial to do the vectorization, even if the new tripcount is low?

Yes, I agree.  This is something that I haven't gotten to. 

I am now looking at a few examples where we vectorize and unroll too much, and I haven't found the right solution yet. Until now I have worked to increase the iteration 'width' by vectorizing and unrolling. In some cases we handle 32 floats in one iteration (v8f32, unrolled 4 times). I assumed that 'n' was high and that the cost of the scalar post-loop was negligible compared to the vectorized loop. But if we widen the loop to 32 elements, then the scalar post-loop may have to run as many as 31 scalar iterations. In some cases we only discover the length of the array at runtime, and it can be small. I think that reckless widening of loops is not the right way to go, but I am still not sure what to do.
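
To make the arithmetic concrete, here is a minimal sketch of how a trip count splits between the widened loop and the scalar post-loop. The names splitTripCount, needsScalarCleanup and WidenedLoopShape are hypothetical and are not taken from LoopVectorize.cpp; the sketch assumes the widened loop handles VF * UF elements per iteration and that everything left over runs in the scalar loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; not code from LoopVectorize.cpp.
struct WidenedLoopShape {
  uint64_t VectorIterations; // iterations of the widened (vectorized + unrolled) body
  uint64_t ScalarIterations; // iterations left over for the scalar post-loop
};

// Split TripCount between the widened loop and the scalar post-loop.
// VF is the vectorization factor and UF the unroll factor, so one widened
// iteration processes VF * UF elements.
inline WidenedLoopShape splitTripCount(uint64_t TripCount, unsigned VF,
                                       unsigned UF) {
  assert(VF > 0 && UF > 0 && "factors must be non-zero");
  const uint64_t Width = static_cast<uint64_t>(VF) * UF;
  return {TripCount / Width, TripCount % Width};
}

// True when some iterations are left for the scalar post-loop, i.e. when
// VF * UF does not evenly divide the trip count.
inline bool needsScalarCleanup(uint64_t TripCount, unsigned VF, unsigned UF) {
  return splitTripCount(TripCount, VF, UF).ScalarIterations != 0;
}
```

With VF = 8 and UF = 4, a trip count of 31 never enters the widened loop and runs all 31 iterations in the scalar loop, while a trip count of 64 runs two widened iterations and needs no cleanup.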

Thanks,
Nadav