[LLVMdev] Vectorization: Next Steps
r.jordans at tue.nl
Thu Feb 9 02:04:10 PST 2012
On 02/09/2012 02:26 AM, Chris Lattner wrote:
>>> I think that a loop vectorizer and a basic block vectorizer both make perfect sense and are important for different classes of code. However, I don't think that we should go down the path of trying to use a "basic block vectorizer + loop unrolling" to serve the purpose of a loop vectorizer. Trying to make a BBVectorizer and a loop unroller play together will be really fragile, because they'll both have to duplicate the same metrics (otherwise, for example, you'd unroll a loop that isn't vectorizable). This will also be a huge hit to compile time.
>> The only problem with this comes from loops for which unrolling is
>> necessary to expose vectorization because the memory access pattern is
>> too complicated to model in more-traditional loop vectorization. This
>> generally is useful only in cases with a large number of flops per
>> memory operation (or maybe integer ops too, but I have less experience
>> with those), so maybe we can design a useful heuristic to handle those
>> cases. That having been said, unroll+(failed vectorize)+rollback is not
>> really any more expensive at compile time than unroll+(failed vectorize)
>> except that the resulting code would run faster (actually it is cheaper
>> to compile because the optimization/compilation of the unvectorized
>> unrolled loop code takes longer than the non-unrolled loop). There might
>> be a clean way of doing this; I'll think about it.
> I don't really understand the issue here, can you elaborate on when this might be a win? I really don't like "speculatively unroll, try to do something, then reroll". That is terrible for compile time and just strikes me as poor design :-)
This seems a bit related to Resource-Directed Loop Pipelining (RDLP) to me.
RDLP uses loop unrolling in combination with loop shifting (or peeling)
to map a loop body onto a parallel architecture. It was originally focused
on VLIW-like parallelism, but I think that a similar technique may be
useful for vectorization.
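
As a rough illustration of what shifting plus unrolling does to a loop body, here is a hand-written toy sketch in C. It is not taken from the RDLP work and is not how RDLP or any LLVM pass implements the transformation; the loop, the function names (original, shifted_unrolled, results_match) and the unroll factor of two are all made up for this example, and the arrays are assumed not to overlap. In the original loop S2 depends on S1 from the same iteration. After shifting S2 by one iteration and unrolling by two, the body contains S1(i), S1(i+1) and S2(i-1), which do not depend on each other and could in principle be issued in parallel or packed together.

```c
#include <string.h>

enum { MAX_N = 16 };

/* Original loop: S2 consumes the value S1 produced in the same iteration. */
static void original(const int *a, int *b, int *c, int n) {
    for (int i = 0; i < n; ++i) {
        b[i] = a[i] * 2;   /* S1 */
        c[i] = b[i] + 1;   /* S2, depends on S1 */
    }
}

/* Same computation with S2 shifted by one iteration, then unrolled by two.
 * Assumes a, b and c do not alias. */
static void shifted_unrolled(const int *a, int *b, int *c, int n) {
    if (n <= 0)
        return;

    b[0] = a[0] * 2;                 /* prologue: S1 of iteration 0 */

    int i = 1;
    for (; i + 1 < n; i += 2) {
        b[i]     = a[i] * 2;         /* S1(i)   */
        b[i + 1] = a[i + 1] * 2;     /* S1(i+1) */
        c[i - 1] = b[i - 1] + 1;     /* S2(i-1), independent of the two above */
        c[i]     = b[i] + 1;         /* S2(i), still depends on S1(i) */
    }
    for (; i < n; ++i) {             /* remainder of the shifted loop */
        b[i]     = a[i] * 2;
        c[i - 1] = b[i - 1] + 1;
    }

    c[n - 1] = b[n - 1] + 1;         /* epilogue: S2 of the last iteration */
}

/* Returns 1 if both versions produce identical b and c for a given n
 * (0 <= n <= MAX_N). */
static int results_match(int n) {
    int a[MAX_N], b1[MAX_N], c1[MAX_N], b2[MAX_N], c2[MAX_N];
    for (int i = 0; i < n; ++i)
        a[i] = 3 * i - 7;
    original(a, b1, c1, n);
    shifted_unrolled(a, b2, c2, n);
    return memcmp(b1, b2, (size_t)n * sizeof(int)) == 0 &&
           memcmp(c1, c2, (size_t)n * sizeof(int)) == 0;
}
```

The prologue and epilogue are the "peeling" part: they handle the S1 and S2 instances that fall outside the shifted loop. Whether the resulting body is actually profitable to vectorize would still depend on the target and on the cost metrics discussed above.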