[LLVMdev] SPMD Autovectorizer

Tue Jul 7 12:57:57 PDT 2015

On 7 July 2015 at 19:43, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote:
> I wouldn't, but simply utilize the parallel loop metadata that
> was originally designed for this purpose. What is done with
> that MD is up to other passes.

Yes, that's what I was suggesting. Sorry if that wasn't clear.
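
To make that concrete, here is a minimal C sketch, assuming an SPMD
kernel whose work-items have already been wrapped in a loop. The names
(vadd_work_items, gid) are made up for illustration, and "#pragma omp
simd" is only one way of asserting that the iterations are independent.
As far as I know, Clang with -fopenmp or -fopenmp-simd records that
assertion as parallel loop metadata on the loop's memory accesses, and
what happens next is up to whichever passes read it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical SPMD kernel body, c[gid] = a[gid] + b[gid], with the
 * work-items expressed as iterations of an ordinary loop. */
void vadd_work_items(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* The pragma asserts that iterations carry no dependences. When
     * OpenMP is disabled the compiler ignores it, so the loop is still
     * correct scalar code. */
#pragma omp simd
    for (size_t gid = 0; gid < n; ++gid)
        c[gid] = a[gid] + b[gid];
}
```

The point of the sketch is that the front-end only states the
independence; it does not pick a vector width or a cost model.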

Now, IIRC, OpenCL had a lot of trouble with odd-sized vector types in
the IR that the middle end would not understand, especially the
vectorizers. The solution, at least as of 2 years ago, was to
serialise everything and let the CL back-end vectorize it.
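
As a rough illustration of one reading of "serialise", here is a C
sketch in which a 3-wide vector operation is written out as scalar
operations per component, leaving any re-vectorization to a later
stage. The float3_t struct is only a stand-in for an OpenCL float3 and
scale_add3 is a hypothetical name; real implementations may serialise
differently.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an odd-sized (3-wide) vector type such as float3. */
typedef struct {
    float x, y, z;
} float3_t;

/* out[i] = in[i] * k + 1, written component by component so that no
 * 3-wide vector type ever has to be represented. */
void scale_add3(const float3_t *in, float3_t *out, float k, size_t n)
{
    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; ++i) {
        out[i].x = in[i].x * k + 1.0f;
        out[i].y = in[i].y * k + 1.0f;
        out[i].z = in[i].z * k + 1.0f;
    }
}
```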

Since CL back-ends are normally very different from each other, with
very different cost models and some secretive implementation details,
it's very hard to do that generically in the LLVM middle-end.

Also, if you have different domains (like many SIMD cores), sharing
the operations across cores and across lanes may be completely
different from, say, pthreads vs. AVX, so the model may not even apply
here. If you need write-back loops, non-trivial synchronization
barriers between cores and other crazy stuff, adding all that to the
vectorizer would bloat the code beyond usability. On the other hand,
maybe not.

I'd be interested in knowing what kind of changes we'd need to get the
OMP+SIMD model into CL-type code, if that's what you're proposing...

cheers,
--renato