[LLVMdev] Proposal: Generic auto-vectorization and parallelization approach for LLVM and Polly

Sun Jan 9 06:23:43 PST 2011

On 9 January 2011 00:41, Tobias Grosser <grosser at fim.uni-passau.de> wrote:
> I do not get this one? Why would you just use a part of Polly?

Oh, you can. It's just that you may not need to go through all of
Polly if OpenCL already has the vector semantics done in the front-end.

> Was I wrong in assuming that LLVM will, even today and without any
> special pass, generate correct code for the wide OpenCL vectors? For me
> Polly is just an optimization that could revisit the whole vectorization
> decision by looking at the big picture of the whole loop nest and
> generating a target-specific loop nest with target-specific
> vectorization (and OpenMP parallelisation).

I'm really not the OpenCL expert, but I hear that it's not as trivial
as one would think.

I know from generating NEON code in the front-end that any fiddling
with the semantics of the instructions can make the pattern-matching
algorithm fail, and you fall back to normal instructions.

I'm just trying to be cautious here and not raise false hopes, but
someone with more knowledge of OpenCL would know better.

> I have seen the AMD presentation and believe we can generate efficient
> vector code for GPUs. Obviously with some adaptions, however I am convinced
> this is doable.

Great! Even better than I thought! ;)

cheers,
--renato