[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

Pekka Jääskeläinen pekka.jaaskelainen at tut.fi
Fri Jan 25 09:16:19 PST 2013


On 01/25/2013 04:21 PM, Hal Finkel wrote:
> My point is that I specifically think that you should try it. I'm curious
> to see how what you come up with might apply to other use cases as well.

OK, attached is the first quick attempt towards this. I'm not
proposing committing this, but would like to get comments
to possibly move towards something committable.

It simply looks for metadata named 'parallel_for' on any of the
instructions in the loop's header and assumes the loop is parallel
if it finds any. This metadata is added by pocl's wiloops
generation routine. The patch passes the pocl test suite when enabled, but
it probably cannot vectorize many kernels yet, due (at least) to the missing
intra-kernel vector scalarizer.
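
To make the check described above concrete, here is a minimal,
self-contained sketch. It is not the attached patch and does not use the
real LLVM API: ToyInstruction, ToyLoop and isMarkedParallel are hypothetical
stand-ins invented for illustration. Only the 'parallel_for' marker name
comes from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction: an opcode plus the names of
// the metadata kinds attached to it. This is NOT llvm::Instruction.
struct ToyInstruction {
    std::string opcode;
    std::set<std::string> metadataKinds;
};

// Hypothetical stand-in for a loop: only the header block's instructions
// are modelled, since that is all the check inspects.
struct ToyLoop {
    std::vector<ToyInstruction> header;
};

// Returns true if at least one header instruction still carries the
// 'parallel_for' marker. The loop is then assumed to be parallel.
bool isMarkedParallel(const ToyLoop &loop) {
    return std::any_of(loop.header.begin(), loop.header.end(),
                       [](const ToyInstruction &inst) {
                           return inst.metadataKinds.count("parallel_for") != 0;
                       });
}
```

Because a single surviving marker is enough, the check tolerates optimizers
dropping the metadata from some of the header instructions, but not from all
of them.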

Some known problems that need addressing:

- Metadata can only be attached to Instructions (not Loops or even
   BasicBlocks), hence the brute-force approach of marking all
   instructions in the header BB in the hope that the optimizers
   retain at least one of them. A special intrinsic call, for example,
   might be a better solution.

- The loop header can potentially be shared between the levels of a
   multilevel loop nest, where the outer or inner levels might not be
   parallel. Not a problem in the pocl use case, as the wiloops are fully
   parallel at all three levels, but it needs to be sorted out in a
   general solution.

   Perhaps it would be better to attach the metadata to the iteration
   count increment/check instruction(s) or similar to better identify the
   parallel (for) loop in question.

- Are there optimizations that might push code *illegally* into the parallel
   loop from outside of it? If there is, e.g., a non-parallel loop inside
   a parallel loop, loop invariant code motion might move code from the
   inner loop to the parallel loop's body. That should be a safe optimization,
   to my understanding (it preserves the ordering semantics), but I wonder if
   there are others that might cause breakage.
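
The loop invariant code motion case in the last item can be illustrated at
the source level with a small sketch. The function names and the arithmetic
are made up for this example. The outer loop stands for the parallel loop
(its iterations are independent), the inner loop is sequential because acc
carries a dependence from one iteration to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before hoisting: k is recomputed inside the non-parallel inner loop.
std::vector<unsigned> runOriginal(const std::vector<unsigned> &in,
                                  unsigned scale, unsigned innerTrips) {
    std::vector<unsigned> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {    // parallel loop
        unsigned acc = in[i];
        for (unsigned j = 0; j < innerTrips; ++j) {  // sequential inner loop
            unsigned k = scale * 2u;                 // invariant in j
            acc = acc * 3u + k;
        }
        out[i] = acc;
    }
    return out;
}

// After hoisting: k has moved from the inner loop into the parallel loop's
// body. Each outer iteration still touches only its own acc and out[i].
std::vector<unsigned> runHoisted(const std::vector<unsigned> &in,
                                 unsigned scale, unsigned innerTrips) {
    std::vector<unsigned> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {    // parallel loop
        unsigned acc = in[i];
        unsigned k = scale * 2u;                     // hoisted computation
        for (unsigned j = 0; j < innerTrips; ++j) {  // sequential inner loop
            acc = acc * 3u + k;
        }
        out[i] = acc;
    }
    return out;
}
```

Both versions compute the same result for every i, and the hoisted code adds
no dependence between outer iterations, which is why this particular
transformation looks harmless for the parallel loop.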

-- 
Pekka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-3.3-loopvectorizer-parallel_for-metadata-detection.patch
Type: text/x-patch
Size: 1761 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130125/e4f8f53b/attachment.bin>

