[LLVMdev] Loop vectorizer

Wed Oct 17 00:43:26 PDT 2012

----- Original Message -----
> From: "Ralf Karrenberg" <Chareos at gmx.de>
> To: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 17, 2012 2:13:08 AM
> Subject: Re: [LLVMdev] Loop vectorizer
> 
> Hi everybody,
> 
> On 10/17/12 12:32 AM, Hal Finkel wrote:
> >>> Do you have a plan for xforms to increase the amount of
> >>> vectorization?
> >>
> >> Yes. We will need to implement a predication phase and to design the
> >> interaction with other loop transformations. Also, this will have to
> >> work well with the cost model. We also need to think of a good way to
> >> detect early on if the transformations are likely to be effective,
> >> because we currently don't have a good way of undoing compiler
> >> transformations.
> >>
> >> I think that a simple if-converter will be a good place to start.
> >> What do you think?
> >
> > Quick comment: IIRC, Ralf Karrenberg has already implemented this
> > (as part of his WFV project:
> > https://github.com/karrenberg/wfv/tree/llvm_30). It might be
> > worthwhile to work on cleaning up his implementation instead of
> > starting from scratch.
> >
> >   -Hal
> 
> WFV [1] does indeed include phases that correspond to full
> control-flow to data-flow conversion (not just if-conversion; it can
> flatten all kinds of control flow, such as nested loops with
> multiple exits).
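
For readers following along, here is a minimal sketch of what control-flow to data-flow conversion means in its simplest form (a single if/else). It is plain Python rather than LLVM IR, and the names scalar_kernel, select and if_converted_kernel are made up for illustration; it is not taken from the WFV code.

```python
def scalar_kernel(x):
    # Original control flow: one element at a time, with a branch.
    if x > 0:
        return x * 2
    else:
        return -x


def select(mask, a, b):
    # Lane-wise blend, analogous in spirit to a vector `select`.
    return [ai if m else bi for m, ai, bi in zip(mask, a, b)]


def if_converted_kernel(xs):
    # The branch condition becomes a per-lane mask.
    mask = [x > 0 for x in xs]
    # Both sides of the branch are computed for every lane...
    then_val = [x * 2 for x in xs]
    else_val = [-x for x in xs]
    # ...and the control dependence turns into a data dependence on the mask.
    return select(mask, then_val, else_val)


lanes = [3, -1, 0, 5]
result = if_converted_kernel(lanes)
```

Each lane of the result matches what the scalar version returns for the same input. This only works directly when both sides are free of side effects; operations such as stores would need guarding, which the sketch does not attempt.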
> 
> I am currently working on a full re-implementation of the WFV
> algorithm on top of the latest trunk.
> One part of it that is basically finished is an analysis pass that I
> call "vectorization analysis", which annotates a function (WFV works
> on entire functions) with metadata used during control-flow to
> data-flow conversion and instruction vectorization.

Is there a reason to use metadata here as opposed to just keeping state in the analysis pass?

> To give you a broad idea, this includes information like:
> - uniform/varying operation
> - same/consecutive/random index vector (for load/store)
> - aligned/unaligned index vector (for load/store)
> - operations that cannot be vectorized (marked as "split", e.g.
>   non-vectorizable types)
> - operations that need to be split and guarded (e.g. unknown calls,
>   stores)
> - mandatory/optional blocks (renamed from "divergent"/"non-divergent"
>   in [2])
> - divergent/non-divergent loops

Sounds great!
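
To make the first two kinds of marks concrete, here is a toy sketch of how uniform/varying information might be propagated through index arithmetic and then mapped to a same/consecutive/random classification. The lattice, the propagation rules and all names (classify, index_kind, env) are my own assumptions for illustration and are deliberately conservative; they are not the rules WFV actually uses.

```python
UNIFORM, CONSECUTIVE, VARYING = "uniform", "consecutive", "varying"


def classify(expr, env):
    """Classify an expression tree given marks for its leaves.

    expr is either a leaf name (str) or a tuple (op, lhs, rhs)
    with op in {"add", "mul"}. The rules are illustrative only.
    """
    if isinstance(expr, str):
        return env[expr]
    op, lhs, rhs = expr
    a, b = classify(lhs, env), classify(rhs, env)
    # Same value in every lane on both sides: result is uniform.
    if a == UNIFORM and b == UNIFORM:
        return UNIFORM
    # Adding a uniform offset keeps lanes consecutive.
    if op == "add" and {a, b} == {UNIFORM, CONSECUTIVE}:
        return CONSECUTIVE
    # Everything else is treated conservatively as varying.
    return VARYING


def index_kind(mark):
    # Map the mark of an address index to a load/store strategy.
    return {UNIFORM: "same", CONSECUTIVE: "consecutive", VARYING: "random"}[mark]


# Hypothetical leaves: two uniform values and a per-lane id.
env = {"base": UNIFORM, "n": UNIFORM, "lane": CONSECUTIVE}

same_idx = index_kind(classify(("mul", "base", "n"), env))
cons_idx = index_kind(classify(("add", "base", "lane"), env))
rand_idx = index_kind(classify(("mul", "lane", "n"), env))
```

Under these assumed rules, base * n is a "same" index (scalar load plus broadcast), base + lane is "consecutive" (one vector load), and lane * n falls back to "random" (gather/scatter), even though a smarter analysis could recognize it as strided.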

> 
> Generally, it would be possible to implement a loop vectorizer on top
> of WFV simply by running a loop dependency analysis to determine if
> the loop in question is vectorizable, extracting the loop body into a
> function, running WFV on it, and inlining the call again.

I presume that we could refactor your code in combination with Nadav's work to directly vectorize loop bodies as well. Do you disagree?
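
As a rough picture of the extract/vectorize/inline scheme described in the quoted paragraph, here is a small Python sketch. It assumes the dependency analysis has already shown that iterations are independent; the width W, the function names, and the scalar remainder loop are my own illustrative choices, and body_vec merely stands in for whatever a function vectorizer would produce.

```python
W = 4  # hypothetical vector width


def body(a_i, b_i):
    # Loop body extracted into a scalar function.
    return a_i * b_i + 1


def body_vec(a_lanes, b_lanes):
    # Stand-in for the vectorized body: handles W lanes per call.
    return [body(x, y) for x, y in zip(a_lanes, b_lanes)]


def scalar_loop(a, b):
    # The original loop, one iteration per element.
    return [body(a[i], b[i]) for i in range(len(a))]


def vectorized_loop(a, b):
    out = []
    n = len(a)
    main = n - n % W  # largest multiple of W not exceeding n
    # Main loop: step by W and call the vectorized body.
    for i in range(0, main, W):
        out.extend(body_vec(a[i:i + W], b[i:i + W]))
    # Scalar remainder for the last n % W iterations.
    for i in range(main, n):
        out.append(body(a[i], b[i]))
    return out


a = list(range(10))
b = list(range(10, 20))
vec_result = vectorized_loop(a, b)
```

With 10 elements and W = 4, the main loop runs twice and the remainder loop handles the last two elements; the output equals that of scalar_loop. Vectorizing the loop body in place would avoid the extract and inline steps but needs the same analysis information.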

> 
> I am willing to provide all of my implementation as soon as required.
> I hope to have mostly finished the rewrite at that point.

I encourage you to do this as soon as possible; otherwise, I think we might miss the opportunity to take advantage of your work in current development.

Thanks again,
Hal

> 
> Cheers,
> Ralf
> 
> 
> [1] "Whole-Function Vectorization", Karrenberg and Hack, CGO'11
> [2] "Improving Performance of OpenCL on CPUs", Karrenberg and Hack, CC'12

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory