[LLVMdev] Loop vectorizer
hfinkel at anl.gov
Wed Oct 17 00:43:26 PDT 2012
----- Original Message -----
> From: "Ralf Karrenberg" <Chareos at gmx.de>
> To: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 17, 2012 2:13:08 AM
> Subject: Re: [LLVMdev] Loop vectorizer
> Hi everybody,
> On 10/17/12 12:32 AM, Hal Finkel wrote:
> >>> Do you have a plan for xforms to increase the amount of
> >>> vectorization?
> >> Yes. We will need to implement a predication phase and to design
> >> the
> >> interaction with other loop transformations. Also, this will have
> >> to
> >> work well with the cost model. We also need to think of a good way
> >> to
> >> detect early on if the transformations are likely to be effective,
> >> because we currently don't have a good way of undoing compiler
> >> transformations.
> >> I think that a simple if-converter will be a good place to start.
> >> What
> >> do you think?
> > Quick comment: IIRC, Ralf Karrenberg has already implemented this
> > (as part of his WFV project:
> > https://github.com/karrenberg/wfv/tree/llvm_30). It might be
> > worthwhile to work on cleaning up his implementation instead of
> > starting from scratch.
> > -Hal
> WFV does indeed include phases that correspond to full control-flow
> to data-flow conversion (not just if-conversion, it can flatten all
> kinds of control flow including nested loops with multiple exits).
> I am currently working on a full re-implementation of WFV
> on top of the latest trunk.
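As a rough illustration of the simplest case of control-flow to data-flow conversion mentioned above (plain if-conversion), here is a minimal sketch in C++. The function names and the example computation are hypothetical and are not taken from WFV; the sketch only shows a branch being replaced by a mask and a select. It assumes both sides of the branch are free of side effects, which is what makes computing them unconditionally safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: the loop body contains control flow.
std::vector<int> scale_or_clamp_branchy(const std::vector<int>& in, int limit) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] > limit) {
      out[i] = limit;
    } else {
      out[i] = in[i] * 2;
    }
  }
  return out;
}

// If-converted form: both sides are computed unconditionally and a mask
// selects the result, so the body is straight-line code.
std::vector<int> scale_or_clamp_if_converted(const std::vector<int>& in, int limit) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const bool mask = in[i] > limit;   // the former branch condition
    const int then_value = limit;      // former "then" block
    const int else_value = in[i] * 2;  // former "else" block
    out[i] = mask ? then_value : else_value;  // select replaces the branch
  }
  return out;
}
```

Once the body is straight-line code, each statement can in principle be widened to operate on several lanes at once. Operations with side effects cannot simply be executed on both paths and would need to be guarded instead.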
> One part of it that is basically finished is an analysis pass that I
> call "vectorization analysis", which annotates a function (WFV works
> on entire functions) with metadata used during control-flow to data-flow
> conversion and instruction vectorization.
Is there a reason to use metadata here as opposed to just keeping state in the analysis pass?
> To give you a broad idea, this includes information like:
> - uniform/varying operation
> - same/consecutive/random index vector (for load/store)
> - aligned/unaligned index vector (for load/store)
> - operations that can not be vectorized (marked as "split", e.g.
> non-vectorizable types etc.)
> - operations that need to be split and guarded (e.g. unknown calls,
> - mandatory/optional blocks (renamed from "divergent"/"non-divergent"
> - divergent/non-divergent loops
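To make the same/consecutive/random distinction in the list above concrete, here is a small sketch in C++ that classifies the per-lane indices of one vector memory access. It inspects concrete index values at run time purely to illustrate what the three terms mean; it is not how the analysis derives them (that happens statically on the IR), and the names `IndexKind` and `classify_indices` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class IndexKind { Same, Consecutive, Random };

// Classify the indices that the lanes of one vector access would use.
// idx[lane] is the element index accessed by that lane.
IndexKind classify_indices(const std::vector<std::int64_t>& idx) {
  if (idx.size() < 2) return IndexKind::Same;
  bool same = true;
  bool consecutive = true;
  for (std::size_t lane = 1; lane < idx.size(); ++lane) {
    same = same && (idx[lane] == idx[0]);
    consecutive = consecutive &&
                  (idx[lane] == idx[0] + static_cast<std::int64_t>(lane));
  }
  if (same) return IndexKind::Same;                // all lanes hit one address
  if (consecutive) return IndexKind::Consecutive;  // lanes hit i, i+1, i+2, ...
  return IndexKind::Random;                        // no usable pattern
}
```

The classification matters for code generation: presumably a "same" access can become one scalar access plus a broadcast, a "consecutive" access can become a single vector load or store, and a "random" access falls back to a gather/scatter or per-lane scalar accesses.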
> Generally, it would be possible to implement a loop vectorizer on top
> of WFV simply by running a loop dependency analysis to determine if the
> loop in question is vectorizable, extracting the loop body into a
> function, running WFV on it, and inlining the call again.
I presume that we could refactor your code in combination with Nadav's work to directly vectorize loop bodies as well. Do you disagree?
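For reference, the outline/vectorize/inline route described in the quoted paragraph can be sketched by hand in C++ as follows. This is only an illustration of the shape of the result: the width `kWidth`, the function names, and the example computation are hypothetical, the per-lane loop in `body_vector` stands in for real vector instructions, and the scalar remainder loop is an assumption of this sketch rather than something stated in the thread. The dependency analysis that would have to prove the loop vectorizable is not shown, and `y` is assumed to be at least as long as `x`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kWidth = 4;  // hypothetical vectorization factor

// Step 1: the loop body, outlined into a scalar function.
inline float body_scalar(float x, float y) { return x * 2.0f + y; }

// Step 2: stand-in for the result of vectorizing that function as a whole;
// it processes kWidth instances of the body per call.
inline std::array<float, kWidth> body_vector(const std::array<float, kWidth>& x,
                                             const std::array<float, kWidth>& y) {
  std::array<float, kWidth> r{};
  for (std::size_t lane = 0; lane < kWidth; ++lane) {
    r[lane] = x[lane] * 2.0f + y[lane];
  }
  return r;
}

// Step 3: the loop, rewritten to call the wide body kWidth iterations at a
// time, with a scalar loop for the leftover iterations.
std::vector<float> run_vectorized(const std::vector<float>& x,
                                  const std::vector<float>& y) {
  std::vector<float> out(x.size());
  std::size_t i = 0;
  for (; i + kWidth <= x.size(); i += kWidth) {
    std::array<float, kWidth> vx{};
    std::array<float, kWidth> vy{};
    for (std::size_t lane = 0; lane < kWidth; ++lane) {
      vx[lane] = x[i + lane];
      vy[lane] = y[i + lane];
    }
    const std::array<float, kWidth> vr = body_vector(vx, vy);
    for (std::size_t lane = 0; lane < kWidth; ++lane) {
      out[i + lane] = vr[lane];
    }
  }
  for (; i < x.size(); ++i) {
    out[i] = body_scalar(x[i], y[i]);  // remainder iterations
  }
  return out;
}
```

In a compiler, the call to the wide body would be inlined back into the loop, so the outlined function only exists temporarily.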
> I am willing to provide all of my implementation as soon as required.
> I hope to have mostly finished the rewrite at that point.
I encourage you to do this as soon as possible; otherwise, I think that we might miss the opportunity to take advantage of your work in current development.
>  "Whole-Function Vectorization", Karrenberg and Hack, CGO'11
>  "Improving Performance of OpenCL on CPUs", Karrenberg and Hack,
Leadership Computing Facility
Argonne National Laboratory