[LLVMdev] GSoC 2009: Auto-vectorization

Nick Lewycky nicholas at mxc.ca
Tue Mar 31 22:25:35 PDT 2009


Nick Lewycky wrote:
> Andreas Bolka wrote:
>> Hi all,
>>
>> I'd like to have a first stab at a loop-based auto-vectorization pass as
>> part of 2009's Google Summer of Code program. As far as I can tell from
>> searching the mailing list archives, no work on such an auto-vectorizer
>> seems to be currently in progress.
> 
> Hi Andreas,
> 
> Actually, you'd be the third person to try writing one, with myself 
> being the second. :) Don't let that discourage you, I don't think 
> anyone's actively working on it now.
> 
>> While auto-vectorization is a well-researched topic, complexity seems
>> to explode quickly once more advanced loop constructs are to be
>> supported. Therefore my aim is to start with the most minimal
>> implementation possible, to explore the difficulties encountered in
>> the specific context of LLVM and to build a foundation from which future
>> work can progress.
> 
> There are two types of autovectorization, SLP (superword level 
> parallelism) and ILP (instruction level parallelism). You can do ILP, 
> which is loop-ignorant autovectorization of the straight-line code 
> that runs inside the loop.

Pardon me. Owen Anderson has disabused me of what I wrote above.

SLP is loop-ignorant. ILP is a rather generic term and probably 
shouldn't be described as a type of autovectorization. Cross-iteration 
and intra-iteration better describe the distinction.

In any event, the paper I was thinking of when I wrote the above is:
http://www.cag.lcs.mit.edu/slp/SLP-PLDI-2000.pdf

My concern is that writing both loop dependence analysis and loop 
autovectorization would be difficult in the time span allotted for GSoC, 
so you may want to consider just doing SLP.

Nick

> 
>> So, initially, I aim at supporting only the simplest loops such as:
>>
>>     int a[256], b[256], c[256];
>>     for (int i = 0; i < 256; ++i)
>>       c[i] = a[i] + b[i];
>>
>> My goal is to implement the necessary analyses and transformations to
>> turn IR corresponding to such code into IR utilizing vector
>> instructions; i.e. the core of the desired result would look like:
>>
>>     %va = load <256 x i32>* %a
>>     %vb = load <256 x i32>* %b
>>     %vc = add <256 x i32> %va, %vb
>>     store <256 x i32> %vc, <256 x i32>* %c
>>
>> After this transformation, the code could be left as is (which would
>> generate a fully unrolled vector-add, as long as the vector length
>> doesn't exceed the lowering capabilities of the code generator), or
>> strip-mined into a loop using fixed-size vectors (preferably
>> corresponding to the SIMD register width of the target architecture).
>>
>> Auto-vectorization has numerous dependencies on other
>> analysis/transformation passes (loop normalization, constant/scalar
>> expansion, dependence analysis, etc). A significant part of the proposed
>> project is to evaluate already-available LLVM passes regarding their
>> applicability to support auto-vectorization and to extend/implement
>> what's missing.
> 
> There's no loop dependence analysis in LLVM yet. How are you planning to 
> implement it? Can you point to a paper, or write up a plain English 
> description?
> 
>> As for my personal background, I'm a graduate student at Vienna
>> University of Technology (TU Wien) with a strong background in compiler
>> theory, acquired in a wide variety of undergraduate- and graduate-level
>> courses. I recently implemented two simple compilers using LLVM and
>> conducted various experiments with LLVM's vector support. My general
>> interest in data-parallel computation stems from a great fondness for
>> APL (especially its modern offspring K). While I'm familiar with the
>> concepts behind
>> auto-vectorization, I haven't implemented an auto-vectorizer before. I'm
>> author of and contributor to several smaller open source projects.
>>
>> I appreciate any suggestions and would be very excited if someone is
>> interested in mentoring this.
> 
> I'm interested, but I'd like to see more details about exactly how 
> (i.e., with what algorithm) you intend to find and transform code first.
> 
> Nick
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
