[LLVMdev] GSoC 2009: Auto-vectorization

Tue Mar 31 17:27:37 PDT 2009

Hi all,

I'd like to have a first stab at a loop-based auto-vectorization pass as
part of 2009's Google Summer of Code program. As far as I can tell from
searching the mailing list archives, no work on such an auto-vectorizer
seems to be currently in progress.

Whereas auto-vectorization is a well-researched topic, complexity seems
to quickly explode once more advanced loop constructs are to be
supported. Therefore my aim is to start with the most minimal
implementation possible, to explore the difficulties encountered in
the specific context of LLVM and to build a foundation from which future
work can progress.

So, initially, I aim at supporting only the simplest loops such as:

    int a[256], b[256], c[256];
    for (int i = 0; i < 256; ++i)
      c[i] = a[i] + b[i];

My goal is to implement the necessary analyses and transformations to
turn IR corresponding to such code into IR utilizing vector
instructions; i.e. the core of the desired result would look like:

    %va = load <256 x i32>* %a
    %vb = load <256 x i32>* %b
    %vc = add <256 x i32> %a, %b
    store <256 x i32> %vc, <256 x i32>* %c

After this transformation, the code could be left as is (which would
generate a fully unrolled vector-add, as long as the vector length
doesn't exceed the lowering capabilities of the code generator), or
strip-mined into a loop using fixed-size vectors (preferably
corresponding to the SIMD register width of the target architecture).

Auto-vectorization has numerous dependencies on other
analysis/transformation passes (loop normalization, constant/scalar
expansion, dependence analysis, etc). A significant part of the proposed
project is to evaluate already-available LLVM passes regarding their
applicability to support auto-vectorization and to extend/implement
what's missing.

As for my personal background, I'm a graduate student at Vienna
University of Technology (TU Wien) with a strong background in compiler
theory, acquired in a wide variety of undergraduate- and graduate-level
courses. I recently implemented two simple compilers LLVM and conducted
various experiments using LLVM's vector support. My general interest in
data-parallel computation stems from a great fondness of APL (especially
of its modern offspring K). While I'm familiar with the concepts behind
auto-vectorization, I haven't implemented an auto-vectorizer before. I'm
author of and contributor to several smaller open source projects.

I appreciate any suggestions and would be very excited if someone is
interested in mentoring this.

-- 
Best regards,
Andreas Bolka