[LLVMdev] GSoC 2009: Auto-vectorization
Chris Lattner
clattner at apple.com
Tue Mar 31 23:18:28 PDT 2009
On Mar 31, 2009, at 5:27 PM, Andreas Bolka wrote:
> Hi all,
> I'd like to have a first stab at a loop-based auto-vectorization
> pass as
> part of 2009's Google Summer of Code program. As far as I can tell
> from
> searching the mailing list archives, no work on such an auto-
> vectorizer
> seems to be currently in progress.
Hi Andreas,
This would be a very interesting project! The most important part of
the proposal is to break it down into small pieces.
> Whereas auto-vectorization is a well-researched topic, complexity
> seems
> to quickly explode once more advanced loop constructs are to be
> supported. Therefore my aim is to start with the most minimal
> implementation possible, to explore the difficulties encountered in
> the specific context of LLVM and to build a foundation from which
> future
> work can progress.
This is a great approach. I'm very much in favor of solving small
problems first, instead of trying to solve all the worlds issues at
once. :)
> So, initially, I aim at supporting only the simplest loops such as:
>
> int a[256], b[256], c[256];
> for (int i = 0; i < 256; ++i)
> c[i] = a[i] + b[i];
>
> My goal is to implement the necessary analyses and transformations to
> turn IR corresponding to such code into IR utilizing vector
> instructions;
Sounds great. Some important steps:
1. We need an abstract dependence analysis interface (like we have for
AA) and one implementation (probably handling trivial loops like the
one above to start with). Many people have talked about this, but no
code has gone in yet.
2. We need some interface that can be implemented by a target machine
to describe what the vector capabilities of the machine are.
3. We need the transformation code. Starting with a simple loop
vectorizor (as opposed to an SLP system) would make sense to me.
Once the basics are in place, it can be feature-crept to support new
idioms (reductions, alignment analysis, etc). The first cut doesn't
need to provide an immediate speedup, but should have many testcases
that demonstrates the loop forms it can handle.
> i.e. the core of the desired result would look like:
>
> %va = load <256 x i32>* %a
> %vb = load <256 x i32>* %b
> %vc = add <256 x i32> %a, %b
> store <256 x i32> %vc, <256 x i32>* %c
Actually, IMO this is probably not the directly we want to go.
Vectorizing to extremely wide vectors like this isn't along the lines
of what we want: we should get something like the original loop
unrolled 4 times to do operations on <4 x i32>'s (for x86/ppc).
> Auto-vectorization has numerous dependencies on other
> analysis/transformation passes (loop normalization, constant/scalar
> expansion, dependence analysis, etc). A significant part of the
> proposed
> project is to evaluate already-available LLVM passes regarding their
> applicability to support auto-vectorization and to extend/implement
> what's missing.
Sounds good. I'm also fine with it only working in carefully
controlled circumstances. Depending on how ambitious you are, you
could also do a GSoC project just around a) building dependence
analysis interface + some *non*-trivial implementations, and b) build
some dep analysis clients around it (e.g. loop interchange, etc).
This may be a less risky and higher impact project in the short term,
if you are interested in it.
> As for my personal background, I'm a graduate student at Vienna
> University of Technology (TU Wien) with a strong background in
> compiler
> theory, acquired in a wide variety of undergraduate- and graduate-
> level
> courses. I recently implemented two simple compilers LLVM and
> conducted
> various experiments using LLVM's vector support. My general interest
> in
> data-parallel computation stems from a great fondness of APL
> (especially
> of its modern offspring K). While I'm familiar with the concepts
> behind
> auto-vectorization, I haven't implemented an auto-vectorizer before.
> I'm
> author of and contributor to several smaller open source projects.
Great! Either approach (building a very simple auto-vec system from
end-to-end, or building some non-trivial dep analysis implementations
+ a client or two) sound like very promising GSoC projects!
-Chris
More information about the llvm-dev
mailing list