[LLVMdev] Vectorization: Next Steps

Thu Feb 9 04:27:27 PST 2012

I have a very simple test case, a 4x4 matrix times a 4-vector, which is
unrolled correctly but is not vectorized by -bb-vectorize (I used LLVM 3.1svn).
I have attached the test case so you can see what goes wrong there.
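
The attachment is not reproduced in the archived text. A kernel of the
shape described (4x4 matrix times 4-vector) might look like the sketch
below. The function name mat4_mul_vec4, the float element type, and the
flat row-major layout are all assumptions; this is not the contents of
matrix.c.

```c
/* Hypothetical stand-in for the kind of kernel described above;
 * NOT the attached matrix.c. The matrix is stored row-major in 16 floats. */
void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
    for (int i = 0; i < 4; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < 4; ++j)
            sum += m[4 * i + j] * v[j];
        out[i] = sum;
    }
}
```

Once both loops are fully unrolled, the body is four independent dot
products in a single basic block, which is the kind of input a
basic-block vectorizer would be expected to look at.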

2012/2/3 Hal Finkel <hfinkel at anl.gov>

> As some of you may know, I committed my basic-block autovectorization
> pass a few days ago. I encourage anyone interested to try it out (pass
> -vectorize to opt or -mllvm -vectorize to clang) and provide feedback.
> Especially in combination with -unroll-allow-partial, I have observed
> some significant benchmark speedups, but I have also observed some
> significant slowdowns. I would like to share my thoughts, and hopefully
> get feedback, on next steps.
>
> 1. "Target Data" for vectorization - I think that in order to improve
> the vectorization quality, the vectorizer will need more information
> about the target. This information could be provided in the form of a
> kind of extended target data. This extended target data might contain:
>  - What basic types can be vectorized, and how many of them will fit
> into (the largest) vector registers
>  - What classes of operations can be vectorized (division, conversions /
> sign extension, etc. are not always supported)
>  - What alignment is necessary for loads and stores
>  - Is scalar-to-vector free?
>
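
To make the list in item 1 concrete, the extended target data could be
pictured as a small record that the vectorizer queries. This is only a
sketch: every name below (vec_target_info, vec_op, vec_lanes,
vec_can_vectorize) is hypothetical and does not correspond to an
existing LLVM interface, and basic types are reduced to an element
width in bits for brevity.

```c
#include <stdbool.h>

/* Hypothetical operation classes; illustrative only. */
enum vec_op { VEC_OP_ADD, VEC_OP_MUL, VEC_OP_DIV, VEC_OP_SEXT, VEC_OP_COUNT };

/* Hypothetical record of the target facts listed in item 1. */
struct vec_target_info {
    unsigned vector_reg_bits;            /* width of the largest vector register */
    bool     op_supported[VEC_OP_COUNT]; /* operation classes with a vector form */
    unsigned load_store_align_bytes;     /* alignment needed for vector loads/stores */
    bool     scalar_to_vector_free;      /* is scalar-to-vector free? */
};

/* Number of elements of the given width that fit in the largest register. */
unsigned vec_lanes(const struct vec_target_info *t, unsigned elem_bits)
{
    if (elem_bits == 0 || elem_bits > t->vector_reg_bits)
        return 1; /* does not fit: treat as one element at a time */
    return t->vector_reg_bits / elem_bits;
}

/* True if the operation has a vector form and more than one lane fits. */
bool vec_can_vectorize(const struct vec_target_info *t, enum vec_op op,
                       unsigned elem_bits)
{
    return t->op_supported[op] && vec_lanes(t, elem_bits) > 1;
}
```

With such a record, a target with 128-bit registers and no vector
division would report four lanes for 32-bit elements and would reject
VEC_OP_DIV regardless of element width.
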
> 2. Feedback between passes - We may need to implement a closer coupling
> between optimization passes than currently exists. Specifically, I have
> in mind two things:
>  - The vectorizer should communicate more closely with the loop
> unroller. First, the loop unroller should try to unroll to preserve
> maximal load/store alignments. Second, I think it would make a lot of
> sense to be able to unroll tentatively and keep the unrolled version in
> preference to the original only if this helps vectorization. With basic
> block vectorization, it is often necessary to (partially) unroll in
> order to vectorize. Even when we also have real loop vectorization,
> however, I still think that it will be important for the loop unroller
> to communicate with the vectorizer.
>  - After vectorization, it would make sense for the vectorization pass
> to request further simplification, but only on those parts of the code
> that it modified.
>
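
The remark in item 2 that basic-block vectorization often needs partial
unrolling can be illustrated at the source level. The sketch below
writes the unrolled form by hand; the function names and the unroll
factor of 4 are assumptions chosen for illustration, and whether a
given vectorizer actually fuses the four statements depends on alias
and alignment information not shown here.

```c
#include <stddef.h>

/* Original loop: one scalar add per iteration, so the loop body
 * (a single basic block) contains nothing to pair up. */
void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

/* Partially unrolled by 4: the body now holds four independent adds on
 * adjacent elements, a pattern a basic-block vectorizer can consider
 * fusing into one 4-wide vector add. */
void add_unrolled4(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i + 0] = a[i + 0] + b[i + 0];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; ++i) /* remainder iterations */
        dst[i] = a[i] + b[i];
}
```

If the four adds do not end up vectorized, the unrolled version has
only grown the code, which is the case where keeping the original loop
would be preferable.
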
> 3. Loop vectorization - It would be nice to have, in addition to
> basic-block vectorization, a more-traditional loop vectorization pass. I
> think that we'll need a better loop analysis pass in order for this to
> happen. Some of this was started in LoopDependenceAnalysis, but that
> pass is not yet finished. We'll need something like this to recognize
> affine memory references, etc.
>
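
For readers unfamiliar with the term in item 3, an affine memory
reference is one whose index is a linear function of the loop induction
variable. The sketch below contrasts one with an indirect access; the
function names are hypothetical and the code only illustrates the
distinction, not how any analysis pass works.

```c
#include <stddef.h>

/* Affine: the index 2*i + 1 is a linear function of i, so the stride
 * and the set of touched elements are known without running the loop. */
float sum_affine(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += a[2 * i + 1];
    return s;
}

/* Not affine: the index is loaded from memory, so the access pattern
 * is unknown until run time. */
float sum_indirect(const float *a, const size_t *idx, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += a[idx[i]];
    return s;
}
```

Both functions can read the same elements for a suitable idx array, but
only the first exposes that fact in a form a dependence analysis can
reason about statically.
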
> I look forward to hearing everyone's thoughts.
>
>  -Hal
>
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: matrix.c
Type: text/x-csrc
Size: 443 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120209/d8a6d21a/attachment.c>