[PATCH] Add a Scalarize pass

Sun Nov 10 11:05:55 PST 2013

Hello Nadav,

On 11/10/2013 08:06 PM, Nadav Rotem wrote:
> The proposed "scalarizer" pass is only useful for domain specific
> languages - as part of a non traditional optimization pipe.  For
> example, scalarization in LLVM IR allows re-vectorization in OpenCL.

Yep. The only requirement for a "DSL" in this discussion is
that it can describe vector computation which produces LLVM IR
vector instructions. Even standard C extended with vector datatypes
falls into this category.

> However, the current implementation of the pass is not very useful
> for OpenCL because it does not scalarize function calls (such as
> DOT). But maybe someone can add this missing functionality one day.

That was the idea: improve it gradually over time. However, the pass
is already useful for OpenCL horizontal vectorization. Not all kernels
use built-ins, at least not in all of the data parallel regions.

Scalarization of function calls is an improvement that reverses
how vectorizers vectorize intrinsics.

There are two cases here: standard-specified built-ins, which
should probably be LLVM intrinsics, and user functions. But that's
another story we can discuss separately.

> The scalarizer pass is not useful for the traditional optimization
> pipe because the LLVM codegen can already scalarize vectors. It
> happens automatically for targets that don’t support vectors. The
> vectorizer will not generate new vector instructions for processors
> with no vector instructions. People who use intrinsics or inline
> assembly are responsible for their optimizations and the compiler is
> not expected to save them when they port their code to targets that
> don’t support SIMD.

I used the confusing term "intrinsics" earlier, sorry.

What I meant was the case where the programmer has written explicit
vector code (e.g. using the Clang ext_vector_type attribute); the same
applies to any language that produces LLVM IR vector instructions.

Performance portability of such code can be improved because the
semantics of the LLVM IR vector instructions are well known.

Scalarization can be useful outside type legalization when the
programmer has written vector code that is suboptimal for the (new)
target at hand and we want to give it a better shot using
an autovectorizer.
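
To make the case concrete, here is a minimal sketch in C. It is only
an illustration, not output of the proposed pass: the names float4,
madd_vector, madd_scalar and check_madd_forms are made up for this
example, and the vector_size fallback for non-Clang compilers is my
assumption. madd_vector is the kind of explicit vector code I mean;
madd_scalar is a hand-written equivalent of the per-lane form that
scalarization would expose to an autovectorizer, which can then pick
a width that suits the target.

```c
#include <assert.h>

/* Explicit 4-wide float vector type. Clang's ext_vector_type is the
 * attribute mentioned above; the vector_size fallback is an assumption
 * made so the sketch also builds with GCC. */
#if defined(__clang__)
typedef float float4 __attribute__((ext_vector_type(4)));
#else
typedef float float4 __attribute__((vector_size(16)));
#endif

/* Programmer-written vector code: the width 4 is fixed in the source,
 * whether or not it suits the target. */
float4 madd_vector(float4 a, float4 b, float4 c) {
    return a * b + c;
}

/* Hand-written per-lane equivalent (hypothetical, for illustration).
 * Each lane is an independent scalar operation, so an autovectorizer
 * is free to re-vectorize at a different width. */
void madd_scalar(const float *a, const float *b, const float *c,
                 float *out) {
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] * b[i] + c[i];
}

/* Checks that both forms agree lane by lane on a small input and
 * returns the number of lanes compared. */
int check_madd_forms(void) {
    float4 va = {1.0f, 2.0f, 3.0f, 4.0f};
    float4 vb = {2.0f, 2.0f, 2.0f, 2.0f};
    float4 vc = {1.0f, 1.0f, 1.0f, 1.0f};
    float sa[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float sb[4] = {2.0f, 2.0f, 2.0f, 2.0f};
    float sc[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    float sr[4];
    int lanes = 0;

    float4 vr = madd_vector(va, vb, vc);
    madd_scalar(sa, sb, sc, sr);

    for (int i = 0; i < 4; ++i) {
        assert(vr[i] == sr[i]);
        ++lanes;
    }
    return lanes;
}
```

The inputs are small integers, so the comparison is exact even if the
compiler contracts the multiply and add into a fused operation.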

Target-specific intrinsics (in contrast to language-specific
intrinsics) and inline asm are a separate case, which I didn't
mean here.

-- 
--Pekka