[LLVMdev] Is there pass to break down <4 x float> to scalars

Pekka Jääskeläinen pekka.jaaskelainen at tut.fi
Fri Oct 25 07:14:22 PDT 2013


Hi,

Great to see someone working on this. It will benefit the performance
portability goal of pocl's OpenCL kernel compiler, where it has been one
of the low-hanging fruits for improving the applicability of implicit
work-group (WG) vectorization.

The use case there is that it sometimes makes sense to devectorize OpenCL
kernel code that explicitly uses vector datatypes, in order to create
better opportunities for "horizontal" vectorization across the work-items
of a work-group.

E.g., the last time I checked, the inner loop vectorizer (which pocl exploits)
just refused to vectorize loops that contain vector instructions. It might
not be so drastic with the SLP or the BB vectorizer, but in general it might
make sense to let the vectorizer decide how best to map the parallel
(scalar) operations to the vector hardware, and just help it with the
parallelism knowledge propagated from the parallel program.
One can then fall back to the original (hand-vectorized) code in case
the autovectorization fails, to still get some vector hardware
utilization.
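
As a rough illustration of the devectorization idea (a sketch only, not
pocl's actual code): below, the GCC/Clang vector extension type stands in
for OpenCL's float4, and WG_SIZE, the function names, and the memory layout
are assumptions made up for this example. The first function keeps the
explicit vector operation inside the work-item loop; the second computes
the same values with scalar operations only, which leaves the choice of
vector mapping across work-items to the vectorizer.

```c
#include <stddef.h>

#define WG_SIZE 8 /* hypothetical work-group size */

/* GCC/Clang vector extension; stands in for OpenCL's float4. */
typedef float float4 __attribute__((vector_size(16)));

/* Work-item loop with explicit vector code: the loop body operates on
   a whole four-lane vector per work-item. */
void wg_explicit_vector(const float4 *a, const float4 *b, float4 *out) {
    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        out[wi] = a[wi] * b[wi] + a[wi];
}

/* Same computation after devectorization: one scalar operation per lane,
   so the work-item loop contains no vector instructions. */
void wg_devectorized(const float *a, const float *b, float *out) {
    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane) {
            size_t i = 4 * wi + lane;
            out[i] = a[i] * b[i] + a[i];
        }
}

/* Runs both forms on the same small-integer data; returns 1 if every
   lane of every work-item matches, 0 otherwise. */
int forms_agree(void) {
    float4 va[WG_SIZE], vb[WG_SIZE], vout[WG_SIZE];
    float sa[4 * WG_SIZE], sb[4 * WG_SIZE], sout[4 * WG_SIZE];

    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane) {
            float x = (float)(4 * wi + lane);
            float y = (float)(lane + 1);
            va[wi][lane] = x; sa[4 * wi + lane] = x;
            vb[wi][lane] = y; sb[4 * wi + lane] = y;
        }

    wg_explicit_vector(va, vb, vout);
    wg_devectorized(sa, sb, sout);

    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane)
            if (vout[wi][lane] != sout[4 * wi + lane])
                return 0;
    return 1;
}
```

Whether a given vectorizer actually vectorizes the second form depends on
the LLVM version and the target, so treat this only as a picture of the
transformation.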

On 10/25/2013 04:15 PM, Richard Sandiford wrote:
> To be honest I hadn't really thought about targets with vector units
> at all. :-)  I was just assuming that we'd want to keep vector operations
> together if there's native support.  E.g. ISTR comments about not wanting
> to rewrite vec_selects because it can be hard to synthesise optimal
> sequences from a single canonical form.  But I might have got that wrong.
> Also, llvmpipe uses intrinsics for some things, so it might be strange
> if we decompose IR operations but leave the intrinsics alone.

The issue of intrinsics and vectorization was discussed some time ago.
There it might be better to devectorize to a scalar version of the
intrinsic (if available), since at least the loop vectorizer can also
vectorize a selected set of intrinsics, and the target might have direct
machine instructions for those (which could not be exploited easily from
"inlined" versions).
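
To picture the intrinsic case (again only a sketch under assumptions):
fabs4 below is a hypothetical stand-in for a target-specific vector
intrinsic, modeled in plain C so that the example runs, and
__builtin_fabsf plays the role of its scalar counterpart, which Clang
typically lowers to the llvm.fabs.f32 intrinsic. WG_SIZE and the function
names are made up for the example.

```c
#include <stddef.h>

#define WG_SIZE 8 /* hypothetical work-group size */

/* GCC/Clang vector extension; stands in for OpenCL's float4. */
typedef float float4 __attribute__((vector_size(16)));

/* Hypothetical vector "intrinsic": four-lane absolute value. A real one
   would be a target builtin; it is modeled here in plain C. */
float4 fabs4(float4 v) {
    float4 r = v;
    for (int i = 0; i < 4; ++i)
        r[i] = __builtin_fabsf(v[i]);
    return r;
}

/* Vector form: the work-item loop calls the vector intrinsic. */
void wg_vector_intrinsic(const float4 *in, float4 *out) {
    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        out[wi] = fabs4(in[wi]);
}

/* Devectorized form: one scalar intrinsic call per lane, leaving the
   mapping to vector instructions to the vectorizer. */
void wg_scalar_intrinsic(const float *in, float *out) {
    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane)
            out[4 * wi + lane] = __builtin_fabsf(in[4 * wi + lane]);
}

/* Runs both forms on the same data; returns 1 if all lanes match. */
int intrinsic_forms_agree(void) {
    float4 vin[WG_SIZE], vout[WG_SIZE];
    float sin_[4 * WG_SIZE], sout[4 * WG_SIZE];

    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane) {
            /* Mix of positive and negative inputs. */
            float x = (float)wi - 3.5f * (float)lane;
            vin[wi][lane] = x;
            sin_[4 * wi + lane] = x;
        }

    wg_vector_intrinsic(vin, vout);
    wg_scalar_intrinsic(sin_, sout);

    for (size_t wi = 0; wi < WG_SIZE; ++wi)
        for (size_t lane = 0; lane < 4; ++lane)
            if (vout[wi][lane] != sout[4 * wi + lane])
                return 0;
    return 1;
}
```

Which scalar intrinsics the loop vectorizer can handle depends on the LLVM
version, so fabs is used here only as a plausible example.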

-- 
Pekka



