[LLVMdev] How to vectorize a vector type cast?

Gurd, Preston preston.gurd at intel.com
Tue Feb 28 14:11:19 PST 2012

Since Clang does not seem to allow type casts between vector types, such as uchar4 to float4, it seems necessary to write them as element-by-element conversions, such as:

typedef float float4 __attribute__((ext_vector_type(4)));
typedef unsigned char uchar4 __attribute__((ext_vector_type(4)));

float4 to_float4(uchar4 in)
{
  float4 out = {in.x, in.y, in.z, in.w};
  return out;
}

Running this code through "clang -c -emit-llvm" and then through "opt -O2 -S" produces the following IR:

define <4 x float> @to_float4(i32 %in.coerce) nounwind uwtable readnone {
  %0 = bitcast i32 %in.coerce to <4 x i8>
  %1 = extractelement <4 x i8> %0, i32 0
  %conv = uitofp i8 %1 to float
  %vecinit = insertelement <4 x float> undef, float %conv, i32 0
  %2 = extractelement <4 x i8> %0, i32 1
  %conv2 = uitofp i8 %2 to float
  %vecinit3 = insertelement <4 x float> %vecinit, float %conv2, i32 1
  %3 = extractelement <4 x i8> %0, i32 2
  %conv4 = uitofp i8 %3 to float
  %vecinit5 = insertelement <4 x float> %vecinit3, float %conv4, i32 2
  %4 = extractelement <4 x i8> %0, i32 3
  %conv6 = uitofp i8 %4 to float
  %vecinit7 = insertelement <4 x float> %vecinit5, float %conv6, i32 3
  ret <4 x float> %vecinit7
}

This performs the cast as a sequence of scalar operations, whereas it could be done as a single vector conversion:

   %1 = uitofp <4 x i8> %0 to <4 x float>
   ret <4 x float> %1
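
At the source level, the same whole-vector form can be sketched with `__builtin_convertvector`, which is documented in Clang's language extensions and is also accepted by recent GCC. This is only an illustration under assumptions: the builtin is not assumed to exist in the compiler version discussed in this message, the typedefs use `vector_size` rather than the `ext_vector_type` typedefs above so the sketch builds with either compiler, and the names `to_float4_scalar`, `to_float4_builtin`, and `check_conversions_agree` are hypothetical. The builtin converts each lane as if by a C cast, and Clang would be expected (not verified here) to emit a single vector `uitofp` for it.

```c
#include <assert.h>

/* Assumption: vector_size typedefs stand in for the ext_vector_type
   typedefs used in the message, so that GCC can also build this. */
typedef float float4 __attribute__((vector_size(16)));
typedef unsigned char uchar4 __attribute__((vector_size(4)));

/* Element-by-element conversion, equivalent to the function above. */
float4 to_float4_scalar(uchar4 in)
{
  float4 out = {in[0], in[1], in[2], in[3]};
  return out;
}

/* Whole-vector conversion: each lane is converted as if by a C cast. */
float4 to_float4_builtin(uchar4 in)
{
  return __builtin_convertvector(in, float4);
}

/* Hypothetical self-check: both versions agree lane by lane
   for one sample input. */
void check_conversions_agree(void)
{
  uchar4 in = {0, 1, 128, 255};
  float4 a = to_float4_scalar(in);
  float4 b = to_float4_builtin(in);
  for (int i = 0; i < 4; ++i)
    assert(a[i] == b[i]);
}
```

Whether the element-by-element version is itself turned into the vector form still depends on the optimizer, which is the question asked here; the builtin only sidesteps it at the source level.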

It seemed to me that the recently committed basic block vectorizer might be able to do this kind of optimization, but the current version does not do so.

Is this optimization the kind of thing that the bb-vectorizer is intended to be able to do? And, if so, do you have any suggestions as to how that may be done? Or, if not, can you suggest another possible way to parallelize this kind of code?



Preston Gurd <preston.gurd at intel.com>
  Intel Waterloo
