[cfe-dev] Vectors with non-power-of-2 elements
Martin J. O'Riordan via cfe-dev
cfe-dev at lists.llvm.org
Mon Jan 4 08:52:49 PST 2016
We are experiencing a number of problems with handling vectors whose number
of elements is not a power-of-2, and in particular 3-element vectors. With
the following example:
    #include <stdio.h>

    typedef float __attribute__((ext_vector_type(3))) float3; // Clang Only
    // typedef float __attribute__((vector_size(12))) float3; // For GCC

    volatile float3 v3f32;

    int main() {
        float3 f3 = { 1.1f, 2.2f, 3.3f };
        // sizeof yields a size_t, so print it with %zu rather than %d
        printf("Sizeof 'float3' is %zu\n", sizeof(float3));
        v3f32 = f3; // Force a write
        f3 = v3f32; // Force a read
        return 0;
    }
'clang' reports the size as 16 bytes, and transfers the object to and from
memory as 16 bytes. Also, when 3-element vectors are passed through varargs,
I have to use 'va_arg' with the 4-element variant, or the compiler crashes
when validating the types.
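To make the size observation concrete, here is a minimal sketch of the padding arithmetic. It assumes the rule is "round the total byte size up to the next power of two", which matches the 16-byte result reported above but is an assumption on my part rather than documented behaviour. The helper names `next_pow2` and `padded_vector_bytes` are hypothetical.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n >= 1). */
static size_t next_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* ASSUMPTION: the storage size of a vector is its packed size
   (element size * element count) rounded up to a power of two. */
static size_t padded_vector_bytes(size_t elem_bytes, size_t num_elems) {
    return next_pow2(elem_bytes * num_elems);
}
```

Under that assumption, `padded_vector_bytes(sizeof(float), 3)` gives 16 rather than the packed 12, and a 5-element float vector would occupy 32 bytes rather than 20.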
We have no special code for handling 3-element vectors. I have since tried
this with the X86 binary distributions of 'clang' v3.5.2 and v3.7.0, and I
observe the same issue there as on our SHAVE target.
With 'gcc' and the 'vector_size' variant, I get an error complaining that
the number of bytes is not a power-of-2, but a comment in
'tools/clang/lib/Sema/SemaType.cpp' says:
    // Success! Instantiate the vector type, the number of elements is > 0, and
    // not required to be a power of 2, unlike GCC.
which would lead me to believe that 3-element vectors should be fine.
Is there something I have to describe in my target machine implementation or
target transform information that will allow 'float3' above to be 12 bytes,
and to be transferred to and from memory using 12-byte transfers? Or is this
a more general bug in the implementation? I have experimented with DataLayout
changes such as:
-v96:32
-v48:16
-v12:8
but this just results in crashes in LLVM.
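For comparison, 12-byte memory traffic can be forced by hand by keeping a packed three-float struct as the in-memory form and widening to a 4-lane vector only in registers, similar in spirit to OpenCL's vload3/vstore3. The sketch below is illustrative only: the names `float3_mem`, `float4_reg`, `load3` and `store3` are hypothetical, and it uses the GCC-style 'vector_size(16)' attribute so that it builds with both compilers.

```c
#include <assert.h>

/* In-register form: four float lanes, accepted by both Clang and GCC. */
typedef float float4_reg __attribute__((vector_size(16)));

/* Hypothetical in-memory form: exactly three floats, no padding lane. */
typedef struct {
    float x, y, z;
} float3_mem;

static_assert(sizeof(float3_mem) == 12, "float3_mem must be 12 bytes");

/* Read 12 bytes from memory; the fourth lane is filled with zero. */
static inline float4_reg load3(const float3_mem *p) {
    float4_reg v = { p->x, p->y, p->z, 0.0f };
    return v;
}

/* Write only the first three lanes back, so 12 bytes are stored. */
static inline void store3(float3_mem *p, float4_reg v) {
    p->x = v[0];
    p->y = v[1];
    p->z = v[2];
}
```

This costs an explicit conversion at every load and store, and it does nothing for varargs, so it is a stopgap rather than an answer to the question above.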
With the types of algorithms that are developed for our platform, 3-element
vectors are quite common. Less common, but still fairly frequent, are
5-element and 7-element vectors (pixel analysis and 2D convolutions).
OpenCL provides for 2-, 3-, 4-, 8- and 16-element vectors, but it is not
clear to me that the 3-element vector support for OpenCL is working either.
Longer term, it would be valuable to us if Clang/LLVM supported 3-, 5- and
7-element vectors as first-class citizens of the compiler (e.g. v3f32, v7i8,
etc.), but that is a topic for another day. For now I am happy if I can get
the 'v3X' types working.
Thanks,
MartinO - Movidius Ltd.