[PATCH] AArch64: big endian constant vector pools

Fri Apr 11 03:19:40 PDT 2014

>    @var =  global <4 x i16> <i16 32, i16 33, i16 34, i16 35>
>    define <4 x i16> @foo() {
>      %val = load <4 x i16>* @var
>      %diff = sub <4 x i16> %val, < i16 32, i16 33, i16 34, i16 35 >
>      ret <4 x i16> %diff
>    }
>
>   The problem here lies in the ldr load of the variable, which is then used as a vector (that would be a separate issue).

It doesn't: your other patches are canonicalising on using ldr/str
instead of ld1/st1 to define LLVM's register-internal representation
of a vector. Using the ldr is perfectly sound.
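
To make the distinction concrete, here is a rough C model (not LLVM
code) of what the two kinds of load do to a 64-bit vector register on
a big-endian target. The function names model_ldr_d, model_ld1_4h and
lane_of are made up for illustration, and lane 0 is taken to be the
least significant 16 bits of the register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of "ldr d0, [x0]" on big endian: one 64-bit load, so the byte
   at the lowest address becomes the most significant byte. */
static uint64_t model_ldr_d(const uint8_t mem[8]) {
    uint64_t reg = 0;
    for (size_t i = 0; i < 8; ++i)
        reg = (reg << 8) | mem[i];
    return reg;
}

/* Model of "ld1 {v0.4h}, [x0]" on big endian: four 16-bit loads, with
   the element at the lowest address placed in lane 0. */
static uint64_t model_ld1_4h(const uint8_t mem[8]) {
    uint64_t reg = 0;
    for (size_t lane = 0; lane < 4; ++lane) {
        uint64_t elt = ((uint64_t)mem[2 * lane] << 8) | mem[2 * lane + 1];
        reg |= elt << (16 * lane);
    }
    return reg;
}

/* Extract a 16-bit lane; lane 0 is the least significant 16 bits. */
static uint16_t lane_of(uint64_t reg, unsigned lane) {
    return (uint16_t)(reg >> (16 * lane));
}

/* The in-memory image of <i16 32, i16 33, i16 34, i16 35>. */
static void check_lane_order(void) {
    const uint8_t mem[8] = {0x00, 32, 0x00, 33, 0x00, 34, 0x00, 35};
    assert(lane_of(model_ldr_d(mem), 0) == 35);  /* lanes reversed */
    assert(lane_of(model_ld1_4h(mem), 0) == 32); /* lanes in order */
}
```

Either convention is self-consistent on its own; trouble only starts
when a value loaded one way is consumed as if it had been loaded the
other way.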

Also, you can't reverse the .hword entries in the global because
that's entirely IR-level, and we've decided that the element at index
0 always has the lowest address. It's easy to come up with IR examples
that assume this and break if you try that transformation.
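
As a sketch of the kind of breakage I mean, the C model below (again
illustrative only, with hypothetical helper names) lays out the global
both ways and then does a scalar big-endian i16 load by element index,
which is what IR that bitcasts @var to an i16 pointer and indexes it
would amount to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout the IR rule requires: element i at byte offset 2*i, each
   element stored big-endian. */
static void store_v4i16_be(uint8_t mem[8], const uint16_t elts[4]) {
    for (size_t i = 0; i < 4; ++i) {
        mem[2 * i] = (uint8_t)(elts[i] >> 8);
        mem[2 * i + 1] = (uint8_t)(elts[i] & 0xff);
    }
}

/* Hypothetical layout with the .hword entries emitted in reverse. */
static void store_v4i16_be_reversed(uint8_t mem[8], const uint16_t elts[4]) {
    for (size_t i = 0; i < 4; ++i) {
        mem[2 * i] = (uint8_t)(elts[3 - i] >> 8);
        mem[2 * i + 1] = (uint8_t)(elts[3 - i] & 0xff);
    }
}

/* Scalar big-endian i16 load of element "index" from the global. */
static uint16_t load_i16_be(const uint8_t *mem, size_t index) {
    return (uint16_t)((mem[2 * index] << 8) | mem[2 * index + 1]);
}

static void check_element_zero(void) {
    const uint16_t elts[4] = {32, 33, 34, 35};
    uint8_t mem[8];

    store_v4i16_be(mem, elts);
    assert(load_i16_be(mem, 0) == 32); /* index 0 at lowest address */

    store_v4i16_be_reversed(mem, elts);
    assert(load_i16_be(mem, 0) == 35); /* scalar access now sees 35 */
}
```

With the reversed layout the scalar load of element 0 returns 35
rather than 32, so any IR relying on the documented layout is
miscompiled.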

That leaves your suggested constpool change as the problem.

>   You may check the following c-test case that gets vectorized by clang, thus generating vector instructions:

This is rather a large example to illustrate a point on this scale. I
only see one block in SumArray vectorised by trunk Clang (-target
aarch64_be-linux-gnu -O3). It doesn't seem to contain any BUILD_VECTOR
nodes.

So at the moment, I'm afraid I can't tell you what the real issue with
that test case is. We're not helped by ld1/st1 still being generated.
The sooner that issue can be resolved, the happier I'll be that we're
all talking about the same thing.

Cheers.

Tim.