[LLVMdev] NEON intrinsics preventing redundant load optimization?

Simon Taylor simontaylor1 at ntlworld.com
Mon Jan 5 02:14:53 PST 2015


On 4 Jan 2015, at 21:06, Tim Northover <t.p.northover at gmail.com> wrote:

>>> I’ve managed to replace the load/store intrinsics with pointer dereferences (along with a typedef to get the alignment correct). This generates exactly the same IR and asm as the auto-vectorized C version (both using -O3), and works with the toolchain in the latest Xcode. Are there any concerns around doing this?
>> 
>> My view is that you should only use intrinsics where the language has
>> no semantics for it. Since this is not the case, using pointers is
>> probably the best way, anyway.
> 
> I think dereferencing pointers is explicitly discouraged in the
> documentation for portability reasons. It may well have issues on
> wrong-endian targets.

The ARM ACLE docs recommend against the GCC extension that allows an initializer list because of potential endianness issues:
float32x4_t values = {1, 2, 3, 4};

I don’t recall seeing anything about pointer dereferencing, but it may have the same issues. I’m a bit hazy on endianness issues with NEON anyway (element numbering, casts between types, etc.), but all the smartphone platform ABIs seem to be defined as little-endian, so I haven’t spent too much time worrying about it.
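For reference, here is a minimal sketch of the pointer-dereference approach discussed above. It is not the actual code from my project. To keep it buildable on non-ARM hosts it uses a GCC/Clang generic vector type as a stand-in for float32x4_t (an assumption), and the names f32x4_u, add_arrays and first_lane are hypothetical. On an ARM target the typedef would presumably be built on float32x4_t from arm_neon.h instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for NEON's float32x4_t, built with GCC/Clang vector extensions.
 * aligned(4): only promise float alignment, so dereferencing a plain
 *             float pointer is not assumed to be 16-byte aligned.
 * may_alias:  permit accessing float arrays through this vector type. */
typedef float f32x4_u __attribute__((vector_size(16), aligned(4), may_alias));

/* dst[i] = a[i] + b[i], four floats at a time, using pointer dereferences
 * rather than load/store intrinsics. n must be a multiple of 4. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        f32x4_u va = *(const f32x4_u *)(a + i);   /* vector load  */
        f32x4_u vb = *(const f32x4_u *)(b + i);
        *(f32x4_u *)(dst + i) = va + vb;          /* vector store */
    }
}

/* Returns lane 0 of the vector loaded from p. Whether lane numbering
 * behaves the same way for NEON types on a big-endian target is exactly
 * the open question here; this sketch only covers the little-endian case. */
float first_lane(const float *p)
{
    f32x4_u v = *(const f32x4_u *)p;
    return v[0];
}
```

On a little-endian target I would expect first_lane(p) to return p[0]. I have not checked what the equivalent NEON code does on a big-endian one.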

Simon




