[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.
Alexandros Lamprineas via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Jul 19 15:54:51 PDT 2021
labrinea added a comment.
> struct foo { unsigned long long x[8]; };
> void store(int *in, void *addr)
> {
>
>   struct foo x = { in[0], in[1], in[4], in[16], in[25], in[36], in[49], in[64] };
>   __asm__ volatile ("st64b %0,[%1]" : : "r" (x), "r" (addr) : "memory" );
>
> }
For this particular example, if we pass the asm operands as i512, the compiler generates the following, which doesn't look bad:
    ldpsw   x2, x3, [x0]
    ldrsw   x4, [x0, #16]
    ldrsw   x5, [x0, #64]
    ldrsw   x6, [x0, #100]
    ldrsw   x7, [x0, #144]
    ldrsw   x8, [x0, #196]
    ldrsw   x9, [x0, #256]
    //APP
    st64b   x2, [x1]
    //NO_APP
Looking at the IR, it seems that SROA steps in: the optimized code loads the eight i32 values individually and constructs the i512 operand by performing bitwise operations on them. So I was wrong to say that the load of an i512 value won't get optimized.
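To make the effect concrete, here is a rough C model of what the optimized code appears to compute. It is only a sketch, not the actual IR: the names i512_model, kIndex, pack_i512 and byte_offset are hypothetical, and modelling the i512 as eight 64-bit lanes (lane 0 least significant) is my assumption about how the shift/or sequence lays out the value.

```c
#include <stdint.h>

/* Hypothetical model of the i512 asm operand: eight 64-bit lanes,
   lane 0 being the least significant one (assumed layout). */
struct i512_model { uint64_t lane[8]; };

/* Indices read by the quoted example: in[0], in[1], in[4], ..., in[64]. */
static const int kIndex[8] = { 0, 1, 4, 16, 25, 36, 49, 64 };

/* Build the 512-bit value lane by lane. In IR terms this would roughly
   correspond to extending each element and combining the results with
   shifts and ors (assumption, the exact IR is not shown in the thread). */
static struct i512_model pack_i512(const int *in)
{
    struct i512_model v;
    for (int k = 0; k < 8; ++k) {
        /* Sign-extend int to 64 bits, as ldrsw/ldpsw do, then store it as
           an unsigned 64-bit lane to match unsigned long long x[8]. */
        v.lane[k] = (uint64_t)(int64_t)in[kIndex[k]];
    }
    return v;
}

/* Byte offset from 'in' of the element that feeds lane k. */
static unsigned byte_offset(int k)
{
    return (unsigned)(kIndex[k] * sizeof(int));
}
```

With a 4-byte int, byte_offset yields 0, 4, 16, 64, 100, 144, 196 and 256 for lanes 0 through 7, which lines up with the immediates in the listing above (the ldpsw covers the first two elements).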
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94098/new/
https://reviews.llvm.org/D94098