[PATCH] D100499: [AArch64] Neon Polynomial vadd Intrinsic Fix
David Spickett via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 16 02:47:23 PDT 2021
DavidSpickett added a comment.
Both clang and GCC have their issues when it comes to matching the ACLE, so I wouldn't take the header guard as fact. It could be that we never implemented the A32 path for these functions, that the document was in flux when they were added, or that no one ever tried this on A32.
I think you could implement `vadd_p8` on A32 with:
veor.u8 d0, d0, d1
I think we just show A64 versions in the documentation. It's possible that what I've got above is a SIMD instruction but not an "Advanced SIMD" instruction, and that somehow doesn't count?
(caveat: I've mostly been making sure the function prototypes match the ACLE, not actually using these to do real work)
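For anyone wondering why a plain `veor` should be enough here: as far as I understand it, the poly8 lanes are polynomials with coefficients in GF(2), so adding them has no carries and comes down to a bitwise XOR per lane. Below is a minimal scalar sketch of that idea. The names `poly8_add` and `poly8x8_add_model` are made up for illustration; this is not the `arm_neon.h` implementation, just a model of what I'd expect `vadd_p8` to compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one poly8 lane. Coefficients are in GF(2),
   so addition has no carries and is a bitwise XOR. */
static inline uint8_t poly8_add(uint8_t a, uint8_t b) {
    return (uint8_t)(a ^ b);
}

/* Hypothetical model of vadd_p8 on an 8-lane vector: XOR lane by lane.
   Assumption: this mirrors the intrinsic's semantics, not its codegen. */
static void poly8x8_add_model(const uint8_t a[8], const uint8_t b[8],
                              uint8_t out[8]) {
    for (size_t i = 0; i < 8; ++i)
        out[i] = poly8_add(a[i], b[i]);
}
```

One consequence of this model is that adding a vector to itself gives all zeros, which is a handy sanity check when comparing against real hardware output.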
If I bodge the header to have vadd_p8 on Arm I get:
$ cat /tmp/test.c
#include <arm_neon.h>
poly8x8_t test_vadd_p8(poly8x8_t a, poly8x8_t b) {
  return vadd_p8(a, b);
}
$ ./bin/clang -target arm-arm-none-eabi -mcpu=cortex-a57 -S -o - /tmp/test.c -O3
<...>
test_vadd_p8:
        .fnstart
        vmov    d16, r0, r1
        vmov    d17, r2, r3
        veor    d16, d17, d16
        vmov    r0, r1, d16
        bx      lr
Which seems to confirm that a single `veor` does the job on A32, but I don't know why the intrinsic is put behind the `__aarch64__` guard.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100499/new/
https://reviews.llvm.org/D100499