[PATCH] D100499: [AArch64] Neon Polynomial vadd Intrinsic Fix
    David Spickett via Phabricator via cfe-commits 
    cfe-commits at lists.llvm.org
       
    Fri Apr 16 02:47:23 PDT 2021
    
    
  
DavidSpickett added a comment.
Both clang and GCC have their issues when it comes to matching the ACLE, so I wouldn't take the header guard as fact. It could be that we never implemented the A32 path for these functions, that the document was in flux when they were added, or that no one ever tried this on A32.
I think you could implement `vadd_p8` on A32 with:
  veor.u8 d0, d0, d1
I think we only show the A64 versions in the documentation. It's possible that what I've got above is a SIMD instruction but not an "Advanced SIMD" instruction, and that somehow doesn't count?
(caveat: I've mostly been making sure the function prototypes match the ACLE, not actually using these to do real work)
If I bodge the header to have vadd_p8 on Arm I get:
  $ cat /tmp/test.c
  #include <arm_neon.h>
  
  poly8x8_t test_vadd_p8(poly8x8_t a, poly8x8_t b) {
      return vadd_p8 (a, b);
  }
  $ ./bin/clang -target arm-arm-none-eabi -mcpu=cortex-a57 -S -o - /tmp/test.c -O3
  <...>
  test_vadd_p8:
          .fnstart
          vmov    d16, r0, r1
          vmov    d17, r2, r3
          veor    d16, d17, d16
          vmov    r0, r1, d16
          bx      lr
Which seems to confirm that a single `veor` is enough on A32, but I don't know why the intrinsic has been put behind the `__aarch64__` guard.
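For anyone wondering why a plain `veor` would be enough: polynomial addition over GF(2) has no carries, so adding two `poly8x8_t` values comes down to a lane-wise XOR. Below is a rough portable sketch of that reference behaviour. The names `ref_poly8x8_t` and `ref_vadd_p8` are made up for illustration and are not part of `arm_neon.h` or the ACLE; the struct is only a stand-in for the real vector type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for poly8x8_t: eight 8-bit polynomial lanes. */
typedef struct {
    uint8_t lane[8];
} ref_poly8x8_t;

/*
 * Hypothetical reference model of vadd_p8.
 * Each lane holds a polynomial over GF(2); adding two such polynomials
 * adds their coefficients modulo 2, which is a bitwise XOR with no carry
 * between bit positions.
 */
static ref_poly8x8_t ref_vadd_p8(ref_poly8x8_t a, ref_poly8x8_t b) {
    ref_poly8x8_t r;
    for (size_t i = 0; i < 8; ++i) {
        r.lane[i] = (uint8_t)(a.lane[i] ^ b.lane[i]);
    }
    return r;
}
```

Under this model, adding a value to itself gives all-zero lanes and adding zero leaves the value unchanged, which is what you would expect from an XOR and is consistent with the single `veor` in the output above.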
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100499/new/
https://reviews.llvm.org/D100499
    
    