[llvm] [LV][AArch64]: Utilise SVE ld4/st4 instructions via auto-vectorisation (PR #89018)

Paul Walker via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 18 02:51:53 PDT 2024


paulwalker-arm wrote:

> RISC-V has interleave loads for up to 8. So I guess we would need interleave5 and interleave7?

Yes, sorry. I guess I meant "Hopefully we can emulate all required interleave factors by only implementing specific intrinsics for factors that are prime numbers"? An alternative proposal is to have intrinsics for all factors but then lower them to sequences built from a smaller set of intrinsics within the InterleavedAccess pass or perhaps even SelectionDAGBuilder. I suppose this really depends on how awkward cost modelling the sequences turns out to be.
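
To make the prime-factor idea concrete, here is a minimal sketch on plain `std::vector`s. It is not LLVM IR and does not use any existing intrinsic. The names `deinterleave`, `smallestPrimeFactor` and `deinterleaveViaPrimes` are hypothetical and exist only for illustration. It shows that a factor-6 deinterleave can be built from a factor-2 step followed by factor-3 steps, so for factors up to 8 only the primitives for 2, 3, 5 and 7 would be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Reference behaviour: field j receives in[j], in[j + factor], ...
// This stands in for a single "deinterleaveN" primitive.
std::vector<Vec> deinterleave(const Vec &in, std::size_t factor) {
  assert(factor > 0 && in.size() % factor == 0);
  std::vector<Vec> fields(factor);
  for (std::size_t i = 0; i < in.size(); ++i)
    fields[i % factor].push_back(in[i]);
  return fields;
}

// Smallest prime factor of n, for n >= 2.
std::size_t smallestPrimeFactor(std::size_t n) {
  for (std::size_t p = 2; p * p <= n; ++p)
    if (n % p == 0)
      return p;
  return n;
}

// Emulation: every call to deinterleave() below uses a prime factor.
std::vector<Vec> deinterleaveViaPrimes(const Vec &in, std::size_t factor) {
  if (factor == 1)
    return {in};
  const std::size_t p = smallestPrimeFactor(factor);
  const std::size_t rest = factor / p;
  const std::vector<Vec> outer = deinterleave(in, p); // prime-factor step
  std::vector<Vec> result(factor);
  for (std::size_t j = 0; j < p; ++j) {
    // outer[j] holds in[p*k + j]; splitting it by `rest` yields the
    // original fields j, j + p, j + 2p, ...
    const std::vector<Vec> inner = deinterleaveViaPrimes(outer[j], rest);
    for (std::size_t m = 0; m < rest; ++m)
      result[m * p + j] = inner[m];
  }
  return result;
}
```

The sketch says nothing about cost. A composite factor becomes a tree of prime-factor steps plus the bookkeeping that puts the fields back in order, which is presumably where the cost-modelling awkwardness would show up.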

@efriedma-quic - Is your concern related to vectorisation or the costing of already vectorised code?

https://github.com/llvm/llvm-project/pull/89018

