[llvm] [LV][AArch64]: Utilise SVE ld4/st4 instructions via auto-vectorisation (PR #89018)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 18 02:51:53 PDT 2024
paulwalker-arm wrote:
> RISC-V has interleave loads for up to 8. So I guess we would need interleave5 and interleave7?
Yes, sorry. I guess I meant "Hopefully we can emulate all required interleave factors by only implementing specific intrinsics for factors that are prime numbers". An alternative proposal is to have intrinsics for all factors but then lower them to sequences built from a smaller set of intrinsics within the InterleavedAccess pass or perhaps even SelectionDAGBuilder. I suppose this really depends on how awkward cost modelling the sequences turns out to be.
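To illustrate the prime-factor idea, here is a minimal sketch in Python that models only the element ordering, not any LLVM intrinsic or lowering. The helper names (`deinterleave`, `prime_factors`, `deinterleave_composed`) are hypothetical and exist purely for this example. It shows that a deinterleave by a composite factor (4, 6, 8) can be built from repeated deinterleaves by its prime factors, provided the resulting streams are put back in the right order.

```python
def deinterleave(values, factor):
    """Split values into `factor` streams; stream i holds elements i, i+factor, ..."""
    if len(values) % factor != 0:
        raise ValueError("length must be a multiple of the factor")
    return [values[i::factor] for i in range(factor)]


def prime_factors(n):
    """Return the prime factors of n in ascending order, with multiplicity."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def deinterleave_composed(values, factor):
    """Emulate deinterleave(values, factor) using only prime-factor steps."""
    streams = [values]
    stride = 1  # streams[a] currently holds original elements a, a+stride, ...
    for p in prime_factors(factor):
        next_streams = [None] * (stride * p)
        for a, stream in enumerate(streams):
            for b, part in enumerate(deinterleave(stream, p)):
                # part holds original elements (a + stride*b) + (stride*p)*k
                next_streams[a + stride * b] = part
        streams = next_streams
        stride *= p
    return streams
```

For factor 4 this performs one deinterleave by 2 followed by a deinterleave by 2 on each half, so three prime-factor operations replace one factor-4 operation. Factors 5 and 7 cannot be decomposed this way, which is why they would need their own primitives. The sketch says nothing about cost, which is the open question raised above.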
@efriedma-quic - Is your concern related to vectorisation or the costing of already vectorised code?
https://github.com/llvm/llvm-project/pull/89018