[llvm] [CodeGen][PreISelIntrinsicLowering] Add VP-based lowering for memcpy/memmove/memset (PR #165585)

Wed Oct 29 09:34:34 PDT 2025

paulwalker-arm wrote:

I'm not sure we want this for AArch64:

- We tend to prefer having optimal library implementations of these functions.
- Not many SVE implementations exist with a vector length larger than NEON's, which makes NEON with its load/store pair instructions the better option for the constant-length variants.
- The ISA has introduced dedicated copy instructions, which are likely to be the preferred solution for "inlining" such calls.

So not a hard no, but unless there's a good production reason for doing this I'd rather be cautious.

FYI: The VP intrinsics are not really supported for SVE because they don't fit our model, where `vscale` is the runtime vector length, so we don't have the problem the VP intrinsics are solving. I do wish LLVM had general masked intrinsics for all the potentially faulting operations (to match the existing masked loads and stores), so this might change if the VP intrinsics become that standard across all vector types. So far, though, I've been resistant to adopting what is currently a single-target interface.
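
For readers less familiar with the distinction, here is a rough scalar C++ sketch of the two tail-handling styles being contrasted: an explicit-vector-length loop (roughly what VP intrinsics express) and a per-lane mask loop (roughly what masked loads and stores express). This is not code from the PR or from LLVM; `kLanes`, `copy_evl`, `copy_masked`, and `same_result` are hypothetical names, and the fixed lane count is an assumption standing in for a hardware vector length.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical lane count standing in for the hardware vector length in bytes.
constexpr std::size_t kLanes = 16;

// Explicit-vector-length style: each step passes an element count to the operation.
void copy_evl(std::uint8_t *dst, const std::uint8_t *src, std::size_t n) {
  for (std::size_t i = 0; i < n;) {
    std::size_t evl = std::min(kLanes, n - i);  // active element count for this step
    for (std::size_t l = 0; l < evl; ++l)       // models one length-limited load + store
      dst[i + l] = src[i + l];
    i += evl;
  }
}

// Mask style: each step builds a per-lane predicate and does a masked load + store.
void copy_masked(std::uint8_t *dst, const std::uint8_t *src, std::size_t n) {
  for (std::size_t i = 0; i < n; i += kLanes) {
    std::array<bool, kLanes> mask{};
    for (std::size_t l = 0; l < kLanes; ++l)
      mask[l] = (i + l) < n;                    // lane is active while in bounds
    for (std::size_t l = 0; l < kLanes; ++l)
      if (mask[l]) dst[i + l] = src[i + l];     // inactive lanes touch no memory
  }
}

// Runs both loops on the same input and checks that each reproduces the source.
bool same_result(std::size_t n) {
  std::vector<std::uint8_t> src(n), a(n, 0), b(n, 0);
  for (std::size_t i = 0; i < n; ++i)
    src[i] = static_cast<std::uint8_t>(i * 7 + 1);
  copy_evl(a.data(), src.data(), n);
  copy_masked(b.data(), src.data(), n);
  return a == src && b == src;
}
```

Both loops copy the same bytes; they differ only in whether the tail is described by a length or by a predicate, which is the modelling difference the comment above refers to.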

https://github.com/llvm/llvm-project/pull/165585