[compiler-rt] [AArch64][compiler-rt] Add memcpy, memset, memmove, memchr builtins. (PR #77496)

Sander de Smalen via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 26 05:19:15 PST 2024


sdesmalen-arm wrote:

> > Add naive implementation of memcpy, memset, memmove, memchr for SME targets.
> 
> Do we have an idea of the expected performance differences here in the presence of SME?

This PR adds the `__arm_sc_`-prefixed streaming-compatible variants required by the [SME ACLE](https://github.com/ARM-software/acle/blob/main/main/acle.md#streaming-compatible-versions-of-standard-routines). They are written with a focus on functional correctness rather than performance. They're more of a starting point right now, but we definitely want to optimise these routines going forward.

We need streaming-compatible variants because the regular memcpy/memset/etc. routines are not streaming-compatible. In practice this means they might be implemented using NEON instructions, so the compiler has to ensure that every call to them results in a `smstop sm -> bl memcpy -> smstart sm` sequence when the caller is executing in Streaming-SVE mode. With the `__arm_streaming_compatible` attribute on these functions, the compiler ensures that the implementation contains no incompatible (e.g. NEON) instructions and that calls to functions with a streaming-compatible interface (e.g. `__arm_sc_memcpy`) don't toggle the streaming mode around the call.
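To make the idea concrete, here is a minimal sketch of what a naive, correctness-first streaming-compatible copy could look like. This is not the code from the PR: the function name `sketch_sc_memcpy` and the `SKETCH_SC` macro are hypothetical, and the guard on `__ARM_FEATURE_SME` is an assumption made only so the sketch also builds with toolchains or targets that lack SME support (where the attribute is simply dropped).

```c
#include <stddef.h>

/* Hypothetical helper macro: apply the ACLE keyword attribute only when the
 * compiler reports SME support; otherwise expand to nothing (assumption made
 * for portability of this sketch). */
#if defined(__ARM_FEATURE_SME)
#define SKETCH_SC __arm_streaming_compatible
#else
#define SKETCH_SC
#endif

/* Naive byte-wise copy. The name is illustrative, not the real
 * __arm_sc_memcpy. The attribute is part of the function type, so it goes
 * after the parameter list. */
void *sketch_sc_memcpy(void *dst, const void *src, size_t n) SKETCH_SC {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;

  /* A plain scalar loop: nothing here needs NEON, so it is valid both in
   * and out of Streaming-SVE mode. Note that an optimising compiler may
   * recognise this loop and turn it back into a call to memcpy, so a real
   * build would likely need something like -fno-builtin to prevent that. */
  for (size_t i = 0; i < n; ++i)
    d[i] = s[i];

  return dst;
}
```

A caller running in Streaming-SVE mode could call a function declared this way directly, without the `smstop sm`/`smstart sm` pair that a call to the regular `memcpy` would need.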

https://github.com/llvm/llvm-project/pull/77496

