[libc-commits] [libc] [libc][SVE] add sve handling for memcpy with count less than 32b (PR #167446)
Guillaume Chatelet via libc-commits
libc-commits at lists.llvm.org
Mon Feb 2 02:24:20 PST 2026
gchatelet wrote:
We finally have good evidence that the patch is positive in production.
As said earlier, we need to handle size `0` separately and add a dedicated path for sizes `<= 16`.
Something along these lines:
```c++
if (count == 0) return;
#ifdef LIBC_TARGET_CPU_HAS_SVE
auto src_ptr = reinterpret_cast<const uint8_t*>(src);
auto dst_ptr = reinterpret_cast<uint8_t*>(dst);
if (count <= 16) {
  const svbool_t mask = svwhilelt_b8_u64(0, count);
  svst1_u8(mask, dst_ptr, svld1_u8(mask, src_ptr));
  return;
}
if (count <= 32) {
  const size_t vlen = svcntb();
  svbool_t m0 = svwhilelt_b8_u64(0, count);
  svbool_t m1 = svwhilelt_b8_u64(vlen, count);
  svst1_u8(m0, dst_ptr, svld1_u8(m0, src_ptr));
  svst1_u8(m1, dst_ptr + vlen, svld1_u8(m1, src_ptr + vlen));
  return;
}
#else
if (count == 1) return builtin::Memcpy<1>::block(dst, src);
...
```
I'm still a bit bothered by how this patch will age when `vlen` grows to 32 or higher, but for now it seems to be a win.
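
To make the role of `vlen` in the snippet above concrete, here is a minimal scalar model of the two predicated paths. It is a sketch, not the patch: `whilelt_mask`, `masked_copy`, `copy_two_vectors` and `active_lanes` are hypothetical helpers that only mimic the lane semantics of `svwhilelt_b8_u64` and of predicated `svld1_u8`/`svst1_u8` (lane `i` is active when `start + i < count`), and `vlen` is passed in as a parameter instead of coming from `svcntb()`.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of svwhilelt_b8_u64(start, count) for a vector
// of `vlen` byte lanes: lane i is active when start + i < count.
std::vector<bool> whilelt_mask(std::size_t start, std::size_t count,
                               std::size_t vlen) {
  std::vector<bool> mask(vlen);
  for (std::size_t i = 0; i < vlen; ++i)
    mask[i] = (start + i) < count;
  return mask;
}

// Hypothetical scalar model of a predicated load followed by a predicated
// store: only bytes in active lanes are read and written.
void masked_copy(const std::vector<bool> &mask, std::uint8_t *dst,
                 const std::uint8_t *src) {
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      dst[i] = src[i];
}

// Model of the two-vector path from the snippet, parameterised by vlen.
// The buffers must be addressable up to 2 * vlen so that dst + vlen and
// src + vlen stay valid pointers in this model.
void copy_two_vectors(std::uint8_t *dst, const std::uint8_t *src,
                      std::size_t count, std::size_t vlen) {
  masked_copy(whilelt_mask(0, count, vlen), dst, src);
  masked_copy(whilelt_mask(vlen, count, vlen), dst + vlen, src + vlen);
}

// Counts the active lanes of a mask, to inspect how much work a
// predicated operation would actually do.
std::size_t active_lanes(const std::vector<bool> &mask) {
  std::size_t n = 0;
  for (bool lane : mask)
    n += lane ? 1 : 0;
  return n;
}
```

Under this model, with `vlen == 16` and `count == 20` the second mask has 4 active lanes, while with `vlen == 32` and any `count <= 32` it has none, so the second load/store pair is fully predicated off. Whether that is the ageing concern mentioned above is an assumption on my side.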
https://github.com/llvm/llvm-project/pull/167446