[libc-commits] [libc] [ARM][Libc] Fix ARM big-endian low-end inline_memset byte fills (PR #198777)

Simi Pallipurath via libc-commits libc-commits at lists.llvm.org
Wed May 20 06:11:15 PDT 2026


https://github.com/simpal01 created https://github.com/llvm/llvm-project/pull/198777

Fix inline_memset_arm_low_end to use the splatted 32-bit fill value for all byte stores. This ensures that all stores, including unaligned and trailing byte fills, write the requested repeated byte value on big-endian ARM.

The low-end ARM memset path already builds value32 as value * 0x01010101U for aligned word/block stores, but its byte-wise prefix and tail handling still passed the raw 8-bit value widened to uint32_t. Use value32 for the byte-wise alignment prefix and final tail loop as well. This makes the low-end path endian-safe and consistent with inline_memset_arm_mid_end, fixing incorrect NUL/truncated output in printf-family formatting and direct memset tests on big-endian ARM no-unaligned-access targets.
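The endianness hazard described above can be illustrated with a small compile-time sketch. It is not the libc code: `splat`, `first_byte_big_endian` and `first_byte_little_endian` are hypothetical names, and the sketch assumes the byte-store helper ends up writing the byte that sits at the lowest address of the 32-bit value, which this diff does not show.

```cpp
#include <cstdint>

// Splat an 8-bit fill value into all four bytes of a 32-bit word,
// mirroring the `value * 0x01010101U` expression from the patch.
constexpr uint32_t splat(uint8_t value) { return value * 0x01010101U; }

// Hypothetical model (an assumption, not the libc helper): the byte found at
// the lowest address when `word` is laid out in memory on a big-endian target.
constexpr uint8_t first_byte_big_endian(uint32_t word) {
  return static_cast<uint8_t>(word >> 24);
}

// Same model for a little-endian target: the least significant byte comes first.
constexpr uint8_t first_byte_little_endian(uint32_t word) {
  return static_cast<uint8_t>(word);
}

// Raw 8-bit value widened to uint32_t: only the least significant byte
// carries the fill, so a big-endian first-byte store would write 0x00.
static_assert(first_byte_little_endian(uint32_t{0x2A}) == 0x2A, "LE keeps fill");
static_assert(first_byte_big_endian(uint32_t{0x2A}) == 0x00, "BE sees NUL");

// Splatted value: every byte carries the fill, so byte order no longer matters.
static_assert(first_byte_little_endian(splat(0x2A)) == 0x2A, "LE keeps fill");
static_assert(first_byte_big_endian(splat(0x2A)) == 0x2A, "BE keeps fill");
```

All checks are `static_assert`s, so compiling the snippet as C++11 or later is enough to confirm them. Under the stated assumption, the zero byte in the big-endian widened case matches the NUL output reported for the printf-family tests.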

From 5b01d1f31b20263955cccd34d46c542433ac9cd4 Mon Sep 17 00:00:00 2001
From: Simi Pallipurath <simi.pallipurath at arm.com>
Date: Wed, 20 May 2026 13:17:28 +0100
Subject: [PATCH] [ARM][Libc] Fix ARM big-endian low-end inline_memset byte
 fills

Fix inline_memset_arm_low_end to use the splatted 32-bit
fill value for all byte stores. This ensures that all
stores, including unaligned and trailing byte fills,
write the requested repeated byte value on big-endian ARM.

The low-end ARM memset path already builds value32 as value *
0x01010101U for aligned word/block stores, but its byte-wise
prefix and tail handling still passed the raw 8-bit value
widened to uint32_t. Use value32 for the byte-wise alignment
prefix and final tail loop as well. This makes the low-end
path endian-safe and consistent with inline_memset_arm_mid_end,
fixing incorrect NUL/truncated output in printf-family formatting
and direct memset tests on big-endian ARM no-unaligned-access targets.
---
 libc/src/string/memory_utils/arm/inline_memset.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libc/src/string/memory_utils/arm/inline_memset.h b/libc/src/string/memory_utils/arm/inline_memset.h
index ce36e40617bb1..5bcf28a2a4978 100644
--- a/libc/src/string/memory_utils/arm/inline_memset.h
+++ b/libc/src/string/memory_utils/arm/inline_memset.h
@@ -79,20 +79,20 @@ set_bytes_and_bump_pointers(Ptr &dst, uint32_t value, size_t size) {
 // not allow unaligned stores so all accesses are aligned.
 [[maybe_unused]] LIBC_INLINE void
 inline_memset_arm_low_end(Ptr dst, uint8_t value, size_t size) {
+  const uint32_t value32 = value * 0x01010101U; // splat value in each byte
   if (size >= 8)
     LIBC_ATTR_LIKELY {
       // Align `dst` to word boundary.
       if (const size_t offset = distance_to_align_up<kWordSize>(dst))
         LIBC_ATTR_UNLIKELY {
-          set_bytes_and_bump_pointers(dst, value, offset);
+          set_bytes_and_bump_pointers(dst, value32, offset);
           size -= offset;
         }
-      const uint32_t value32 = value * 0x01010101U; // splat value in each byte
       consume_by_block<64, AssumeAccess::kAligned>(dst, value32, size);
       consume_by_block<16, AssumeAccess::kAligned>(dst, value32, size);
       consume_by_block<4, AssumeAccess::kAligned>(dst, value32, size);
     }
-  set_bytes_and_bump_pointers(dst, value, size);
+  set_bytes_and_bump_pointers(dst, value32, size);
 }
 
 // Implementation for Cortex-M3, M4, M7, M23, M33, M35P, M52 with hardware


