[clang] [clang][ARM] Fix build failure in <arm_acle.h> for __swp (PR #151354)

Thu Jul 31 02:00:43 PDT 2025

================
@@ -55,11 +55,27 @@ __chkfeat(uint64_t __features) {
 /* 7.5 Swap */
 static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
 __swp(uint32_t __x, volatile uint32_t *__p) {
----------------
statham-arm wrote:

I'm not an expert on the modern memory order terminology, so I can't say whether `__ATOMIC_RELAXED` is the right thing or not. But just looking at the code generation:
* on modern architectures, this generates the expected ldrex/strex loop
* on Armv6-M it generates an `atomicrmw volatile xchg` in IR, which LLVM lowers to a libcall `__atomic_exchange_4`, which seems like an improvement on the `__atomic_compare_exchange_4` that I got out of ACLE's suggested fallback sequence.
* but on old Arm architectures such as v5, we _also_ get a libcall, when we could get `SWP` and not put any demands on the runtime.

So, for the moment, I've updated the patch to use your suggested sequence for v6-M, but not to use it universally. If we could arrange for `atomicrmw volatile xchg` to be instruction-selected as `SWP` in preference to a libcall (but not in preference to ldrex/strex) then I'd be happy using this unconditionally.
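
For reference, a minimal sketch of the kind of sequence discussed above. It assumes the suggested sequence is a relaxed `__atomic_exchange_n`, which this comment does not quote directly. The name `swp_sketch` is hypothetical and is not the code in the patch:

```c
#include <stdint.h>

/* Hypothetical stand-in for __swp; not the actual arm_acle.h code.
   Assumes the suggested sequence is a relaxed atomic exchange. */
static inline uint32_t swp_sketch(uint32_t x, volatile uint32_t *p) {
  /* Atomically store x to *p and return the value *p held before.
     Because p is volatile-qualified, Clang emits this as
     `atomicrmw volatile xchg` in IR. */
  return __atomic_exchange_n(p, x, __ATOMIC_RELAXED);
}
```

Whether that IR becomes an ldrex/strex loop, a `__atomic_exchange_4` libcall, or (ideally, on v5) `SWP` is decided by the backend for the selected target, which is why the patch only uses this form on v6-M for now.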

https://github.com/llvm/llvm-project/pull/151354