[PATCH] Implement low-level ARM ldrex/strex intrinsics
eli.friedman at gmail.com
Fri Jul 12 13:48:01 PDT 2013
On Fri, Jul 12, 2013 at 6:34 AM, Tim Northover <tnorthover at apple.com> wrote:
> The attached patches implement three new (overloaded) intrinsics for ARM targets:
> T __builtin_arm_ldrex(T *addr)
> int __builtin_arm_strex(T val, T *addr)
> void __builtin_arm_clrex()
> The idea is that (with quite a bit of hedging about backend register allocation and spills) these can be used to implement the higher-level atomic operations that already exist (and many more that don't happen to fit what x86 can do) in normal C and C++ code.
> There's some precedent in other compilers for instructions like these. ARM's own RVCT implements the bare __ldrex and __strex intrinsics on a narrower range of types, and from the documentation I believe these are compatible on the common subset.
> On the low-level details, there were two choices of how to emit an (e.g.) short ldrex:
> call i16 @llvm.arm.ldrex.i16(i8* %addr)
> call i32 @llvm.arm.ldrex.p0i16(i16* %addr)
> I have almost complete implementations of both and eventually decided that the latter was more natural: backends just aren't designed to handle intrinsics that actually need lowering, and hacks were needed in multiple places to work around this.
> The disadvantage is extra extensions and truncations occurring with every short exclusive operation instead of just when needed, but I've made an effort to fold those in a reasonably sane manner in the backend.
> I've added some documentation to Clang, but there didn't seem to be any precedent for documenting target-specific intrinsics on the LLVM side, so I didn't do that.
> Any comments? Can I commit?
The obvious objection is that the definition of these intrinsics is
basically "this intrinsic has undefined behavior". If we can't
provide some sort of guarantee to make them actually usable, we
shouldn't provide them at all.
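
For readers following the thread, here is a minimal sketch of the kind of use the quoted proposal seems to have in mind: a retry loop built from the proposed builtins. The helper name add_return_sketch is hypothetical and does not come from the patch. The sketch assumes that __builtin_arm_strex returns 0 when the store succeeds, and it falls back to the generic __atomic_add_fetch builtin on non-ARM targets or compilers without the intrinsics, so it can be built anywhere.

```c
/* __has_builtin is a Clang extension; provide a fallback so that
 * compilers without it still accept the #if below. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper (not part of the patch): atomically add `delta`
 * to *addr and return the new value. No memory barriers are implied. */
static int add_return_sketch(int *addr, int delta) {
#if defined(__arm__) && __has_builtin(__builtin_arm_ldrex) && \
    __has_builtin(__builtin_arm_strex)
    int old;
    do {
        /* Load the current value and mark the address for exclusive access. */
        old = __builtin_arm_ldrex(addr);
        /* Assumption: strex yields 0 on success, so retry on non-zero. */
    } while (__builtin_arm_strex(old + delta, addr) != 0);
    return old + delta;
#else
    /* Portable fallback so the sketch compiles on other targets. */
    return __atomic_add_fetch(addr, delta, __ATOMIC_RELAXED);
#endif
}
```

The objection above applies directly to the ARM branch: nothing in C prevents the compiler from placing a spill or other memory access between the ldrex and the strex, and on some implementations that may clear the exclusive monitor, in which case the loop is not guaranteed to make progress.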