[llvm-dev] Adding sanity to the Atomics implementation

Thu Jan 28 07:41:06 PST 2016

I don't have much of substance to add, but I will say that I like the 
proposal (I too prefer Alternative A).  When adding C11 atomic support 
for hexagon, I found it surprising that support for the __sync_* was 
implemented completely differently than the C11 atomics.

On 1/27/2016 1:55 PM, James Knight via llvm-dev wrote:
> Right now, the atomics implementation in clang is a bit of a mess. It 
> has three basically completely distinct code-paths:
>
>  1. There's the legacy __sync_* builtins, which clang lowers directly
>     to atomic IR instructions. Then, the llvm atomic IR instructions
>     themselves can sometimes emit libcalls to __sync_* library
>     functions (which are basically undocumented, and users are often
>     responsible for implementing themselves if they need it).
>  2. There's the new __atomic_* builtins, which clang will, depending
>     on size and alignment and target, lower either to a libcall to a
>     "standardized-by-GCC
>     <https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>" __atomic_*
>     library function (implemented by libatomic), or, to the atomic IR
>     instructions. (Also covered by the same code is the __c11_atomic_*
>     builtins, which are almost-identical clang-specific aliases of the
>     __atomic_* builtins.)
>  3. There's the C11 "_Atomic" type modifier and some OpenMP stuff,
>     which clang lowers into atomic IR or libcalls to __atomic_*
>     library functions, via almost completely separate code from #2.
>     (Note: doesn't apply to C++ std::atomic<>, which gets implemented
>     with the builtins instead). Beyond just a lot of duplicated logic
>     in the code, the _Atomic impl is kinda broken: sometimes it emits
>     llvm IR instructions directly (letting things fall back to
>     __sync_* calls), and sometimes it checks if it should emit a
>     libcall first (falling back to __atomic_* calls). That's pretty
>     bad behavior.
>
>
> I'd like to make a proposal for cleaning this mess up.
>
> BTW, as a sidenote...
> One thing that's important to remember: at least on the C/C++ level, 
> if you support lock-free atomic-ops for a given size/alignment, /ALL/ 
> the atomic ops for that size/alignment must be lock-free. This 
> property is usually quite easy, because you'll have either LL/SC or 
> CAS instructions, which can be used to implement any other operation. 
> However, many older CPU models support atomic load, store, and 
> exchange operations, but are missing any way to do an atomic 
> compare-and-swap. This is true for at least ARMv5, SPARCv8, and Intel 
> 80386. When your minimum CPU is set to such a cpu model, all atomic 
> operations must be done via libcall -- it's not acceptable for 
> atomic_store to emit an atomic "store" instruction, but 
> atomic_fetch_add to require a libcall which gets implemented with a 
> lock. If that were to happen, then atomic_fetch_add could not actually 
> be made atomic versus a simultaneous atomic_store.
>
>
> So anyhow, there's basically two paths I think we could take for 
> cleanup. I like "Alternative A" below better, but would be interested 
> to hear if others have reasons to think the other would be preferable 
> for some reason.
>
> /Both/ alternatives I've suggested will have the effect that the 
> __sync_* builtins in clang will now lower to __atomic_* function calls 
> on platforms without inline atomics (or for unsupported sizes), and 
> C11 atomics will stop being schizophrenic. Clang will cease to ever 
> emit a call to a __sync_* function from any of its builtins or 
> language support. *This could theoretically cause an incompatibility 
> on some target. *
>
> However, I think it ought to be pretty safe: I can't imagine anyone 
> who will use an upgraded compiler has __sync_* functions implemented 
> for their platform, but doesn't have the __atomic_* library functions 
> implemented. The __atomic_* builtins (which already use those library 
> calls) are in widespread use, notably, both in libstdc++ and libc++, 
> as well as in a lot of third-party code. IMO it's worth having that 
> potential incompatibility, in order to simplify the situation for 
> users (only one set of atomic library calls to worry about), and to 
> have a single code-path for atomics in the compiler.
>
>
>
> *Alternative A*: Move all atomic libcall emission into LLVM
>
> In this alternative, LLVM will be responsible for lowering all atomic 
> operations (to inline code or libcalls as appropriate) for all three 
> code-paths listed at the beginning. Clang will emit no libcalls for 
> atomic operations itself.
>
>
> A1) In LLVM: when lowering "store atomic", "load atomic", "atomicrmw", 
> and "cmpxchg" instructions that aren't supported by the target, emit 
> libcalls to the new __atomic_* functions, (rather than the current 
> behavior of calling the legacy __sync_* functions.)
>
> Additionally, I'd like to move the decision to emit a libcall into 
> AtomicExpandPass, and remove the ability to use Expand in ISel to 
> create a libcall for ISD::ATOMIC_*. Putting the decision there allows 
> querying the target, up front, for its supported sizes, before any 
> other lowering decisions (either other parts of AtomicExpandPass and 
> in ISel). When choosing whether to inline or libcall, only the size 
> and alignment of the operation should be considered, and not the 
> operation itself, or the type. This will ensure that the "all or none" 
> property is adhered to.
>
> (Q: what about the implementation of 
> __atomic_is_lock_free/__atomic_always_lock_free in clang? The clang 
> frontend can't query target information from the llvm backend, can it? 
> Is there some other way to expose the information of what sizes are 
> supported by a backend so that the clang builtins -- the latter of 
> which must be a C++ constant expression -- can also use it? Or...just 
> continue duplicating the info in both places, as is done today...)
>
> A2) In clang, start removing the code that knows how to do optimized 
> library call lowering for atomics -- always emit llvm atomic ops. 
> Except for one case: clang will still need to emit library calls 
> itself for data not aligned naturally for its size. The LLVM atomic 
> instructions currently will not handle unaligned data, but unaligned 
> data is allowed for the four "slab of memory" builtins (__atomic_load, 
> __atomic_store, __atomic_compare_exchange, and __atomic_exchange).
>
> A3) In LLVM, add "align" attributes to cmpxchg and atomicrmw, and 
> allow specifying "align" values for "load atomic" and "store atomic" 
> (where the attribute currently exists but cannot be used). LLVM will 
> then be able to lower misaligned accesses to library calls. In clang, 
> get rid of special case for directly emitting libcalls for misaligned 
> atomics, and lower to llvm instructions there, as well.
>
> A4) Hopefully, now, the three codepaths in clang will be sufficiently 
> the same (as they're all now just creating llvm atomic instructions) 
> that they can be trivially merged, or else they are so trivial that 
> the remaining duplication is not worthwhile to merge.
>
>
>
>
>
> *Alternative B*: Move all libcall emission for atomics into Clang.
>
> In this alternative, LLVM will /never/ emit atomic libcalls from the 
> atomic IR instructions. If the operation requested is not possible on 
> a given target, that is an error. So, the cleanups here are to get 
> clang to stop emitting LLVM IR atomics that cannot be lowered without 
> libcalls.
>
> B1) In Clang: have the legacy __sync_* builtins become essentially 
> aliases for the __atomic_* builtins (thus they will emit calls to 
> __atomic_* library functions when applicable).
>
> B2) In Clang: Have the C11 _Atomic type support and openmp stuff call 
> into the __atomic builtins' code to do their dirty work, instead of 
> having its own separate implementation.
>
> B3) After those two changes, I believe clang will only ever emit 
> atomic IR instructions when they can be lowered. So then, in LLVM: get 
> rid of the fallback to libcalls to __sync_* from the atomic IR 
> instructions. If an atomic IR operation is requested and not 
> implementable lock-free, it will be an error, with no fallback.
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160128/02c89579/attachment-0001.html>