[PATCH] D97224: Use Address for CGBuilder's CreateAtomicRMW and CreateAtomicCmpXchg.

John McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 4 11:28:41 PST 2021


rjmccall added a comment.

In D97224#2604069 <https://reviews.llvm.org/D97224#2604069>, @jyknight wrote:

> In D97224#2596410 <https://reviews.llvm.org/D97224#2596410>, @rjmccall wrote:
>
>> Do we really consider the existing atomic intrinsics to not impose added alignment restrictions?  I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment.  I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.
>
> For the `__atomic_*` intrinsics, we don't consider those as imposing additional alignment restrictions -- currently, we avoid generating the LLVM IR instruction if it's below natural alignment, since we couldn't specify alignment on the IR instruction. (Now that we have alignment on the LLVM IR operations, I'd like to eventually get rid of that logic from Clang, since it's also handled by LLVM.)

Frontends ultimately have the responsibility of making sure they ultimately emit code that follows the platform ABI for atomics.  In most other parts of the ABI, we usually find that it is possible (even necessary) to delegate *part* of that ABI responsibility down to LLVM — e.g. to emit inline atomic sequences, which I suppose frontends could do with inline assembly, but there are obvious reasons to prefer a more semantic IR — but that at least in some cases, there is information that we cannot pass down and so must handle in the frontend.  I am somewhat skeptical that atomics are going to prove an exception here.  At the very least, there will always be *some* operations that we have to expand into compare-exchange loops in the frontend, for want of a sufficiently powerful instruction/intrinsic.  That said, if you find that you can actually free Clang from having to make certain decisions here, that's great; I just want you to understand that usually we find that there are limits to what we can usefully delegate to LLVM, and ultimately the responsibility is ours.

> For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, Builtin::BI__sync_fetch_and_add_4, NVPTX::BI__nvvm_atom_add_gen_i, and the others in those 3 function families), we currently entirely ignore the alignment, and simply assume the argument is naturally-aligned. So, yes, this change could potentially affect the behavior for underaligned types.
>
> So, I could change these to explicitly increase the assumed alignment of their arguments, like I did for InterlockedCompareExchange128. My inclination is not to do so, however...it doesn't seem like a good idea in general to ignore type alignment. But, I'd not be opposed to doing that, if there's a good reason.

The idea here is not to "ignore type alignment".  `EmitPointerWithAlignment` will sometimes return an alignment for a pointer that's less than the alignment of the pointee type, e.g. because you're taking the address of a packed struct field.  The critical question is whether the atomic builtins ought to make an effort to honor that reduced alignment, even if it leads to terrible code, or if we should treat the use of the builtin as a user promise that the pointer is actually more aligned than you might think from the information statically available.  That has to be resolved by the semantics of the builtin, and unfortunately intrinsic documentation is often spotty on this point.  For example, the MSDN documentation for `InterlockedIncrement` says that it requires 32-bit alignment, but the documentation for `InterlockedAdd` doesn't.  It seems extremely unlikely that that is meant to be read as a statement that `InterlockedAdd` is actually more permissive.  I would say that all of the Interlocked APIs ought to be read as guaranteeing the natural, full-width alignment for their operation.  I'm less certain about what to do with the `__atomic_*` builtins, because maybe we should take this as an opportunity to try to "do the right thing" with under-aligned atomics; on the other hand, that assumes that we always *can* "do the right thing", and I don't want Clang to start miscompiling or refusing to compile code because we're trying to do something above and beyond the normal language semantics.  It might be more reasonable to always use the type alignment as a minimum.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97224/new/

https://reviews.llvm.org/D97224



More information about the cfe-commits mailing list