[llvm-dev] [CodeGen][AtomicExpandPass] Initial load not atomic?
David Chisnall via llvm-dev
llvm-dev at lists.llvm.org
Mon Jan 25 02:05:51 PST 2021
Hi,
I believe that this is still correct for your example. The compare and
exchange is sequentially consistent and so establishes ordering and and
happens-before relationships. The load is unordered but if it contains
any value other than the load that happens in the compare-and-exchange
then the compare fails and the loop retries. On the second iteration,
we have the value from the sequentially consistent atomic. I believe
this holds for other memory orderings as well.
This would be an invalid operation if it were done as a source-to-source
transform in C, because C states that mixing atomic and non-atomic
accesses to the same memory location is UB and so the transformed
version would suffer from UB but the source version would not. I can't
find any equivalent text in the LLVM LangRef (we should probably
explicitly clarify this) and my understanding is that LLVM semantics
permit tearing and do not guarantee any ordering, but give, at worse, an
unspecified value. Even if we statically propagate that, I would hope
that the worst thing that could happen is that the load is elided
entirely and we start the loop with a cmpxchg on an arbitrary expected
value which may succeed (if you are very lucky) but not in a way that
the guess is observable outside or, more likely, the first cmpxchg
around the loop fails and we are now in a completely valid and slightly
more strongly ordered than necessary code path.
That said, I can't find anything in a quick re-skim of the atomics part
of the spec to tell me if this is guaranteed. This code probably
originated on x86, where all variants of load (atomic or otherwise) are
lowered to the same instruction. I'm not sure what other architectures
use cmpxchg lowering, rather than an LL/SC primitive, and aren't TSO.
David
On 24/01/2021 16:42, Itay Bookstein via llvm-dev wrote:
> Hey all,
>
> While reading through AtomicExpand::insertRMWCmpXchgLoop I noticed
> that the initial load is documented to be atomic but the code below
> generates a non-atomic load. I'm not sure whether it's an actual
> problem or not - could that be mis-compiled or mis-optimized?
>
> Inline documentation comment here
> https://github.com/llvm/llvm-project/blob/8b9df70bf7e7b812715a3dc9772719188e0df06c/llvm/lib/CodeGen/AtomicExpandPass.cpp#L1394
>
> Example:
>
> define dso_local zeroext i16 @foo(i16* %0, i16 zeroext %1, i16 zeroext
> %2) local_unnamed_addr #0 {
> %4 = atomicrmw volatile nand i16* %0, i16 %1 seq_cst
> ret i16 %4
> }
>
> Running opt -atomic-expand gives:
>
> define dso_local zeroext i16 @foo(i16* %0, i16 zeroext %1, i16 zeroext
> %2) local_unnamed_addr #0 {
> %4 = load i16, i16* %0, align 2
> br label %atomicrmw.start
>
> atomicrmw.start: ; preds = %atomicrmw.start, %3
> %loaded = phi i16 [ %4, %3 ], [ %newloaded, %atomicrmw.start ]
> %5 = and i16 %loaded, %1
> %new = xor i16 %5, -1
> %6 = cmpxchg i16* %0, i16 %loaded, i16 %new seq_cst seq_cst
> %success = extractvalue { i16, i1 } %6, 1
> %newloaded = extractvalue { i16, i1 } %6, 0
> br i1 %success, label %atomicrmw.end, label %atomicrmw.start
>
> atomicrmw.end: ; preds = %atomicrmw.start
> ret i16 %newloaded
> }
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
More information about the llvm-dev
mailing list