[llvm-dev] cmpxchg on floats

Fri Aug 21 17:10:35 PDT 2020

> On Aug 21, 2020, at 2:51 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> On Tue, Aug 18, 2020 at 1:27 AM Joerg Sonnenberger via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> On Fri, Aug 14, 2020 at 10:42:02AM -0700, JF Bastien via llvm-dev wrote:
>>> We (C, C++, and LLVM) are generally moving towards supporting FP as a
>>> first-class thing with all atomic operations †, including cmpxchg. It’s
>>> indeed *usually* specified as a bitwise comparison, not a floating-point
>>> one, although IIRC AMD has an FP cmpxchg. Similarly, some of the
>>> operations are allowed to have separate FP state (say, atomic add won’t
>>> necessarily affect the scalar FP execution’s exception state, might
>>> have a different rounding mode, etc).
>> 
>> We don't really need FP cmpxchg in hardware to implement it, do we? It can be
>> lowered as load, FP compare, if not equal cmpxchg load?
> 
> Two points here:
> 
> 1. Hardware with native fcmpxchg already exists.
> 2. It's incorrect even if I replace your "if not equal" by "if equal"
> (which I assume is what you meant).
> 
> On the latter, assume your float in memory is initially -0.0, thread 1
> does cmpxchg(-0.0, +0.0) and thread 2 does fcmpxchg(+0.0, 1.0). The
> memory location is guaranteed to be 1.0 after both threads have run,
> but this is no longer true with your replacement, because the
> following ordering of operations is possible:
> 
> - Thread 2 loads -0.0, compares to +0.0 => comparison is equal
> - Thread 1 does cmpxchg, memory value is now changed to +0.0
> - Thread 2 does cmpxchg(-0.0, 1.0) now, testing whether the memory
> location is unchanged --> this fails, so the memory location stays
> +0.0

Right, I agree.  I think this argues for this being a separate ‘fcmpxchg’ instruction, because the condition code is different.
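
To make the interleaving quoted above concrete, here is a minimal sketch that replays it deterministically on a single thread. It assumes std::atomic<float>, whose compare_exchange_strong compares bitwise; the function names are hypothetical and not part of any LLVM API. The second function is one possible repair (a retry loop around the bitwise cmpxchg), shown as an illustration only, not as a proposed lowering.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// Replays the problematic ordering step by step and returns the final
// memory value. Hypothetical helper, for illustration only.
float replay_naive_lowering() {
    std::atomic<float> mem{-0.0f};

    // Thread 2, step 1: load, then compare as floats. -0.0 == +0.0 is true.
    float seen = mem.load();
    bool fp_equal = (seen == +0.0f);
    assert(fp_equal);
    (void)fp_equal;  // silence unused warnings when NDEBUG is defined

    // Thread 1: bitwise cmpxchg(-0.0, +0.0) succeeds; memory is now +0.0.
    float expected1 = -0.0f;
    mem.compare_exchange_strong(expected1, +0.0f);

    // Thread 2, step 2: bitwise cmpxchg against the value it loaded (-0.0).
    // Memory now holds the bits of +0.0, so this exchange fails.
    float expected2 = seen;
    mem.compare_exchange_strong(expected2, 1.0f);

    return mem.load();  // +0.0 rather than 1.0
}

// One possible emulation of an FP cmpxchg: keep retrying the bitwise
// exchange while the value in memory still compares FP-equal to `expected`.
bool fcmpxchg_emulated(std::atomic<float>& mem, float expected, float desired) {
    float seen = mem.load();
    while (seen == expected) {
        // On failure, compare_exchange_strong stores the current value in `seen`.
        if (mem.compare_exchange_strong(seen, desired)) {
            return true;
        }
    }
    return false;
}
```

Calling replay_naive_lowering() leaves +0.0 in memory, which is the failure described above. With fcmpxchg_emulated, a failed exchange reloads the value and re-checks the FP comparison, so a concurrent change from -0.0 to +0.0 no longer makes the operation give up.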

-Chris