[LLVMdev] Why can't atomic loads and stores handle floats?

Sun May 25 04:31:39 PDT 2014

On 24 May 2014, at 23:18, Filip Pizlo <fpizlo at apple.com> wrote:

> What is the downside of the currently generated IR?  There ain't nothin' wrong with bitcasts, IMO. 

It's problematic because it means that you'll end up generating an integer store even if your hardware supports load-linked / store-conditional (or cmpxchg) from floating point registers.  This means an extra floating point to integer register copy (plus the reverse copy and another copy back each time an atomicrmw loop fails), which typically involves some complex pipeline interlocking on modern processors and so is very expensive.  It also means you need to allocate an extra integer register, even though the underlying hardware may support the original form.
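As a rough illustration of what the bitcast form amounts to, here is a minimal C11 sketch of an atomic float add done through an integer compare-and-exchange.  The names (float_to_bits, bits_to_float, atomic_float_add_via_int) are made up for this example, and it assumes a 32-bit float; it is not what any particular front end emits, just the same shape written out by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is 32 bits wide. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helpers: reinterpret the bits, the C analogue of an IR bitcast. */
static uint32_t float_to_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_to_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Atomically adds delta to the float whose bit pattern lives in *cell.
 * Returns the value that was stored before the add. */
static float atomic_float_add_via_int(_Atomic uint32_t *cell, float delta) {
    uint32_t old_bits = atomic_load(cell);
    uint32_t new_bits;
    do {
        /* int -> float, add, float -> int: redone on every failed attempt,
         * because a failed exchange reloads old_bits with the current value. */
        new_bits = float_to_bits(bits_to_float(old_bits) + delta);
    } while (!atomic_compare_exchange_weak(cell, &old_bits, new_bits));
    return bits_to_float(old_bits);
}
```

Each trip round the loop crosses between the integer and floating point views of the value twice, which is where the register copies come from on hardware that could have done the exchange directly from a floating point register.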

You can hack around this a bit in the back end, but you end up with the back end trying to figure out what the front end meant.  On some architectures, store ordering semantics from integer and floating point registers are different, so this isn't necessarily possible anyway: you can end up turning two constructs that have different semantics to the front and back ends into the same thing in the IR.

We're currently unable to express atomic pointer loads and stores on our architecture for a similar reason.  The hardware supports separate fat pointer registers, which are wider than the widest integer registers (and interact with tag bits), and implements a load-linked and store-conditional for them.  C11 supports atomic operations on pointers, but LLVM IR doesn't (or didn't; I'm not sure if this is fixed now).
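For reference, the kind of C11 source this is about looks roughly like the sketch below: an atomic pointer updated with a compare-and-exchange loop.  The names (struct node, head, push) are invented for the example.  The point is only that the source operates on a pointer type, so anything that lowers it through an integer of "pointer size" loses information on a target where pointers are not plain integers.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* C11 atomic pointer: the atomic object holds a pointer, not an integer. */
static _Atomic(struct node *) head = NULL;

/* Pushes n onto the front of the list with a compare-and-exchange loop. */
static void push(struct node *n) {
    struct node *old = atomic_load(&head);
    do {
        /* On failure, old is refreshed with the current head, so relink. */
        n->next = old;
    } while (!atomic_compare_exchange_weak(&head, &old, n));
}
```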

In short, it's only a problem if you care about architectures outside of ARM and x86.

David

P.S.  To correctly implement C11 semantics for atomic operations on floats, we also need to be able to model floating point environment state, which we can't.