[LLVMdev] AtomicRMW Additions

Fri Jan 3 01:42:40 PST 2014

On 3 Jan 2014, at 06:03, Philip Reames <listmail at philipreames.com> wrote:

> If all you're concerned about is atomicity (and not progress or efficiency), any of these can be implemented using CAS and their normal non-atomic arithmetic instructions. As such, they would not require any extensions.

Although this is true, it's far harder for a back end to match a sequence of instructions and turn it into something sensible than it is to expand a single pseudo.  This is currently a problem for all architectures that are not x86.  CAS trivially gets turned into a load-linked, compare, store-conditional, redo-on-failure sequence.  Any more complex operation (including atomic operations on floating point values, which C11 defines) gets expanded in the IR as load, op, CAS, which then gets transformed into load, op, load-linked, compare, store-conditional, redo-on-failure, when the optimal encoding for the original expression would simply be load-linked, op, store-conditional, redo-on-failure.  Each back end is then responsible for reimplementing the logic that tries to untangle the resulting mess.
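
To make the load, op, CAS shape concrete, here is a minimal C++ sketch of an atomic floating-point add written as a CAS loop. The helper name atomic_fadd and the choice of memory orderings are assumptions made for illustration; this is not what any particular front end emits. On a load-linked/store-conditional target, the compare-exchange inside the loop is itself typically lowered to a retry loop, which is the nesting described above.

```cpp
#include <atomic>

// Hypothetical helper (not an LLVM or standard-library API): atomically adds
// `operand` to `target` and returns the previous value, using the
// load, op, CAS pattern.
float atomic_fadd(std::atomic<float>& target, float operand) {
    // load: take an initial snapshot of the current value
    float expected = target.load(std::memory_order_relaxed);
    float desired;
    do {
        // op: ordinary non-atomic arithmetic on the snapshot
        desired = expected + operand;
        // CAS: publish only if the value is still `expected`; on failure,
        // `expected` is refreshed with the current value and we redo the op
    } while (!target.compare_exchange_weak(expected, desired,
                                           std::memory_order_seq_cst,
                                           std::memory_order_relaxed));
    return expected;
}
```

A single pseudo for the whole operation would let the back end emit one load-linked, op, store-conditional loop directly, instead of having to recognise this pattern after the fact.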

If the architecture has an atomic multiply (I'm not aware of any architecture that does, as atomic operations with variable latency are not fun things to implement) then it's even worse, because the back end has to change a sequence of operations into a single instruction, possibly with stronger forward progress guarantees, which might not actually be a valid transformation in the general case.

David