[LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities

Sat May 10 09:38:20 PDT 2014

On 10 May 2014, at 16:18, Tim Northover <t.p.northover at gmail.com> wrote:

> Actually, I really agree there. I considered it recently, but decided
> to leave it as an intrinsic for now (the new IR expansion pass happens
> after most optimisations so there wouldn't be much benefit, but if we
> did it earlier and the mid-end understood what an ldrex/strex meant, I
> could see code getting much better).
> 
> Load linked would be fairly easy (perhaps even written as "load
> linked", a minor extension to "load atomic"). Store conditional would
> be a bigger change since stores don't return anything at the moment;
> passes may not be expecting to have to ReplaceAllUses on them.

The easiest solution would be to extend the cmpxchg instruction with a weak variant.  It is then trivial to map a load, modify, weak-cmpxchg sequence to load-linked, modify, store-conditional (that is what weak cmpxchg was intended for in the C[++]11 memory model).
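
To make the mapping concrete, here is a minimal C++11 sketch of the load, modify, weak-cmpxchg pattern at the source level. The helper name atomic_double_plus_one and the particular modify step are hypothetical, chosen only for illustration; the only library facility used is std::atomic.

```cpp
#include <atomic>

// Hypothetical helper: atomically replaces v with v * 2 + 1 and returns
// the value that was observed before the update.
int atomic_double_plus_one(std::atomic<int>& v) {
    // "load": on an LL/SC target this is where a load-linked would go.
    int old = v.load(std::memory_order_relaxed);
    // "modify" + "weak cmpxchg": the weak variant may fail spuriously,
    // which is exactly the behaviour of a store-conditional that lost
    // its reservation. On failure, old is refreshed with the current
    // value, so the modify step is recomputed on the next iteration.
    while (!v.compare_exchange_weak(old, old * 2 + 1,
                                    std::memory_order_seq_cst,
                                    std::memory_order_relaxed)) {
        // retry
    }
    return old;
}
```

Because the loop already tolerates failure, nothing is lost by allowing the compare-and-exchange to fail spuriously, so a back end with LL/SC can in principle emit a single load-linked / store-conditional loop rather than nesting one loop inside another.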

> I'm hoping to have some more time to spend on atomics soon, after this
> merge business is done. Perhaps then.
> 
> I don't suppose you have any plans to port Mips to the IR-level LL/SC
> expansion? Now that the infrastructure is present it's quite a
> simplification (r206490 in ARM64 for example, though you need existing
> target-specific intrinsics at the moment). It would be good to iron
> out any ARM-specific assumptions I've made.

I'd rather avoid it, because doing it that late precludes a lot of optimisations that we're interested in.  I'd much rather extend the IR to support these operations at a generic level.

We have a couple of plans for variations of atomic operations in our architecture, so we'll likely end up trying and throwing away a few approaches over the next couple of years.

> But it would still be a construct that probably just couldn't be used
> on x86 efficiently, not really a step towards a target independent IR.

On x86, we could map weak cmpxchg to the same thing as a strong cmpxchg, so it would still generate the same code.  The same is true for all architectures with a non-blocking compare and exchange operation.
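
As a rough source-level illustration, the two hypothetical helpers below (add_strong and add_weak are names made up for this sketch) differ only in which compare-and-exchange variant they call. The assumption, not verified here, is that an x86 back end would lower both to the same lock cmpxchg loop, since a strong cmpxchg is always a valid implementation of a weak one.

```cpp
#include <atomic>

// Hypothetical helper: atomic add built from a strong compare-and-exchange.
// Returns the value observed before the addition.
int add_strong(std::atomic<int>& v, int delta) {
    int old = v.load(std::memory_order_relaxed);
    // Fails only if another thread changed v; old is then refreshed.
    while (!v.compare_exchange_strong(old, old + delta)) {
    }
    return old;
}

// Hypothetical helper: the same operation using the weak variant.
// The weak form is additionally allowed to fail spuriously, but the
// retry loop makes that invisible to the caller.
int add_weak(std::atomic<int>& v, int delta) {
    int old = v.load(std::memory_order_relaxed);
    while (!v.compare_exchange_weak(old, old + delta)) {
    }
    return old;
}
```

Both functions have the same observable behaviour; the weak form simply gives LL/SC targets the freedom to avoid an inner retry loop, while costing nothing on targets with a native compare-and-exchange.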

David