[LLVMdev] Proposal for atomic and synchronization instructions

Tue Jul 10 03:48:20 PDT 2007

On Monday 09 July 2007 23:26, Chandler Carruth wrote:
> I was trying to keep the set of operations as small as possible.
> Supporting all of the bitwise operations on the x86 architecture would
> be very difficult due to their number. (BTS, BTR, BTC...) Several of
> these also have no easy emulation available for other architectures.
> (Imagine doing an atomic test and set of a single bit, without
> affecting the rest of the bits, on SPARC..) Others do not fit well
> into the SSA representation of LLVM. This is particularly true of the
> non-exchanging operations on x86. Because x86 can work directly with
> memory, avoiding loads and stores, many locked operations are
> practical there without saving out the old value as part of the atomic
> instruction. Representing this in an SSA fashion would be very
> difficult I think, especially when trying to maintain atomicity across
> the entire LLVM instruction which isn't necessarily implementable as a
> single instruction on x86.

atomic_ops has code for more operations; you can probably find useful 
information there.

>
> All the same, if there is a demand for these bitwise operations, I am
> more than willing to try and work up ways to include them. Do other
> people have ideas for implementing them cleanly, and/or ideas about
> the demand for them over using "cas" to achieve the functionality?

I think specialized instructions (e.g., inc/dec, or, TAS, ...) are less 
costly on some architectures, compared to a full CAS. And AFAIK, CAS often 
includes a full barrier (or equivalent constraints), whereas with the 
specialized operations, you can select the kind of ordering constraints 
that you really need.
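
To make the comparison concrete, here is a rough sketch of an atomic 
"test and set one bit" written both ways. It uses C++ std::atomic purely as 
illustration notation, which is an assumption and not part of the proposal; 
the function names are hypothetical. The CAS version needs a retry loop, 
while the specialized read-modify-write version is a single operation whose 
ordering the caller picks.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Set bit `bit` in `word` and report whether it was already set,
// emulated with a compare-and-swap retry loop.
inline bool test_and_set_bit_cas(std::atomic<std::uint32_t>& word, unsigned bit) {
    assert(bit < 32);
    const std::uint32_t mask = std::uint32_t{1} << bit;
    std::uint32_t old = word.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `old` with the current value,
    // so the loop retries with fresh data until the swap succeeds.
    while (!word.compare_exchange_weak(old, old | mask,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
    }
    return (old & mask) != 0;
}

// Same result using a specialized atomic OR; the caller chooses the
// ordering constraint (relaxed by default in this sketch).
inline bool test_and_set_bit_rmw(std::atomic<std::uint32_t>& word, unsigned bit,
                                 std::memory_order order = std::memory_order_relaxed) {
    assert(bit < 32);
    const std::uint32_t mask = std::uint32_t{1} << bit;
    return (word.fetch_or(mask, order) & mask) != 0;
}
```

Both functions leave the other bits untouched. How cheaply each one lowers 
depends on the target, which is the point above.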

>
> > 2) You need a strategy for handling architectures that can't handle
> > atomic operations on certain LLVM data types.  For example, 64 bit
> > machines can operate atomically on 64 bit operands, but some 32 bit
> > machines cannot.  I think you can fix this with spin locking, but you
> > need to be careful on how you do it.  I think in order to do it
> > correctly, you need a separate spinlock variable for each individual
> > instruction that requires a spinlock.
>
> Indeed, which is very dangerous with potential for deadlock, etc.
> After discussing this on IRC, I have adjusted the proposal to reflect
> the idea that the target implementation can reject any instructions
> with operands which they cannot effectively lower to an atomic
> operation. This makes the available types for the instruction
> architecture dependent, but when relying on the atomic architecture
> implementation, this seems part of the package.

I would rather see it provide blocking implementations for those cases. 
That's not ideal, because the result is no longer nonblocking. Deadlock is 
only an issue if you are handling signals, though (otherwise you never hold 
more than one lock at a time, so you should be okay).
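
A minimal sketch of such a blocking fallback, for a 64-bit fetch-and-add on 
a target without native 64-bit atomics. Everything here is an assumption 
for illustration: the names, the table size, and the choice to pick the 
spinlock by hashing the operand's address (so that every operation on the 
same location contends for the same lock). It is written in C++ with 
std::atomic only for the lock itself.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace fallback {  // hypothetical namespace, not an LLVM API

struct SpinLock {
    std::atomic<bool> held{false};
    void lock() {
        // Spin until we are the one who flips `held` from false to true.
        while (held.exchange(true, std::memory_order_acquire)) {
        }
    }
    void unlock() { held.store(false, std::memory_order_release); }
};

constexpr std::size_t kNumLocks = 64;  // arbitrary size for this sketch

// Map an operand address to one lock in a fixed table.
inline SpinLock& lock_for(const void* addr) {
    static SpinLock locks[kNumLocks];
    const auto key = reinterpret_cast<std::uintptr_t>(addr);
    return locks[(key >> 3) % kNumLocks];
}

// Blocking emulation of an atomic 64-bit fetch-and-add.
// Exactly one lock is held at a time, so there is no lock-ordering problem.
inline std::uint64_t fetch_add_u64(std::uint64_t* addr, std::uint64_t delta) {
    assert(addr != nullptr);
    SpinLock& l = lock_for(addr);
    l.lock();
    const std::uint64_t old = *addr;
    *addr = old + delta;
    l.unlock();
    return old;
}

}  // namespace fallback
```

The signal case shows up directly in this sketch: if a signal handler calls 
fetch_add_u64 while the interrupted thread is inside the locked region for 
an address that maps to the same lock, the handler spins forever.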

torvald