[LLVMdev] Proposal for atomic and synchronization instructions

Mon Jul 9 14:26:55 PDT 2007

On 7/9/07, John Criswell <criswell at cs.uiuc.edu> wrote:
> 1) You may want to consider adding atomic load-<bitwise operation>-store
> instructions in addition to load-<add/subtract> instructions.  The Linux
> kernel uses these for atomic bit setting/clearing, and on many systems
> they can be implemented more efficiently using special assembly
> instructions.  That said, I have never run experiments to verify that
> such instructions are critical to kernel performance.

I was trying to keep the set of operations as small as possible.
Supporting all of the bitwise operations on the x86 architecture would
be very difficult due to their number. (BTS, BTR, BTC...) Several of
these also have no easy emulation available for other architectures.
(Imagine doing an atomic test and set of a single bit, without
affecting the rest of the bits, on SPARC..) Others do not fit well
into the SSA representation of LLVM. This is particularly true of the
non-exchanging operations on x86. Because x86 can work directly with
memory, avoiding loads and stores, many locked operations are
practical there without saving out the old value as part of the atomic
instruction. Representing this in an SSA fashion would be very
difficult I think, especially when trying to maintain atomicity across
the entire LLVM instruction which isn't necessarily implementable as a
single instruction on x86.

All the same, if there is a demand for these bitwise operations, I am
more than willing to try and work up ways to include them. Do other
people have ideas for implementing them cleanly, and/or ideas about
the demand for them over using "cas" to achieve the functionality?

> 2) You need a strategy for handling architectures that can't handle
> atomic operations on certain LLVM data types.  For example, 64 bit
> machines can operate atomically on 64 bit operands, but some 32 bit
> machines cannot.  I think you can fix this with spin locking, but you
> need to be careful on how you do it.  I think in order to do it
> correctly, you need a separate spinlock variable for each individual
> instruction that requires a spinlock.

Indeed, which is very dangerous with potential for deadlock, etc.
After discussing this on IRC, I have adjusted the proposal to reflect
the idea that the target implementation can reject any instructions
with operands which they cannot effectively lower to an atomic
operation. This makes the available types for the instruction
architecture dependent, but when relying on the atomic architecture
implementation, this seems part of the package.

> 3) You say that your operations work on all first class LLVM data
> types.  The LLVM vector type is considered a first class type.  Should
> LLVM support atomic vector operations?

Thanks for pointing this out! Continuing with the changes above, I
have changed the proposal to state that these instructions only accept
integer types. This should provide the needed functionality of these
operations, while keeping a clear lowering path to the target
architecture.

> 4) For consistency, you may need to specify that the LLVM volatile
> keyword can be added to the atomic operations to prevent optimizations
> from modifying or removing them.   I'm not aware of any optimization
> that works on atomic ops in practice, but I don't see why they couldn't
> be written.

Why would they need this keyword? If the optimization pass works on
atomic ops, would it need to understand the semantics involved, and
only modify/remove them when those semantics allowed it...
specifically, constant propagation should be able to replace the
values in these operations with constants if it can do so, no? Perhaps
I don't understand what situation you're trying to avoid/prepare for
with this.

> 5) With the addition of membar instructions, it may be time to re-think
> what the LLVM volatile keyword means.  Currently, volatile prevents
> removing, modifying, or reordering of a load or store.  However, membar
> also prevents reordering.  Perhaps it would be useful to redefine
> volatile to mean that a load/store cannot be removed or modified but can
> be reordered, and membar alone prevents reordering.

I personally think this would be very interesting to do, as the
barriers provide a very fine grained level of control over load/store
ordering. However, they have implications if they are used for this
purpose -- should the order enforcement of loads and stores without
interprocess implications cause the bus events that these memory
barriers do? That could result in a huge slowdown.

More likely, these barriers should provide another piece of
information about load/store reordering, not superseding volatile. If
people see value in using a barrier-type set of constraints on
volatile reordering, this should be provided separately I think.

-Chandler Carruth

>
> -- John T.
> > -Chandler Carruth
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>