[LLVMdev] Proposal for atomic and synchronization instructions

Mon Jul 9 16:38:00 PDT 2007

Torvald Riegel wrote:
> On Monday 09 July 2007 19:33, Scott Michel wrote:
>> Torvald Riegel wrote:
>>> Hi,
>>>
>>> I'd like to see support for something like this. I have some comments,
>>> and I think there is existing work that you can reuse.
>> "reuse within the compiler."
> 
> within the LLVM compiler framework, to be precise.
> 
>>> "While the processor may spin and attempt the atomic operation more than
>>> once before it is successful, research indicates this is extremely
>>> uncommon." I don't understand this sentence, what do you mean?
>> I'm not sure I can pinpoint the paper on which the statement is based,
>> but I seem to recall something similar in the original LL-SC papers
>> (Maurice Herlihy, DEC Western Research Labs?) It's a foundation for
>> lock-free algorithms.
> 
> Well, the statement says that often you have low contention. But that's 
> something you want, not necessarily something you will get, and depends on 
> the workload/algorithm. I'm missing the context. Is the actual statement as 
> obvious as that you should try to use the atomic instructions offered by your 
> processor, instead of doing blocking algorithms?

As Chandler pointed out, LL/SC isn't blocking. It belongs to the
optimistic concurrency class of constructs. One of the earliest papers
(IIRC, the first paper) on LL/SC was:

Herlihy, M. 1993. A methodology for implementing highly concurrent data
objects. ACM Trans. Program. Lang. Syst. 15, 5 (Nov. 1993), 745-770.
DOI= http://doi.acm.org/10.1145/161468.161469

The LL/SC instructions on the various RISC architectures are used for
spin locks, but they don't have to be used that way. I suspect that
current work on software transactional memory is LL/SC-like on memory
regions -- if you look at the paper, there is a chunk of code in the
examples that rolls back or restarts a computation if the SC operation
fails.
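
The restart-on-failed-SC pattern can be sketched in portable C++ roughly
as follows. This is only an illustration, not code from the Herlihy paper
or from the proposal. It assumes C++11 std::atomic, whose
compare_exchange_weak is typically lowered to an LL/SC loop on targets
that have those instructions; the names optimistic_update and run_counter
are made up for the example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: optimistically replace `shared` with f(old value).
// Returns how many attempts failed and had to be restarted.
template <typename F>
int optimistic_update(std::atomic<long>& shared, F f) {
    int retries = 0;
    // Rough analogue of the load-linked step: read the current value.
    long observed = shared.load(std::memory_order_relaxed);
    // Rough analogue of the store-conditional step: publish f(observed)
    // only if nobody changed `shared` in the meantime. On failure,
    // compare_exchange_weak refreshes `observed` with the current value,
    // so the next iteration recomputes f on fresh data.
    while (!shared.compare_exchange_weak(observed, f(observed),
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        ++retries;  // the computation is restarted, nothing ever blocks
    }
    return retries;
}

// Hypothetical driver: several threads increment one counter without a lock.
long run_counter(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                optimistic_update(counter, [](long v) { return v + 1; });
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

With four threads doing 10000 increments each, run_counter(4, 10000)
returns 40000: no update is lost, and no thread ever holds a lock. How
often the loop actually retries depends on the contention in the
workload.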

> Please have a real look at atomic_ops first. It does have a library part to 
> it -- but that's just for a nonblocking stack.

It's a lot like Apple's (and gcc's) work to reconcile the Intel and PPC
vector intrinsics. Nice work but an unnecessary dependency, in my
personal and not so humble opinion.

> Second, I guess there has been some serious effort put into selecting the 
> specific model. So, for example, if you look at some of Hans' published 
> slides etc., there are some arguments in favor of associating membars with 
> specific instructions. Do you know reasons why LLVM shouldn't do this?

You mean the papers that don't have to do with garbage collection? :-)

Seriously, I think that's the overall purpose of some of this work: to
let LLVM do a better job with instruction-level parallelism.

> Has anyone looked at the memory models that are under discussion for C/C++? 
> Although there is no consensus yet AFAIK, it should be good for LLVM to stay 
> close.

Even when consensus is achieved, it still has to be implemented on the
hardware. As you point out, LL/SC is used to create spinlocks. But LL/SC
is somewhat more powerful than that.
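
For contrast, the spin lock use of the same kind of primitive might look
like the sketch below. It assumes C++11 std::atomic; the SpinLock class
is hypothetical and is not part of the proposal or of atomic_ops. Unlike
a lock-free retry loop, a thread that loses the race here makes no
progress until the holder calls unlock().

```cpp
#include <atomic>

// Hypothetical minimal spin lock built on an atomic compare-and-swap,
// which LL/SC targets typically implement as a short LL/SC loop.
class SpinLock {
    std::atomic<bool> locked_{false};

public:
    // Try once to take the lock; returns true on success.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    // Blocking acquire: spin until the holder releases the lock.
    void lock() {
        while (!try_lock()) {
            // busy-wait
        }
    }

    // Release the lock so that a spinning thread can acquire it.
    void unlock() {
        locked_.store(false, std::memory_order_release);
    }
};
```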

-scooter