[LLVMdev] interest in support for Transactional Memory?

Wed Oct 27 05:01:44 PDT 2010

On Tuesday 26 October 2010 14:33:02 Duncan Sands wrote:
> > transaction properties (eg, virtually atomic + isolated execution) for
> > ordinary program code. Thus, to make incrementing a counter thread-safe,
> > you could say __transaction { counter++; } and the compiler would
> > transform this code so that it uses a TM library, which in turn does
> > concurrency control for the memory accesses in a transaction. Recent
> > studies support the assumption that shared-memory synchronization with
> > transactions is supposed to be a lot easier than when using locking, for
> > example.
> 
> Why does this require special LLVM support rather than, say, having the
> front end lower everything to library calls and so forth, like gcc does
> for OpenMP?
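
To make the question concrete, here is a rough sketch of what lowering `__transaction { counter++; }` to library calls could look like. The names tm_begin, tm_commit, tm_load_int and tm_store_int are hypothetical and do not belong to any real TM library or ABI, and the "library" is just a single global spinlock rather than real concurrency control.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for a TM library: one global lock serializes all txns.
   Every tm_* name below is hypothetical, made up for this sketch. */
static atomic_flag tm_global_lock = ATOMIC_FLAG_INIT;

static void tm_begin(void)
{
    /* Spin until the global lock has been acquired. */
    while (atomic_flag_test_and_set_explicit(&tm_global_lock,
                                             memory_order_acquire)) {
    }
}

static void tm_commit(void)
{
    atomic_flag_clear_explicit(&tm_global_lock, memory_order_release);
}

/* Instrumented load: a real STM would log or validate the access here. */
static int tm_load_int(const int *addr)
{
    assert(addr != NULL);
    return *addr;
}

/* Instrumented store: a real STM would buffer or undo-log the write here. */
static void tm_store_int(int *addr, int value)
{
    assert(addr != NULL);
    *addr = value;
}

static int counter;

/* What `__transaction { counter++; }` might become after lowering. */
static void increment_counter(void)
{
    tm_begin();
    int tmp = tm_load_int(&counter);
    tm_store_int(&counter, tmp + 1);
    tm_commit();
}
```

Each memory access inside the txn turns into a library call, which is why the number of loads and stores left in a txn matters for performance.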

There are different ways one could approach this. First, if there is a frontend with 
TM support available, you only need to do a few things in LLVM:

1) Txn begin is like a setjmp call. You need to ensure that stack slots are 
restored to their original values when aborting and restarting a txn. (Or you 
can ensure that slots that are live-in at a txn begin do not get reused 
until a matching commit.) LLVM currently skips stack slot coloring if setjmp 
is called in the function, so one could extend this to handle a returns-twice 
attribute. However, this coarse approach is costly: in a microbenchmark that 
accesses a tree with txns, it decreased performance by 30%.

2) Functions that are called from txns get cloned and the clones get 
instrumented. The ABI requirements regarding how to store the clone functions 
in native code and how to look them up are not finalized yet, but this may 
require LLVM support as well.
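
The setjmp analogy in point 1 can be illustrated with plain setjmp/longjmp. This is only a sketch under assumptions: txn_checkpoint, txn_abort and run_txn are hypothetical names, not part of any TM runtime, and the stack slot is restored by hand where a compiler or TM library would do it.

```c
#include <assert.h>
#include <setjmp.h>

/* All names here are hypothetical; this is not a real TM runtime API. */
static jmp_buf txn_checkpoint;   /* taken at txn begin */
static int txn_attempts;         /* how often the txn body has started */

/* Stand-in for a TM abort: control returns to the txn begin a second time. */
static void txn_abort(void)
{
    longjmp(txn_checkpoint, 1);
}

/* Runs a toy txn that increments a stack slot, aborting `aborts` times first. */
static int run_txn(int start, int aborts)
{
    assert(aborts >= 0);

    /* Live-in stack slot that the txn modifies. It is volatile because C
       leaves non-volatile locals that were modified after setjmp with
       indeterminate values once longjmp has returned. */
    volatile int slot = start;
    const int saved_slot = start;   /* snapshot taken before txn begin */

    txn_attempts = 0;
    if (setjmp(txn_checkpoint) != 0) {
        slot = saved_slot;          /* restart: restore the original value */
    }

    txn_attempts++;
    slot = slot + 1;                /* transactional work on the stack slot */
    if (txn_attempts <= aborts)
        txn_abort();                /* abort and restart from txn begin */

    return slot;                    /* "commit" */
}
```

Calling run_txn(10, 2) runs the body three times and returns 11. Without the restore after setjmp, the increments of the aborted attempts would stay in the slot and the result would be 13.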

If developing TM support from scratch, I would not put it in the frontend 
because: 

1) Performance. Doing TM instrumentation after running other standard 
optimizations is worthwhile. Inlining, constant propagation, ... and LTO in 
general can all give you fewer loads and stores in txns (which either have a 
considerable overhead for software TM libraries or can count towards hardware 
TM (HTM) capacity limits). You can potentially do better alias and dependency 
analysis after other optimizations in IR.

2) The TM support is not necessarily language specific: IR-level TM 
instrumentation could be used with light-weight TM support in several different 
frontends.

3) The instrumentation for the kind of HTM that we have worked with can be 
expressed with inline asm in library code. The library can then be linked and 
LTO'd, so there's no noticeable performance difference compared to directly 
transforming loads/stores to HTM transactional loads/stores. However, this 
might not be the case for every HTM. For example, transactionally accessed 
variables on the stack might have to be separated from nontransactionally 
accessed stack slots if they are on the same cache line, or the compiler has 
to detect this and instruct the TM to use STM instead of HTM.
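
The cache-line concern in point 3 could be checked roughly as in the sketch below. It assumes a 64-byte cache line, which is not stated above and differs between CPUs, and the names same_cache_line, choose_tm_mode and demo_frame are hypothetical. A compiler would make this decision from the stack frame layout rather than from run-time addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

enum tm_mode { TM_USE_HTM, TM_USE_STM };

/* True if both addresses fall into the same cache line. */
static bool same_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / CACHE_LINE_SIZE) ==
           ((uintptr_t)b / CACHE_LINE_SIZE);
}

/* Fall back to STM when a transactionally accessed slot shares a line
   with a nontransactionally accessed one. */
static enum tm_mode choose_tm_mode(const void *txn_slot,
                                   const void *nontxn_slot)
{
    assert(txn_slot != NULL && nontxn_slot != NULL);
    return same_cache_line(txn_slot, nontxn_slot) ? TM_USE_STM : TM_USE_HTM;
}

/* Line-aligned buffer standing in for a stack frame that spans two lines. */
static _Alignas(CACHE_LINE_SIZE) unsigned char demo_frame[128];
```

With this layout, slots at offsets 0 and 8 of demo_frame share a line, so the check picks STM, while slots at offsets 0 and 64 lie on different lines and can stay with HTM. The alternative mentioned above, separating the slots, would amount to aligning or padding the transactionally accessed slot so that it gets a line of its own.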

Torvald