[LLVMdev] memory scopes in atomic instructions

Fri Nov 14 11:09:11 PST 2014

On 11/15/2014 12:08 AM, Tom Stellard wrote:
> Can you send a plain-text version of this email?  It's easier to read
> and reply to.

Sorry about that! Here's the plain text (I hope!):

Hi all,

OpenCL 2.0 introduced the notion of memory scope in atomic operations on 
global memory. A scope is a hint that lets the underlying platform 
optimize how synchronization is achieved. HSAIL also has a notion of 
memory scopes that is compatible with OpenCL 2.0. Currently, the LLVM IR 
uses a binary value (SingleThread/CrossThread) to represent 
synchronization scope on atomic instructions. This makes it difficult to 
translate OpenCL 2.0 atomic operations to LLVM IR, and also to implement 
HSAIL memory scopes in the proposed HSAIL backend for LLVM.

We would like to enhance the representation of memory scopes in LLVM IR 
to allow more values than just the current two. The intention of this 
email is to invite comments before we start prototyping. Here's what we 
have in mind:

 1. Update the synchronization scope field in atomic instructions from a
    single bit to a wider field, say a 32-bit unsigned integer.
 2. Retain the current default of zero as "system scope", replacing the
    current "cross thread" scope.
 3. All other values are target-defined.
 4. It is not clear where "single thread scope" is actually used. If it
    is required in target-independent transforms, then it could be
    encoded as just "1", or as "all ones" in the wider field. The latter
    option is a bit weird, because most targets will have very few
    scopes, but it is useful if the next point is included in LLVM IR.
 5. Possibly add the following constraint on memory scopes: "The scope
    represented by a larger value is nested inside (is a proper subset
    of) the scope represented by a smaller value." This would also imply
    that the value used for single-thread scope must be the largest
    value used by the target.
    This constraint on "nesting" is easily satisfied by HSAIL (and also
    OpenCL), where synchronization scopes increase from a single
    work-item to the entire system. But it is conceivable that other
    targets do not have this constraint. For example, a platform may
    define synchronization scopes in terms of overlapping sets instead
    of proper subsets.
 6. The impact of this change is limited to increasing the number of
    bits used to store synchronization scope. Future optimizations on
    atomics may need to interpret scopes in target-defined ways. When
    the synchronization scopes of two atomic instructions do not match,
    these optimizations must query the target for validity.
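
To make points 1 to 6 concrete, here is a minimal C++ sketch of one way the
encoding and the target query might look. Every name in it (SyncScope,
TargetScopeInfo, isValidScopePair, and so on) is hypothetical and is not
existing LLVM API; it assumes the "all ones" option from point 4 and the
optional nesting constraint from point 5.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical names throughout; none of this is existing LLVM API.
using SyncScope = std::uint32_t;               // point 1: wider scope field

constexpr SyncScope SystemScope = 0;           // point 2: zero is system scope
constexpr SyncScope SingleThreadScope =        // point 4: the "all ones" option
    std::numeric_limits<SyncScope>::max();

// Point 5 (optional constraint): a larger value denotes a scope that is a
// proper subset of the scope denoted by a smaller value.
constexpr bool isProperlyNestedIn(SyncScope Inner, SyncScope Outer) {
  return Inner > Outer;
}

// Point 6: hypothetical hook that an optimization consults when the scopes
// of two atomic instructions differ.
struct TargetScopeInfo {
  virtual ~TargetScopeInfo() = default;
  virtual bool isValidScopePair(SyncScope A, SyncScope B) const = 0;
};

// An example target that conservatively rejects every mismatched pair.
struct ConservativeTarget : TargetScopeInfo {
  bool isValidScopePair(SyncScope, SyncScope) const override { return false; }
};

// Matching scopes need no query; mismatched scopes defer to the target.
inline bool scopesAllowTransform(const TargetScopeInfo &TI, SyncScope A,
                                 SyncScope B) {
  return A == B || TI.isValidScopePair(A, B);
}

// Small self-check of the sketch.
inline void checkScopeSketch() {
  ConservativeTarget TI;
  assert(isProperlyNestedIn(SingleThreadScope, SystemScope));
  assert(!isProperlyNestedIn(SystemScope, SystemScope));
  assert(scopesAllowTransform(TI, 2, 2));
  assert(!scopesAllowTransform(TI, 1, 2));
}
```

With the "all ones" encoding, single-thread scope is automatically the
largest value, so it nests inside every other scope without the target
having to reserve a number for it. A target whose scopes overlap rather
than nest would ignore isProperlyNestedIn and answer only through its own
query hook.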

*Relation with SPIR:* SPIR defines an enumeration for memory scopes, but 
it does not support LLVM atomic instructions. So memory scopes in SPIR 
are independent of the representation finally chosen in LLVM IR. A 
compiler that translates SPIR to native LLVM IR will have to translate 
memory scopes wherever appropriate.
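
As a rough illustration of that translation step, the C++ sketch below maps
a front-end scope enumeration onto target-defined IR scope values. The
enumerator names mirror the OpenCL 2.0 memory_scope constants, but the enum
itself, its numeric values, the IR scope numbers, and the choice to map
"all SVM devices" onto system scope are all assumptions made for the
example, not values taken from SPIR or from any LLVM target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a SPIR-style memory scope enumeration.
// The numeric values here are placeholders, not the SPIR encoding.
enum class FrontEndScope : std::uint32_t {
  WorkItem,
  WorkGroup,
  Device,
  AllSvmDevices
};

// Hypothetical target-defined IR scopes: zero is system scope, and larger
// values are nested inside smaller ones.
constexpr std::uint32_t IRScopeSystem = 0;
constexpr std::uint32_t IRScopeDevice = 1;
constexpr std::uint32_t IRScopeWorkGroup = 2;
constexpr std::uint32_t IRScopeSingleThread = 0xFFFFFFFFu;  // "all ones"

// Translate a front-end scope into the (assumed) IR encoding.
inline std::uint32_t translateScope(FrontEndScope S) {
  switch (S) {
  case FrontEndScope::WorkItem:
    return IRScopeSingleThread;
  case FrontEndScope::WorkGroup:
    return IRScopeWorkGroup;
  case FrontEndScope::Device:
    return IRScopeDevice;
  case FrontEndScope::AllSvmDevices:
    return IRScopeSystem;
  }
  assert(false && "unknown front-end scope");
  return IRScopeSystem;  // widest scope as a safe fallback
}
```

The fallback returns the widest scope because over-synchronizing is safe,
while picking a scope that is too narrow would not be.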

Sameer.