<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Sameer,<div class=""><br class=""></div><div class="">I support this proposal, and have discussed more or less the same idea with various people in the past.  Regarding point #5, I believe address spaces may already provide the functionality needed to express overlapping constraints.  I’m also not aware of any systems that would really need that functionality anyway.</div><div class=""><br class=""></div><div class="">—Owen</div><div class=""><br class=""></div><div class=""><div><blockquote type="cite" class=""><div class="">On Nov 14, 2014, at 10:17 AM, Sahasrabuddhe, Sameer <<a href="mailto:Sameer.Sahasrabuddhe@amd.com" class="">Sameer.Sahasrabuddhe@amd.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">


  <div bgcolor="#FFFFFF" text="#000000" class="">

    Hi all,<br class="">

    <br class="">

    OpenCL 2.0 introduced the notion of memory scope for atomic

    operations on global memory. These scopes are a hint to the

    underlying platform to optimize how synchronization is achieved.

    HSAIL also has a notion of memory scopes that is compatible with

    OpenCL 2.0. LLVM IR currently uses a binary value

    (SingleThread/CrossThread) to represent synchronization scope on

    atomic instructions. This makes it difficult to translate OpenCL 2.0

    atomic operations to LLVM IR, and also to implement HSAIL memory

    scopes in the proposed HSAIL backend for LLVM.<br class="">

    <br class="">

    We would like to enhance the representation of memory scopes in LLVM

    IR to allow more values than just the current two. The intention of

    this email is to invite comments before we start prototyping. Here's

    what we have in mind:<br class="">

    <ol class="">

      <li class="">Update the synchronization scope field in atomic instructions

        from a single bit to a wider field, say 32-bit unsigned integer.

      </li>

      <li class="">Retain the current default of zero as "system scope",

        replacing the current "cross thread" scope.<br class="">

      </li>

      <li class="">All other values are target-defined.</li>

      <li class="">The use of "single thread scope" is not clear. If it is

        required in target-independent transforms, then it could be

        encoded as just "1", or as "all ones" in the wider field. The

        latter option is a bit weird, because most targets will have

        very few scopes. But it is useful in case the next point is

        included in LLVM IR.</li>

      <li class="">Possibly add the following constraint on memory scopes: "The

        scope represented by a larger value is nested inside (is a

        proper subset of) the scope represented by a smaller value."

        This would also imply that the value used for single-thread

        scope must be the largest value used by the target.<br class="">

        This constraint on "nesting" is easily satisfied by HSAIL (and

        also OpenCL), where synchronization scopes increase from a

        single work-item to the entire system. But it is conceivable

        that other targets do not have this constraint. For example, a

        platform may define synchronization scopes in terms of

        overlapping sets instead of proper subsets. <br class="">

      </li>

      <li class="">The impact of this change is limited to increasing the number

        of bits used to store synchronization scope. Future

        optimizations on atomics may need to interpret scopes in

        target-defined ways. When the synchronization scopes of two

        atomic instructions do not match, these optimizations must query

        the target for validity. <br class="">

      </li>

    </ol>
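    A minimal C++ sketch of how the points above might fit together,
    under stated assumptions. The names ScopeID, SystemScope,
    SingleThreadScope, isSameOrNestedInside and TargetScopeInfo are
    hypothetical and are not existing LLVM APIs. The sketch assumes the
    "all ones" encoding from point 4 and the nesting constraint from
    point 5, neither of which has been decided.<br class="">
    <pre class="">
```cpp
// Hypothetical sketch only: none of these names exist in LLVM.

// Point 1: a wider scope field (unsigned is assumed to be 32 bits here).
using ScopeID = unsigned;

// Point 2: zero stays the default and means "system scope".
constexpr ScopeID SystemScope = 0u;

// Point 4: single-thread scope encoded as "all ones" (an assumption).
constexpr ScopeID SingleThreadScope = ~0u;

// Point 5: under the nesting constraint, a numerically larger value is a
// scope nested inside the scope with the smaller value. Equal values
// denote the same scope.
constexpr bool isSameOrNestedInside(ScopeID Inner, ScopeID Outer) {
  return Inner >= Outer;
}

// Point 6: when two scopes do not match, an optimization asks the target.
// The default here is conservative and accepts only identical scopes;
// a target with nested scopes could override it.
struct TargetScopeInfo {
  virtual ~TargetScopeInfo() = default;
  virtual bool isValidScopePair(ScopeID A, ScopeID B) const {
    return A == B;
  }
};
```
    </pre>
    With the nesting constraint, a target-independent pass could compare
    two scopes with one integer comparison. Without it, every mismatch
    would have to go through a target query such as the hypothetical
    isValidScopePair above.<br class="">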

    <b class="">Relation with SPIR: </b>SPIR defines an enumeration for memory

    scopes, but it does not support LLVM atomic instructions. So memory

    scopes in SPIR are independent of the representation finally chosen

    in LLVM IR. A compiler that translates SPIR to native LLVM IR will

    have to translate memory scopes wherever appropriate. <br class="">

    <br class="">

    Sameer.<br class="">

  </div>

_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:LLVMdev@cs.uiuc.edu" class="">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" class="">http://llvm.cs.uiuc.edu</a><br class=""><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br class=""></div></blockquote></div><br class=""></div></body></html>