[LLVMdev] [RFC][PATCH][OPENCL] synchronization scopes redux
Sahasrabuddhe, Sameer
sameer.sahasrabuddhe at amd.com
Tue Jan 6 20:06:02 PST 2015
On 1/7/2015 8:59 AM, Chandler Carruth wrote:
>
> Essentially, I think target-independent optimizations are still
> attractive, but we might want to just force them to go through an
> actual target-implemented API to interpret the scopes rather than
> making the interpretation work from first principles. I just worry
> that the targets are going to be too different and we may fail to
> accurately predict future targets' needs.
If we have a target-implemented API, then just opaque numbers should
also be sufficient, right? For the API, all we care about is queries
that interesting optimizations will want answered from the target. This
could be at the instruction level: "is it okay to remove this atomic
store with scope n1 that is immediately followed by atomic store with
scope n2?". Or it could be at the scope level: "does scope n2 include
scope n1"?
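To make the two kinds of query concrete, here is a minimal sketch of what such a target-implemented interface might look like when scopes are opaque numbers. Everything in it is an assumption made for illustration: `SyncScopeID`, `TargetScopeInfo`, `canRemoveOverwrittenStore` and `ExampleNestedTarget` are hypothetical names, not existing LLVM classes, and the example numbering (0 to 3, strictly nested) is invented.

```cpp
#include <cstdint>

// Hypothetical: an opaque scope number as it might be carried in the IR.
using SyncScopeID = std::uint32_t;

// Hypothetical interface (not an existing LLVM class). Optimizations never
// interpret the numbers themselves; they only ask the target.
class TargetScopeInfo {
public:
  virtual ~TargetScopeInfo() = default;

  // Scope-level query: does scope Outer include scope Inner?
  virtual bool includes(SyncScopeID Outer, SyncScopeID Inner) const = 0;

  // Instruction-level query: may an atomic store with scope First be removed
  // when it is immediately followed by an atomic store with scope Second?
  // Assumption: the conservative default only allows identical scopes;
  // a real target would override this with its own rule.
  virtual bool canRemoveOverwrittenStore(SyncScopeID First,
                                         SyncScopeID Second) const {
    return First == Second;
  }
};

// Hypothetical target whose scopes are strictly nested, numbered from the
// narrowest (0) to the widest (3), so inclusion is a numerical comparison.
class ExampleNestedTarget : public TargetScopeInfo {
public:
  bool includes(SyncScopeID Outer, SyncScopeID Inner) const override {
    return Outer >= Inner;
  }
};
```

A pass would hold a `TargetScopeInfo` reference and call `includes(n2, n1)` or `canRemoveOverwrittenStore(n1, n2)`; what the numbers mean stays entirely inside the target.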
> I think the "strings" can be made relatively clean.
>
> What I'm imagining is something very much like the target-specific
> attributes which are just strings and left to the target to interpret,
> but are cleanly factored so that the strings are wrapped up in a nice
> opaque attribute that is used as the sigil everywhere in the IR. We
> could do this with metadata, and technically this fits the model of
> metadata if we make the interpretation of the absence of metadata be
> "system". However, I'm quite hesitant to rely on metadata here as it
> hasn't always ended up working so well for us. ;]
Metadata was the first thing to be considered internally at AMD. But it
was quickly shot down because the Research guys were unwilling to accept
the possibility of scope being lost and replaced by a default "system"
scope. Current models are useful only when all atomic accesses for a
given location use the same scope throughout the application, i.e., all
threads running on all agents. So it is not okay for the compiler to
"promote" the scope in just one kernel unless it has access to the
entire application; the result is undefined. This is true for OpenCL
source as well as the HSAIL target. This may change in the near future:
HRF-Relaxed: Adapting HRF to the complexities of industrial
heterogeneous memory models
http://benedictgaster.org/?page_id=278
But even then, it will be difficult to say if the same models can be
applied to heterogeneous systems that don't resemble OpenCL or HSAIL.
> I'd be interested in your thoughts and others' thoughts on how we
> might encode an opaque string-based scope effectively. If we can find
> a reasonably clean way of doing it, it seems like the best approach at
> this point:
>
> - It ensures we have no bitcode stability problems.
> - It makes it easy to define a small number of IR-specified values
> like system/crossthread/allthreads/whatever and singlethread, and
> doing so isn't ever awkward due to any kind of baked-in ordering.
> - In practice in the real world, every target is probably going to
> just take this and map it to an enum that clearly spells out the rank
> for their target, so I suspect it won't actually increase the
> complexity of things much.
I seem to be missing something here about the need for strings. If they
are opaque anyway, and they are represented by sigils, then the sigils
themselves are all that matter, right? Then the encoding is just a number...
> But while the topic is wide open, here's another possibly whacky
> approach: we let the scopes be integers, and add a "scope layout"
> string similar to data-layout. The string encodes the ordering of
> the integers. If it is empty, then simple numerical comparisons
> are sufficient. Else the string spells out the exact ordering to
> be used. Any known current target will be happy with the first
> option. If some target inserts an intermediate scope in the
> future, then that version switches from empty to a fully specified
> string. The best part is that we don't even need to do this right
> now, and only come up with a "scope layout" spec when we really
> hit the problem for some future target.
>
>
> This isn't a bad approach, but it seems even more complex. I think I'd
> rather go with the fairly boring one where the IR just encodes enough
> data for the target to answer queries about the relationship between
> scopes.
I am not really championing scope layout strings over a
target-implemented API, but it seems less work to me rather than more.
The relationship between scopes is just a strict weak ordering (SWO), and it can be
represented as a graph. A practical target will have a very small number
of scopes, say not more than 16. It should be possible to encode this
into a graphviz-style string. Then instead of having every target
implement an API, they just have to specify the relationship as a string.
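As a rough illustration of the string idea, the sketch below parses an edge list and answers inclusion queries from its transitive closure. The syntax (`a->b` meaning "scope a is included in scope b", edges separated by `;`), the class name `ScopeLayout`, and the 16-scope limit are all assumptions made for this example, not a proposed format. An empty string falls back to plain numerical comparison, as in the scope-layout suggestion quoted above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Assumption: a practical target has at most 16 scopes, numbered 0..15.
constexpr unsigned MaxScopes = 16;

// Hypothetical helper, not an existing LLVM class.
class ScopeLayout {
  // Bit J of Reach[I] is set when scope J includes scope I.
  std::array<std::uint16_t, MaxScopes> Reach{};
  bool Numeric = false; // empty layout string: compare scope numbers directly

public:
  explicit ScopeLayout(const std::string &Layout) {
    if (Layout.empty()) {
      Numeric = true;
      return;
    }
    // Every scope includes itself.
    for (unsigned I = 0; I < MaxScopes; ++I)
      Reach[I] = static_cast<std::uint16_t>(1u << I);

    // Parse edges of the form "inner->outer", separated by ';'.
    std::istringstream In(Layout);
    std::string Edge;
    while (std::getline(In, Edge, ';')) {
      if (Edge.empty())
        continue;
      std::string::size_type Arrow = Edge.find("->");
      assert(Arrow != std::string::npos && "edge must look like a->b");
      unsigned long Inner = std::stoul(Edge.substr(0, Arrow));
      unsigned long Outer = std::stoul(Edge.substr(Arrow + 2));
      assert(Inner < MaxScopes && Outer < MaxScopes);
      Reach[Inner] |= static_cast<std::uint16_t>(1u << Outer);
    }

    // Transitive closure (Warshall): if I is included in K, then I is also
    // included in everything that includes K.
    for (unsigned K = 0; K < MaxScopes; ++K)
      for (unsigned I = 0; I < MaxScopes; ++I)
        if (Reach[I] & (1u << K))
          Reach[I] |= Reach[K];
  }

  // Does scope Outer include scope Inner?
  bool includes(unsigned Outer, unsigned Inner) const {
    assert(Inner < MaxScopes && Outer < MaxScopes);
    if (Numeric)
      return Outer >= Inner;
    return (Reach[Inner] >> Outer) & 1u;
  }
};
```

For example, a target might specify `"0->2;1->2;2->3"`: scopes 0 and 1 are unrelated to each other, both are included in 2, and 2 is included in 3, so `includes(3, 0)` holds by transitivity while `includes(1, 0)` does not.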
Sameer.