[LLVMdev] [RFC][PATCH][OPENCL] synchronization scopes redux

Owen Anderson resistor at mac.com
Mon Jan 5 22:51:02 PST 2015


Hi Sameer, 

> On Jan 5, 2015, at 4:51 AM, Sahasrabuddhe, Sameer <Sameer.Sahasrabuddhe at amd.com> wrote:
> 
> Right. The second version of my patches fixes the bitcode encoding. But now I see another potential problem with future bitcode if we require an ordering on the scopes. What happens when a backend later introduces a new scope that goes into the middle of the order? If they renumber the scopes to accommodate this, then existing bitcode for that backend will no longer work. The bitcode reader/writer cannot compensate for this since the values are backend-specific. If we agree that this problem is real, then we cannot force an ordering on the scope numbers.

That’s an interesting consideration, and something I hadn’t thought of. Offhand, I’m unsure how much it matters in practice. The alternative, I suppose, is something like string-named scopes, but then we can’t do much with them at the IR level, since names alone carry no ordering for passes to reason about.
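
To make the renumbering concern concrete, here is a minimal sketch, assuming a backend that stores the raw scope number in bitcode. All names here (ScopeV1, ScopeV2, SubGroup, decodeWithV2) are hypothetical and are not part of any patch in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scope numbering as a backend first ships it.
// Larger value = visible to more threads.
enum ScopeV1 : uint8_t {
  V1_SingleThread = 0,
  V1_ThreadGroup  = 1,
  V1_AllThreads   = 2,
};

// Hypothetical later revision: a new scope is added in the middle,
// and the others are renumbered to keep the values ordered.
enum ScopeV2 : uint8_t {
  V2_SingleThread = 0,
  V2_SubGroup     = 1, // new scope
  V2_ThreadGroup  = 2,
  V2_AllThreads   = 3,
};

// A reader that trusts the stored number cannot tell which revision
// of the numbering wrote it, so it simply reinterprets the value.
inline ScopeV2 decodeWithV2(uint8_t Raw) {
  return static_cast<ScopeV2>(Raw);
}
```

With this numbering, a value written as V1_ThreadGroup is read back as V2_SubGroup, which is a narrower scope than the original program asked for.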

> So far, I have refrained from proposing a keyword for cross thread scope in the text format, because (a) there never was one and (b) it is not strictly needed since it is the default anyway. I am fine either way, but we will first have to decide what the new keyword should be. I find "allthreads" to be a decent counterpart for "singlethread" ... "crossthread" is not good enough since intermediate scopes have multiple threads too. 

This actually raises another question.  In principle, the “most visible” scope ought to be something like “system” or “device”, meaning a completely uncached memory access that is visible to all peripherals in a heterogeneous system.  However, this is almost certainly not what we want to have for typical memory accesses.

To summarize, a prototypical scope nest, from most to least visible (aka least to most cacheable) might look like:

System --> AllThreads --> Various target-specific local scopes --> SingleThread

If we wanted to go really gonzo, there could be a Network scope at the beginning for large-scale HPC systems, but I’m not sure how important that is to anyone.

As a related question, do we actually need the local scopes to be target specific?  Are there systems, real or planned, that *aren’t* captured by:

[Network --> ] System --> AllThreads --> ThreadGroup --> SingleThread ?
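
If a fixed nest like this were adopted, the ordering could be used directly at the IR level. The following is only a sketch under that assumption: the enum, its numeric values, and the helper names (SyncScope, isAtLeastAsVisible, widest) are illustrative, not an existing LLVM API.

```cpp
#include <cassert>

// Hypothetical target-independent scopes, ordered from least
// visible (most cacheable) to most visible (least cacheable).
enum class SyncScope : unsigned {
  SingleThread = 0,
  ThreadGroup,
  AllThreads,
  System,
  Network, // optional outermost scope
};

// True if scope A covers at least everything that scope B covers.
constexpr bool isAtLeastAsVisible(SyncScope A, SyncScope B) {
  return static_cast<unsigned>(A) >= static_cast<unsigned>(B);
}

// Returns the wider of two scopes, e.g. as a conservative choice
// when a pass wants to combine two operations into one.
constexpr SyncScope widest(SyncScope A, SyncScope B) {
  return isAtLeastAsVisible(A, B) ? A : B;
}
```

This kind of comparison is what string-named scopes would not give us without extra per-target information.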

—Owen


