[LLVMdev] [RFC] OpenMP offload infrastructure

Mon Aug 11 06:36:57 PDT 2014

Sergey [et al.], thanks for putting this proposal together.  Overall, this looks like a pretty solid approach to providing relatively hardware-agnostic omp target functionality.  I have several comments/questions, summarized below:

Pros: 
- We [local colleagues and myself] like the concise target API.  We’re big fans of KISS development principles. 
- We believe this provides a good basis for future work in heterogeneous OMP support.

Comments/Questions: 
- There doesn’t seem to be any mention of how each runtime function behaves when it is called while its target region is executing.  The core OMP spec document notes in several places that certain user-visible runtime calls have “implementation defined” behavior depending upon where/how they’re used.  For example, what happens if the host runtime issues a __tgt_target_data_update() while the target is currently executing (__tgt_rtl_run_target_region())?  Is this implementation defined?  I’m certainly ok with that answer, but I believe we need to explicitly state what the behavior is.

- I noticed that Alexandre Eichenberger was one of the authors.  Has he mentioned any support/compatibility with the profiling interfaces he (JMC, et al.) proposed?  How does one integrate the proposed profiling runtime logic with a target region (specifically the dispatch & data movement interfaces)?  This would be very handy.

- I don’t see any mention of an interface to query the physical details of a device.  I know this strays a bit from the notion of portability, but it would be nice to have a simple interface (similar to ‘omp_get_max_threads’).  I would stop short of querying information as detailed as what hwloc provides, but it would be nice for the user to be able to query the targets and see which ones are appropriate for execution.  This would essentially let the user build different implementations of a kernel and decide at runtime which one to execute.  E.g.,
if (/* target of some specific type present */) {
    /* use the omp target interface */
} else {
    /* use the normal worksharing or tasking interfaces */
}

(I realize this is more of an OMP spec question)
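
To make the runtime-decision idea a bit more concrete, here is a minimal sketch in C.  It assumes only what OpenMP 4.0 already offers, namely omp_get_num_devices(), which reports how many target devices exist but says nothing about their type.  The helper available_devices() and the kernel scale() are hypothetical names used for illustration; they are not part of the proposal.  A real "device of some specific type" query would need an interface that does not exist today.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of offload devices the OpenMP runtime reports.
 * Returns 0 when built without OpenMP 4.0 support. This only counts devices;
 * it cannot tell what kind of device is present. */
int available_devices(void)
{
#if defined(_OPENMP) && _OPENMP >= 201307
    return omp_get_num_devices();
#else
    return 0;
#endif
}

/* Hypothetical kernel: multiply every element of x by a, choosing the
 * implementation at runtime. */
void scale(double *x, size_t n, double a)
{
    assert(x != NULL || n == 0);

    if (available_devices() > 0) {
        /* use the omp target interface */
        #pragma omp target map(tofrom: x[0:n])
        #pragma omp parallel for
        for (size_t i = 0; i < n; ++i)
            x[i] *= a;
    } else {
        /* use the normal worksharing interface on the host */
        #pragma omp parallel for
        for (size_t i = 0; i < n; ++i)
            x[i] *= a;
    }
}
```

When built without OpenMP the pragmas are ignored and the host branch runs serially, so the result is the same either way; only the place of execution changes.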

- It would be nice to define a runtime and/or environment mechanism that permits the user to enable/disable specific targets.  For example, if a system had four GPUs, but you only wanted to enable two, it would be convenient to do so using an environment variable.  I realize that one could do this using actual runtime calls in the code with some amount of intelligence, but this somewhat defeats the purpose of portability.  Again, this is more related to the 4.x spec, but it does have implications in the lower-level runtime.      
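
As a rough sketch of the environment-variable idea, the runtime could parse a comma-separated list of device ids into a bitmask at startup.  Everything here is an assumption for illustration: the variable name TGT_ENABLED_DEVICES, the list syntax, and the function names are hypothetical and appear neither in the proposal nor in the OpenMP spec.  The sketch also assumes at most 32 devices and an unsigned int of at least 32 bits.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical environment variable name, e.g. TGT_ENABLED_DEVICES=0,2 */
#define TGT_DEVICES_ENV "TGT_ENABLED_DEVICES"

/* Turn a comma-separated list of device ids into a bitmask (bit i set means
 * device i is enabled). A NULL or empty spec enables every device. Ids
 * outside [0, num_devices) and non-numeric tokens are ignored. */
unsigned parse_device_mask(const char *spec, int num_devices)
{
    assert(num_devices >= 0 && num_devices <= 32);
    unsigned all = (num_devices == 32) ? ~0u : ((1u << num_devices) - 1u);

    if (spec == NULL || *spec == '\0')
        return all;

    unsigned mask = 0;
    const char *p = spec;
    while (*p != '\0') {
        char *end;
        long id = strtol(p, &end, 10);
        if (end == p) {          /* no digits here: skip one character */
            ++p;
            continue;
        }
        if (id >= 0 && id < num_devices)
            mask |= 1u << id;
        p = end;
        while (*p != '\0' && *p != ',')   /* drop trailing junk in the token */
            ++p;
        if (*p == ',')
            ++p;
    }
    return mask;
}

/* Read the (hypothetical) variable from the environment. */
unsigned enabled_devices_from_env(int num_devices)
{
    return parse_device_mask(getenv(TGT_DEVICES_ENV), num_devices);
}
```

With four GPUs, setting the variable to "0,2" would yield the mask 0x5, so the runtime would only expose devices 0 and 2 to the application.  Leaving the variable unset keeps all four enabled.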

cheers
john 

On Aug 8, 2014, at 5:22 PM, Sergey Ostanevich <sergos.gnu at gmail.com> wrote:

> Hello everybody!
> 
> I would like to present a proposal for implementation of OpenMP
> offloading in LLVM. It was created by a list of authors and covers the
> runtime part at most and at a very high level. I believe it will be
> good to have input from community at this early stage before moving
> deeper in details.
> 
> The driver part is intentionally not touched, since we have no clear
> vision on how one can use 3rd party compiler for target code
> generation and incorporate its results into the final host link phase.
> I hope to hear from you more on this.
> 
> I invite you to take part in discussion of the document. Critics,
> proposals, updates - all are welcome!
> 
> Thank you,
> Sergey Ostanevich
> Open Source Compilers
> Intel Corporation
> <offload-proposal.pdf>