[Openmp-dev] [RFC] Device runtime library (re)design

Doerfert, Johannes via Openmp-dev openmp-dev at lists.llvm.org
Thu Aug 1 14:47:38 PDT 2019


Short summary of the phone call we had Wednesday for people that
couldn't make it, are not aware of the phone calls, or simply forgot:
  - We do split the requested changes into three different activities,
    thus separate patch sets:
    1) splitting common logic and target specific code
    2) improving documentation
    3) unifying coding style
  - We need to improve the testing situation, both CI testing and test
    coverage. Task include:
    * Setup buildbots that run OpenMP tests and report the results,
      detect failure, maybe later measure performance, etc.
    * Add OpenMP execution tests, both host and device, to the OpenMP
      subproject (see also below) and the LLVM Test Suite.
  - We need to agree on a directory structure and build model for the
    device runtime library or better libomptarget (potentially later
    also for other parts). As discussed, I outline what I think is a
    good solution below.


-------------------------------------------------------------------------------


libomptarget* directory structure
=================================
* everything currently contained in "openmp/libomptarget/"

Folder and file names (including file extensions) are subject to the
usual bikeshedding. Let's agree on the idea first and the color of the
shed later.


Sources
-------
For the device RTL it currently looks like this:
  openmp/libomptarget/deviceRTLs/nvptx/{docs,src,test,CMakeLists.txt}
The test structure is discussed later. docs is pretty much empty and it
should probably live with the common logic. src is discussed now:

I now assume we will have a common core (probably header only), and
device specific code that may or may not use common core parts. The
structure I imagine looks like this:
  openmp/libomptarget/deviceRTLs/common/include/{queue.h, target_impl.h,...}
  openmp/libomptarget/deviceRTLs/nvptx/{libomptarget.cu,CMakeLists.txt,...,include/target_impl.h,...}
  openmp/libomptarget/deviceRTLs/amdgpu/{libomptarget.hip,CMakeLists.txt,...,include/target_impl.h,...}
  openmp/libomptarget/deviceRTLs/CPU/{libomptarget.XYZ,CMakeLists.txt,...,include/target_impl.h,...}
  openmp/libomptarget/deviceRTLs/.../{libomptarget.XYZ,CMakeLists.txt,...,include/target_impl.h,...}
I added a CPU version above to ease testing (see below) and to provide a
target_impl.h declaration for better IDE support.

- The common files could be header only (debatable and not strictly necessary).
- All source files that need to be compiled and linked into a library
  for this device (as IR or object files) are listed in the device
  CMakeLists.txt file. This can include common source files if the
  common files are not header only. There is no need to include any
  common files if the device implementation is self-contained.
- To make target definitions available to common files we would:
    * put target specific function declarations, e.g.,
       `template <typename T> T __kmpc_impl_atomic_add(T *Ptr, T Val)`
      in common/include/target_impl.h.
    * put implementations in {nvptx,amdgpu,...}/include/target_impl.h.
    * have a `#include_next "target_impl.h"` in common/include/target_impl.h.
    * have the common/include and one device include folder in the
      include path (in this order) when we compile any file.


Tests
-----
So far, there are tests in
  openmp/libomptarget/test
and
  openmp/libomptarget/deviceRTLs/nvptx/test

I think we should reorganize this such that we have three distinct
kinds of tests living in distinct places:
  1) Functional/Execution tests written in C/C++/Fortran + OpenMP, living in
     openmp/libomptarget/test/{api,env,mapping,offloading,...}
  2) Unit tests for the common logic, living in
     openmp/libomptarget/unittests/{api,env,mapping,offloading,...}
  3) Unit tests for the device specific code, living in
     openmp/libomptarget/deviceRTLs/{nvptx,amdgpu,...}/unittests/{api,env,mapping,offloading,...}
Existing tests will be categorized accordingly and put into their
respective place.

To allow testing of common logic without offloading we can have a "CPU"
deviceRTL that implements the declarations in target_impl.h. It could
even offer us a way to vary the system parameters easily over vast
ranges not necessarily supported by existing hardware. This should be
discussed separately though.


-------------------------------------------------------------------------------

Comments welcome!

Thanks,
  Johannes



On 07/05, Doerfert, Johannes via Openmp-dev wrote:
> Tl;dr
>   We should extract device specific code out of the OpenMP deviceRTL such that we can reuse the common logic (>90%) for all devices.
>   We also need to improve the documentation and we should think about bringing the code into the LLVM coding style.
> 
> 
> Requested changes:
> I would like is to change the OpenMP device runtime library design (openmp/libomptarget/deviceRTLs) towards the following goals:
>  1) Allow reuse of common logic between different devices in a clean and extensible way.
>  2) Improve the documentation, e.g., doxygen comments and code comments, for the code.
>  3) Follow the same coding style as LLVM core.
> 
> Disclaimer:
> First, I do not want to say it currently is impossible the reuse the code for other devices or the code is not documented at all. What I
> think is that we can improve both substantially if we choose to do so. Also, a change in coding style is easier now than later, so if we
> decide to do refactoring, that can be included without adding to much churn.
> 
> Motivation:
> Now we can discuss if we should do any of the proposed changes but I guess most of them have clear benefits. I am also not the first
> to suggest them. Point 1 was mentioned with the initial drop of the device runtime [0], but it was rejected for time reasons. Point 2
> was recently discussed as a pressing issue in multiple reviews. Point 3 is a general observation as writing and reviewing code for the
> openmp sub project is unnecessarily hard for LLVM developers due to the different coding style.
> 
> Proposed structure:
> In order to ease the reuse by new devices we should have a common core with device independent logic, e.g., in
>   openmp/ibomptarget/deviceRTLs/common
> including an interface that declares all device specific methods needed by the core logic. The interface is then the
> only thing implemented in the device subfolders, e.g., 
>    openmp/ibomptarget/deviceRTLs/nvptx,   openmp/ibomptarget/deviceRTLs/amdgpu, ...
> To get to this goal, all device specific code has to be extracted from the core logic. The prototypes below show that this
> is fairly easy to do.
> 
> 
> Feasibility and prototypes:
> To showcase the direction I would like is to move to I "redesigned" three files (out of ~20) with the above goals in mind. The patches
> can be found here:
>   https://reviews.llvm.org/D64217
>   https://reviews.llvm.org/D64218
>   https://reviews.llvm.org/D64219
> Note that there is a vast design space even if we agree to the above three goals. As a consequence, I'd like us to use the patches to
> discuss general design decisions not specific ones until we agreed on a path forward.
>  
> 
> Please let me know what you think,
>   Johannes
> 
> 
> 
> [0] https://reviews.llvm.org/D14254#949985
> 
> 
> ---------------------------------------
> Johannes Doerfert
> Researcher
> 
> Argonne National Laboratory
> Lemont, IL 60439, USA
> 
> jdoerfert at anl.gov
> _______________________________________________
> Openmp-dev mailing list
> Openmp-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev

-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-dev/attachments/20190801/3867b346/attachment.sig>


More information about the Openmp-dev mailing list