[Openmp-dev] [RFC] Device runtime library (re)design

Doerfert, Johannes via Openmp-dev openmp-dev at lists.llvm.org
Thu Jul 11 09:31:26 PDT 2019


Good point.

I would say we change it where it makes sense, e.g., we allow some parts to be overridden with different logic. Though, at the end of the day, it is the interface that needs to be implemented; how is up to the deviceRTLs. I want to avoid duplication as much as possible, so the goal is not to force some logic, e.g., the current GPU-focused one, onto an implementation but to offer it. Maybe we will end up with multiple "common logics", but any reuse we can get out of it is worth separating the device-specific parts, especially since there is no runtime cost to it.
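
To illustrate the "offer, not force" idea, here is a minimal C++ sketch. It assumes the choice between shared and device-specific logic is made at compile time; all names (CommonDefaults, GpuLikeDevice, OtherDevice, getTeamSize) are hypothetical and not taken from the deviceRTL sources or the patches.

```cpp
#include <cassert>

// Shared defaults, modeled loosely on a GPU-style runtime (hypothetical).
struct CommonDefaults {
  // Round a requested team size up to a multiple of 32.
  static int roundUpThreads(int Requested) {
    return (Requested + 31) / 32 * 32;
  }
};

// A device that is happy with the offered logic simply inherits it.
struct GpuLikeDevice : CommonDefaults {};

// A device with other needs replaces just that one function.
struct OtherDevice : CommonDefaults {
  static int roundUpThreads(int Requested) { return Requested; }
};

// Common code is written once against the device type. The call is resolved
// at compile time, so no virtual dispatch is involved.
template <typename Device> int getTeamSize(int Requested) {
  return Device::roundUpThreads(Requested);
}

// Usage: the same common function picks up either the default or the override.
inline void checkTeamSize() {
  assert(getTeamSize<GpuLikeDevice>(33) == 64);
  assert(getTeamSize<OtherDevice>(33) == 33);
}
```

Static selection like this is only one possible way to get reuse without runtime overhead; link-time selection of per-device object files would be another.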

Does that make sense and answer your question?


________________________________
From: Narayanaswamy, Ravi <ravi.narayanaswamy at intel.com>
Sent: Thursday, July 11, 2019 11:24:37 AM
To: Doerfert, Johannes; Bae, Hansang; Tian, Xinmin
Cc: Finkel, Hal J.; openmp-dev at lists.llvm.org
Subject: RE: [RFC] Device runtime library (re)design

Johannes,
   Suppose we define some common logic today based on the existing deviceRTLs.  Are you requiring that all future deviceRTLs support this?  What if they cannot? Are you going to change the common logic, or claim that the new deviceRTL is not compliant?
Thanks
Ravi

-----Original Message-----
From: Doerfert, Johannes [mailto:jdoerfert at anl.gov]
Sent: Wednesday, July 10, 2019 4:32 PM
To: Bae, Hansang <hansang.bae at intel.com>; Narayanaswamy, Ravi <ravi.narayanaswamy at intel.com>; Tian, Xinmin <xinmin.tian at intel.com>
Cc: Finkel, Hal J. <hfinkel at anl.gov>; openmp-dev at lists.llvm.org
Subject: Re: [RFC] Device runtime library (re)design

Hi Hansang,

I have the feeling the file ending and modeline (file comment) changes I made overshadow the important parts, namely the separation of device agnostic runtime logic and device specific implementation.

To exaggerate a bit:
   If people really want the modeline to say "CUDA" and the file ending to be ".cu", that is at the end of the day OK with me.
   After this rewrite, with the separation of common logic and device impl., I can simply interpret the common logic "CUDA" files if I need to, e.g., as HIP for AMDGPU.

As I already mentioned in the other discussion [0], we can easily have a design in which none of the files has a ".cpp" ending.
Even if they do, that does not mean we actually interpret/compile them as C++ files ("clang -x cuda a.cpp" works just fine).
Calling them CUDA is simply misleading if they are not CUDA specific.
Calling them C++ might be misleading as well, I agree, but for now the logic part is basically C++ code.
During the rewrite we can make the code C, though I would prefer not to.
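
As a small illustration of a file that is not CUDA specific, the following sketch compiles as ordinary C++ and should also compile when the same file is treated as CUDA (e.g., with "clang -x cuda"). The macro name RTL_HOST_DEVICE and the function are made up for this example and do not come from the existing sources.

```cpp
#include <cassert>

// __CUDACC__ is defined when the file is compiled as CUDA. In a plain C++
// compile the attribute macro expands to nothing.
#if defined(__CUDACC__)
#define RTL_HOST_DEVICE __host__ __device__
#else
#define RTL_HOST_DEVICE
#endif

// Device-agnostic helper: round Value up to the next multiple of Multiple.
RTL_HOST_DEVICE inline unsigned roundUpToMultiple(unsigned Value,
                                                  unsigned Multiple) {
  return (Value + Multiple - 1) / Multiple * Multiple;
}

// Host-side usage; the helper itself carries no CUDA-only constructs.
inline void checkRoundUp() {
  assert(roundUpToMultiple(33, 32) == 64);
  assert(roundUpToMultiple(64, 32) == 64);
}
```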

Thanks,
  Johannes

[0] http://lists.llvm.org/pipermail/openmp-dev/2019-July/002589.html

________________________________________
From: Bae, Hansang <hansang.bae at intel.com>
Sent: Wednesday, July 10, 2019 18:01
To: Narayanaswamy, Ravi; Doerfert, Johannes; Tian, Xinmin
Cc: Finkel, Hal J.; openmp-dev at lists.llvm.org
Subject: RE: [RFC] Device runtime library (re)design

I think there have already been some discussions on the base language for this effort, and it is also one of my concerns.

There can be a situation where C++ is not the best language to use for a certain device RTL development, so it may not be a good idea to write the "common" code in C++.

Thanks,
Hansang


-----Original Message-----
From: Narayanaswamy, Ravi
Sent: Wednesday, July 10, 2019 5:21 PM
To: Doerfert, Johannes <jdoerfert at anl.gov>; Tian, Xinmin <xinmin.tian at intel.com>
Cc: Finkel, Hal J. <hfinkel at anl.gov>; Bae, Hansang <hansang.bae at intel.com>
Subject: RE: [RFC] Device runtime library (re)design

I am responding to this mail to clarify that I mixed up our deviceRTL and plugin.
Hansang, who implemented our library, has some comments and will respond to the general mailing list.

-----Original Message-----
From: Doerfert, Johannes [mailto:jdoerfert at anl.gov]
Sent: Wednesday, July 10, 2019 3:11 PM
To: Narayanaswamy, Ravi <ravi.narayanaswamy at intel.com>; Tian, Xinmin <xinmin.tian at intel.com>
Cc: Finkel, Hal J. <hfinkel at anl.gov>
Subject: Re: [RFC] Device runtime library (re)design

We should have this conversation on the list but anyway.

deviceRTL contains code that is executed on the device. It doesn't need to be device specific, most of it at least. Once we refactor we can reuse the logic for all devices and provide small device specific parts (on a function level basis) where necessary.
libomptarget, depending on what you mean by it, contains code that is executed on the host to prepare the device and device execution. That code is also partially device specific and in parts generic.

________________________________________
From: Narayanaswamy, Ravi <ravi.narayanaswamy at intel.com>
Sent: Wednesday, July 10, 2019 16:37
To: Doerfert, Johannes; Tian, Xinmin
Cc: Finkel, Hal J.
Subject: RE: [RFC] Device runtime library (re)design

Johannes,
   My initial question is: if it can be made common to all devices, then why doesn't it belong in libomptarget?  The intent of deviceRTL is that the code there is device specific.
Thanks
Ravi

-----Original Message-----
From: Doerfert, Johannes [mailto:jdoerfert at anl.gov]
Sent: Wednesday, July 10, 2019 1:28 PM
To: Tian, Xinmin <xinmin.tian at intel.com>
Cc: Narayanaswamy, Ravi <ravi.narayanaswamy at intel.com>; Finkel, Hal J. <hfinkel at anl.gov>
Subject: Fw: [RFC] Device runtime library (re)design

Hi Xinmin, Ravi,

In case you haven't seen the RFC for the redesign of the OpenMP runtime (attached), please take a look and/or forward it to relevant people.
I think the proposed changes should make it easier for you guys to upstream as well. In any case, feedback through the list would be appreciated.

Thanks,
  Johannes


---------------------------------------
Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov

________________________________________
From: Openmp-dev <openmp-dev-bounces at lists.llvm.org> on behalf of Doerfert, Johannes via Openmp-dev <openmp-dev at lists.llvm.org>
Sent: Friday, July 5, 2019 12:16
To: openmp-dev at lists.llvm.org
Cc: Lingda Li; Hernandez, Oscar R.; Alexey Bataev; Gregory.Rodgers at amd.com; Xinmin Tian; Denny, Joel
Subject: [Openmp-dev] [RFC] Device runtime library (re)design

Tl;dr
  We should extract device specific code out of the OpenMP deviceRTL such that we can reuse the common logic (>90%) for all devices.
  We also need to improve the documentation and we should think about bringing the code into the LLVM coding style.


Requested changes:
I would like to change the OpenMP device runtime library design (openmp/libomptarget/deviceRTLs) towards the following goals:
 1) Allow reuse of common logic between different devices in a clean and extensible way.
 2) Improve the documentation, e.g., doxygen comments and code comments, for the code.
 3) Follow the same coding style as LLVM core.

Disclaimer:
First, I do not want to say it is currently impossible to reuse the code for other devices or that the code is not documented at all. What I think is that we can improve both substantially if we choose to do so. Also, a change in coding style is easier now than later, so if we decide to do the refactoring, that can be included without adding too much churn.

Motivation:
Now we can discuss if we should do any of the proposed changes but I guess most of them have clear benefits. I am also not the first to suggest them. Point 1 was mentioned with the initial drop of the device runtime [0], but it was rejected for time reasons. Point 2 was recently discussed as a pressing issue in multiple reviews. Point 3 is a general observation as writing and reviewing code for the openmp sub project is unnecessarily hard for LLVM developers due to the different coding style.

Proposed structure:
In order to ease the reuse by new devices we should have a common core with device independent logic, e.g., in
  openmp/libomptarget/deviceRTLs/common
including an interface that declares all device specific methods needed by the core logic. The interface is then the only thing implemented in the device subfolders, e.g.,
   openmp/libomptarget/deviceRTLs/nvptx,   openmp/libomptarget/deviceRTLs/amdgpu, ...
To get to this goal, all device specific code has to be extracted from the core logic. The prototypes below show that this is fairly easy to do.
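
The split described above could look roughly like the following C++ sketch. It is not taken from the patches: the namespace target, the functions getThreadId/getNumThreads, and getStaticChunk are hypothetical names, and the "device" part is a host stand-in so that the example runs without a GPU.

```cpp
#include <cassert>
#include <cstdint>

// Interface (would sit in deviceRTLs/common): the only functions a device
// subfolder has to implement. Names are hypothetical.
namespace target {
uint32_t getThreadId();   // index of the calling thread within its team
uint32_t getNumThreads(); // number of threads in the team
} // namespace target

// Common, device-independent logic (would also sit in deviceRTLs/common).
struct IterationRange {
  uint32_t Begin; // first iteration owned by the calling thread
  uint32_t End;   // one past the last owned iteration
};

// Split TripCount iterations into contiguous blocks, one per thread.
inline IterationRange getStaticChunk(uint32_t TripCount) {
  uint32_t NumThreads = target::getNumThreads();
  uint32_t PerThread = (TripCount + NumThreads - 1) / NumThreads;
  uint32_t Begin = target::getThreadId() * PerThread;
  if (Begin > TripCount)
    Begin = TripCount;
  uint32_t End = Begin + PerThread;
  if (End > TripCount)
    End = TripCount;
  return {Begin, End};
}

// Device part (would sit in a device subfolder). This stand-in fakes the
// thread id on the host; a real device would query its hardware instead.
namespace target {
static uint32_t FakeThreadId = 0;
uint32_t getThreadId() { return FakeThreadId; }
uint32_t getNumThreads() { return 4; }
} // namespace target

// Usage: 10 iterations over 4 threads; the last thread gets the remainder.
inline void checkStaticChunk() {
  target::FakeThreadId = 3;
  IterationRange R = getStaticChunk(10);
  assert(R.Begin == 9 && R.End == 10);
}
```

Only the two functions in the second "namespace target" block would differ per device in this sketch; getStaticChunk stays untouched.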


Feasibility and prototypes:
To showcase the direction I would like to move in, I "redesigned" three files (out of ~20) with the above goals in mind. The patches can be found here:
  https://reviews.llvm.org/D64217
  https://reviews.llvm.org/D64218
  https://reviews.llvm.org/D64219
Note that there is a vast design space even if we agree on the above three goals. As a consequence, I'd like us to use the patches to discuss general design decisions, not specific ones, until we have agreed on a path forward.


Please let me know what you think,
  Johannes



[0] https://reviews.llvm.org/D14254#949985

