[Openmp-commits] [PATCH] D107656: [OpenMP] Use events and taskyield in target nowait task to unblock host threads

Ye Luo via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Sat Aug 7 16:19:14 PDT 2021


ye-luo marked an inline comment as done.
ye-luo added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:1189
+    }
+    Err = cuEventDestroy(Event);
+    if (Err != CUDA_SUCCESS) {
----------------
tianshilei1992 wrote:
> Event destroy worths a separate function. We could add a new return value such as `OFFLOAD_NOT_DONE` to indicate the event is not fulfilled. It is not a good idea to mix event query and event destroy.
I'm trying to avoid adding stuff that is not immediately used.
Similar to  Queue, the manipulation of Event is within the plugin and there are no need of APIs to create/destroy events from outside.

recordEvent is responsible to create and record an event.
queryEvent is responsbile to query and destroy an event upon completion.

```
#define OFFLOAD_SUCCESS (0)
#define OFFLOAD_FAIL (~0)
```
This is what I found, not even enum. I don't see a clean way to extend OFFLOAD_NOT_DONE
I think fixing the return style is not in the scope of this patch. Some design is needed for return values between plugin and device class and between device to omptarget.

In this case,
OFFLOAD_FAIL is for errors reported by CUDA runtime. Event point to signal it is completed or still on going.


================
Comment at: openmp/libomptarget/src/device.cpp:545
+    return RTL->record_event(RTLDeviceID, AsyncInfo);
+  } else {
+    AsyncInfo.setEventSupported(false);
----------------
tianshilei1992 wrote:
> early return
Changed to early return.


================
Comment at: openmp/libomptarget/src/device.cpp:546
+  } else {
+    AsyncInfo.setEventSupported(false);
+    return OFFLOAD_SUCCESS;
----------------
RaviNarayanaswamy wrote:
> Isn't this initialized to false when the AsyncInfo is created.
yes. So I removed the call to setEventSupported


================
Comment at: openmp/libomptarget/src/device.cpp:552
+// when OFFLOAD_SUCCESS is returned, it means either the event has been
+// fullfiled without error or the event has not been not fullfiled and
+// AsyncInfo.Event is not nullptr.
----------------
grokos wrote:
> "fullfiled" --> "fulfilled"
> "has not been not fullfiled" --> "has not been fulfilled"
> 
Thank you for pointing these out. Corrected.


================
Comment at: openmp/libomptarget/src/interface.cpp:323
 
 EXTERN int __tgt_target_nowait_mapper(
     ident_t *loc, int64_t device_id, void *host_ptr, int32_t arg_num,
----------------
grokos wrote:
> This single-team API function needs the same patching you applied to `__tgt_target_teams_nowait_mapper`.
I tend to first get one case the "target teams nowait" case implemented and then extend to all the rest. Not just this case but also all the update. If you think it is better to enable this function as well in this initial patch, let me know and I will add it.


================
Comment at: openmp/libomptarget/src/omptarget.cpp:24
+  char *EnvStr = getenv("LIBOMPTARGET_USE_NOWAIT_EVENT");
+  return EnvStr ? std::stoi(EnvStr) : 0;
+}();
----------------
grokos wrote:
> Should we check for invalid values of this env var?
I wanted something like libomp. TRUE/1/ON all goes to 1. but I don't know how to handle it in libomptarget.


================
Comment at: openmp/libomptarget/src/omptarget.cpp:45
+        DP("No event support by the pluggin! Calling synchronize\n");
+        Result = Device.synchronize(*this);
+      } else {
----------------
tianshilei1992 wrote:
> use early return
I rewrote the whole function to do mostly early returns.


================
Comment at: openmp/libomptarget/src/omptarget.cpp:51
+          Ret = Device.queryEvent(*this);
+        } while (Ret == OFFLOAD_SUCCESS && AsyncInfo.Event);
+
----------------
tianshilei1992 wrote:
> I'm thinking we can actually do more here. For example, set a count for every task yield. When the count reaches a threshold, fall back to stream synchronize. The threshold can be configured via env and so on.
This looks like an optimization which should be explored separately. I think I may use cuEventSynchronize.


================
Comment at: openmp/libomptarget/src/omptarget.cpp:73
   }
   return Result;
 }
----------------
RaviNarayanaswamy wrote:
> ye-luo wrote:
> > RaviNarayanaswamy wrote:
> > > Result is not set on all paths
> > When leaving line 62, the return value is OFFLOAD_SUCCESS as line 28 sets it
> I missed that.
Call cleaned up and use early return. More readable.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107656/new/

https://reviews.llvm.org/D107656



More information about the Openmp-commits mailing list