[Openmp-commits] [PATCH] D77412: [OpenMP] Introduce stream pool to make sure the correctness of device synchronization

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Apr 10 10:03:19 PDT 2020


jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:103
+  // Pointer to per-device context
+  std::vector<CUcontext> *ContextsPtr;
+
----------------
Later we can think abut making this a vector of structs instead as we tend to access them per device anyway. We can then align the structs such that different processes interacting with different devices don't really interfere with each other.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:118
+      DP("Error when setting current CUDA context\n");
+      CUDA_ERR_STRING(err);
+      // We will return if cannot switch to the right context in case of
----------------
Outline this into a helper. Also the error checking should be a helper.

So something like:
```
/// ....
bool checkResult(CUResult err, const char *ErrMsg) {
  if (err == CUDA_SUCCESS)
    return true;
   DP(ErrMsg);
   CUDA_ERR_STRING(err);
   return false;
}

/// ...
bool switchToDevice(DeviceId) {
  CUresult err = cuCtxSetCurrent((*ContextsPtr)[DeviceId]);
  return checkResult(err, "Error when creating CUDA stream to resize stream pool\n");
}
```

and then use it eveywhere. Should cut down the size and duplication.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:137
+  StreamManagerTy(const int NumberOfDevices, std::vector<CUcontext> *CtxPtr)
+      : NumberOfDevices(NumberOfDevices), ContextsPtr(CtxPtr) {
+    StreamPool.resize(NumberOfDevices);
----------------
If there is no reason ever to provide a nullptr as `CtxPtr` make it a reference everywhere instead.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:150
+    for (std::vector<CUstream> &S : StreamPool)
+      S.resize(EnvNumInitialStreams);
+
----------------
Don't you need to create the sreams, e.g., call resizeStreamPool instead?


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:540
       CUDA_ERR_STRING(err);
     }
   }
----------------
I was expecting this to be part of the StreamManager. If it can only happen at this point, maybe we want a `StreamManager::initializeDevice` method or similar. We can then also avoid exposing the stream pool to the outside. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77412/new/

https://reviews.llvm.org/D77412





More information about the Openmp-commits mailing list