[Openmp-commits] [PATCH] D77951: [OpenMP] Refined CUDA plugin to put all CUDA operations into class

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Sun Apr 12 01:02:22 PDT 2020


jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:89
+std::list<KernelTy> KernelsList;
+
 /// Device environment data
----------------
tianshilei1992 wrote:
> jdoerfert wrote:
> > I don't know where this should go right now but a global list sounds wrong. Follow up.
> This is what it is originally. Basically I didn't change any logic in this patch. From its usage, looks like it is only for storage. Items will only be `emplace_back` to it and then get the pointer. That's it.
I figured it was not you, just that it seems weird. I should have made that clearer.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:117
+  int NumThreads = 0;
+};
 
----------------
tianshilei1992 wrote:
> jdoerfert wrote:
> > 1) Can we replace the list with vectors please. We do not add or delete elements once created.
> > 2) Can we make it a template class parametric in the Context type. e.g., `struct NVIDIADeviceDataTy : public DeviceDataTy<CUcontext> {}`.
> > 3) These should all be unsigned types, right? We also should add explicit documentation for each. I mean `NumThreads` is some maximum I guess.
> > 4) These types need to go into a common header (see also 2)). Follow up though
> 1. Can we replace the list with vectors please. We do not add or delete elements once created.
> Yes, we can do that.
> 2. Can we make it a template class parametric in the Context type. e.g., `struct NVIDIADeviceDataTy : public DeviceDataTy<CUcontext> {}`.
> I don't think so. Now we only know that CUDA has this context thing. We have no idea whether other platforms do. Other things like warp as well.
> 3. These should all be unsigned types, right? We also should add explicit documentation for each. I mean NumThreads is some maximum I guess.
> Yeah, they should be. Will change them correspondingly.
> 4. These types need to go into a common header (see also 2)). Follow up though.
> We should only put the most common part into a common header, but it seems that we only have three known common members: `FuncGblEntries`, `NumTeams`, and `NumThreads`.
If we need a "common/gpu" header we can do that as well. AMD will at least look similar enough to reuse the same code. They have some "context" even if it is a class we define in the AMD headers. You can even go as far and make

`common/Device.h` with the entries you mentioned and then `common/GPUDevice.h` with the template class `template<typename CtxTy> GPUDeviceDataTy : public DeviceDataTy`.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:172
     for (std::vector<CUstream> &S : StreamPool)
-      S.resize(EnvNumInitialStreams);
+      S.reserve(EnvNumInitialStreams);
 
----------------
tianshilei1992 wrote:
> jdoerfert wrote:
> > These three lines can be removed if you move the `getenv` stuff into init and call resizeStreamPool there.
> Correct. Actually I would leave the getenv stuff here because the env has already been determined during the library is initialized. I mean, the global variable is constructed.
OK


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:367
       DP("Parsed OMP_NUM_TEAMS=%d\n", EnvNumTeams);
-    } else {
-      EnvNumTeams = -1;
     }
 
----------------
tianshilei1992 wrote:
> jdoerfert wrote:
> > Style: I'd move the getenv calls into the condition `if (const ... = getenv) `
> But here `EnvStr` are used twice...You would like to have two local variables? :-)
If you do what I described the lifetime is limited to the conditional. There will be two variables, distinct but with the same name. That is for different reasons preferable, e.g., the variable is not available outside the conditional and it is not reused. Overall it is less complex.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:1044
+  assert(async_info_ptr && "async_info_ptr is nullptr");
+  assert(async_info_ptr->Queue && "async_info_ptr->Queue is nullptr");
 
----------------
tianshilei1992 wrote:
> jdoerfert wrote:
> > I like the validation in the user facing functions. Makes the assertion location better too.
> You mean take the validation from `isValidDeviceId` to this function?
I mean it is good to have the asserts here as that is where user provide input which we can validate. I'd leave it as it is in this version.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77951/new/

https://reviews.llvm.org/D77951





More information about the Openmp-commits mailing list