[Openmp-commits] [PATCH] D77951: [OpenMP] Refined CUDA plugin to put all CUDA operations into class
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sat Apr 11 17:36:37 PDT 2020
jdoerfert added a comment.
First set of comments. We need to split this. I suggested some split points and follow ups.
================
Comment at: openmp/libomptarget/include/omptarget.h:21
#define OFFLOAD_SUCCESS (0)
-#define OFFLOAD_FAIL (~0)
+#define OFFLOAD_FAIL (~0U)
----------------
Split this off, LGTM on this part
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:71
NONE
};
----------------
I imagine the above is all clang-formated, if so split it off, LGTM on that part. (keep the namsepace as part of the other patch)
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:89
+std::list<KernelTy> KernelsList;
+
/// Device environment data
----------------
I don't know where this should go right now but a global list sounds wrong. Follow up.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:95
+ int DebugLevel = 0;
};
----------------
The above is defined in a different header. We should not redefine it here but include the appropriate header. Follow up.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:117
+ int NumThreads = 0;
+};
----------------
1) Can we replace the list with vectors please. We do not add or delete elements once created.
2) Can we make it a template class parametric in the Context type. e.g., `struct NVIDIADeviceDataTy : public DeviceDataTy<CUcontext> {}`.
3) These should all be unsigned types, right? We also should add explicit documentation for each. I mean `NumThreads` is some maximum I guess.
4) These types need to go into a common header (see also 2)). Follow up though
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:172
for (std::vector<CUstream> &S : StreamPool)
- S.resize(EnvNumInitialStreams);
+ S.reserve(EnvNumInitialStreams);
----------------
These three lines can be removed if you move the `getenv` stuff into init and call resizeStreamPool there.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:256
+ }
+};
----------------
Once we have the common header, the StreamManager interface has to go there as well. We need generic functions for stream create/destory and set context can be done via a DeviceDataTy method.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:287
+ if (it.addr == Addr)
return true;
----------------
Nit: `It` or better `Entry`
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:301
+ E.Table.EntriesBegin = E.Entries.data();
+ E.Table.EntriesEnd = E.Entries.data() + E.Entries.size();
----------------
(Off topic: Why do we have a table member that needs to be updated when we use it, that seems wrong.)
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:346
return;
}
----------------
No braces around return also above.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:353
- FuncGblEntries.resize(NumberOfDevices);
- Contexts.resize(NumberOfDevices);
- ThreadsPerBlock.resize(NumberOfDevices);
- BlocksPerGrid.resize(NumberOfDevices);
- WarpSize.resize(NumberOfDevices);
- NumTeams.resize(NumberOfDevices);
- NumThreads.resize(NumberOfDevices);
+ DeviceData.resize(NumberOfDevices);
----------------
Nice
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:367
DP("Parsed OMP_NUM_TEAMS=%d\n", EnvNumTeams);
- } else {
- EnvNumTeams = -1;
}
----------------
Style: I'd move the getenv calls into the condition `if (const ... = getenv) `
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:902
+int32_t __tgt_rtl_is_valid_binary(__tgt_device_image *image) {
+ return elf_check_machine(image, 190); // EM_CUDA = 190.
+}
----------------
Nit: LLVM style "generally" avoids trailing comments. Maybe inline it: `, /* EM_CUDA */ 190`
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:1037
+ tgt_args, tgt_offsets, arg_num,
+ 1, 1, 0, async_info_ptr);
+}
----------------
Add comments for the constants or define hem as before. /* ThreadNum */ or something is fine.
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:1044
+ assert(async_info_ptr && "async_info_ptr is nullptr");
+ assert(async_info_ptr->Queue && "async_info_ptr->Queue is nullptr");
----------------
I like the validation in the user facing functions. Makes the assertion location better too.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D77951/new/
https://reviews.llvm.org/D77951
More information about the Openmp-commits
mailing list