[Openmp-commits] [PATCH] D138389: [OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior
Joseph Huber via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Nov 21 11:09:08 PST 2022
jhuber6 added a subscriber: saiislam.
jhuber6 added a comment.
Looks good overall for me, mostly nits. I'm not the most qualifies to scrutinize the HSA usage, maybe @JonChesterfield or @saiislam could help on that front.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:43
+struct AMDGPUPluginTy;
+struct AMDGPUStreamTy;
+struct AMDGPUEventTy;
----------------
This is declared as a struct here then defined as a class later.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:64
+
+static_assert(sizeof(impl_implicit_args_t) == 56, "");
+
----------------
Empty argument isn't needed in C++17
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:148
+ if (Features.contains("sramecc+")) {
+ FeatureMap.insert(std::pair<std::string, bool>("sramecc", true));
+ } else if (Features.contains("sramecc-")) {
----------------
These can be `StringRef` since they're constants. Avoids the heap allocation.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:292
+
+ *PtrStorage = MemoryManager->allocate(Size, nullptr);
+ if (*PtrStorage == nullptr)
----------------
I'm not deeply familiar with the `MemoryManager`, but I thought that we didn't use it for AMD because the HSA plugin already handled its own caching in the memory pool? Maybe @JonChesterfield and @tianshilei1992 could comment further.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:415
+ // Publish the packet. Do not modify the package after this point.
+ __atomic_store_n(PacketPtr, Header | (Setup << 16), __ATOMIC_RELEASE);
+
----------------
Never looked into it much myself, but does `std::atomic` allow us to avoid using a compiler builtin for this? Not a big deal if not.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:536-537
+ // Kernel symbols have a ".kd" suffix.
+ std::string KernelName(getName());
+ KernelName += ".kd";
+
----------------
Small nit, can use LLVM's string, it's basically just a small vector of chars.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1009
+
+ /// Push an asynchronous memory copy host-to-device involving a unpinned
+ /// memory buffer. The operation consists of a two-step copy from the
----------------
typo
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1199
+
+ std::scoped_lock MultiLock(Mutex, RecordedStream.Mutex);
+
----------------
Probably best to specify the template arguments.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1572
+ DeviceImageTy &Image) override {
+ AMDGPUDeviceImageTy &AMDImage = static_cast<AMDGPUDeviceImageTy &>(Image);
+
----------------
Unused
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1712
+ Status =
+ hsa_amd_memory_lock((void *)HstPtr, Size, nullptr, 0, &PinnedHstPtr);
+ if (auto Err =
----------------
Using `const_cast` isn't great but otherwise the compiler will complain with warnings on.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1733
+
+ Status = hsa_amd_memory_unlock((void *)HstPtr);
+ if (auto Err =
----------------
Same here.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1760
+ return Stream.pushPinnedMemoryCopyAsync(HstPtr, TgtPtr, Size);
+ } else if (Size >= OMPX_MaxAsyncCopySize) {
+ // For large transfers use synchronous behavior.
----------------
Nit, no else after return
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:2048
+ const ELF64LE::Shdr &Section,
+ GlobalTy &ImageGlobal) {
+ // The global's address in AMDGPU is computed as the image begin + the ELF
----------------
Is this supposed to be marked override?
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:2176
+ // Setup the memory pools of available for the host.
+ if (Err = HostDevice->init())
+ return std::move(Err);
----------------
Best make a new one here as this was checked above.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:2297
+ return false;
+ } else if (EnvFeature->first() == ImgFeature.first() &&
+ EnvFeature->second != ImgFeature.second) {
----------------
Nit. no else after return.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138389/new/
https://reviews.llvm.org/D138389
More information about the Openmp-commits
mailing list