[Openmp-commits] [PATCH] D138389: [OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Nov 29 16:06:35 PST 2022


JonChesterfield added inline comments.


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:725
+    // Lock the queue during the packet publishing process.
+    std::lock_guard<AMDGPUQueueTy> Lock(Queue);
+
----------------
Confused by this, why are we locking the queue?


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:968
+      hsa_signal_t InputSignalRaw = InputSignal->get();
+      Status = hsa_amd_memory_async_copy(Inter, Queue.getAgent(), Src,
+                                         Queue.getAgent(), CopySize, 1,
----------------
passing queue.getAgent() as two arguments here seems suspicious - shouldn't one of these be the GPU and one the host?


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1931
+    bool Found = false;
+    HostAllocationsMutex.lock_shared();
+    if (!HostAllocations.empty()) {
----------------
I can't work out what's going on here. The corresponding logic for erase looks up the pointer directly, should found not be the same? Also can't tell why we're recording the size of the allocation next to the pointer, as opposed to a DenseSet<void*>


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1973
+  /// with an intermediate pinned buffer.
+  std::map<const void *, size_t> HostAllocations;
+  mutable std::shared_mutex HostAllocationsMutex;
----------------
this is implemented as a tree - why std::map?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138389/new/

https://reviews.llvm.org/D138389



More information about the Openmp-commits mailing list