[Openmp-commits] [PATCH] D154523: [OpenMP][AMDGPU] Tracking of busy HSA queues

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Jul 11 10:25:37 PDT 2023


jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1449-1451
+    if (auto Err = Queues.front().init(Agent, QueueSize)) {
+      return Err;
+    }
----------------



================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1483
+    }
+
+    // Find an ideally idle queue, for the stream
----------------
As @kevinsala said, the above, and the `auto &Resource =` part below, can be replaced with `ResourceRef Resource = GenericDeviceResourceManagerTy::getResource();`


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1501
+    (*Resource).Queue->removeUser();
+    ResourcePool[--NextAvailable] = Resource;
+  }
----------------
The above, except `removeUser`, should just be `GenericDeviceResourceManagerTy::returnResource(Resource);`, no?
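
To make the reuse suggested in this comment and the previous one concrete, here is a rough, self-contained sketch. Every type in it (`ToyQueue`, `ToyStream`, `ToyResourceManager`, `ToyStreamManager`) is a hypothetical stand-in and not the real libomptarget class or its signature; it only shows a derived manager delegating the pool bookkeeping to its base and keeping just the queue user counting for itself. It assumes the pool is never asked for more resources than it holds.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a queue that counts its users.
struct ToyQueue {
  int NumUsers = 0;
  void addUser() { ++NumUsers; }
  void removeUser() { --NumUsers; }
};

// Hypothetical stand-in for a stream bound to one queue.
struct ToyStream {
  ToyQueue *Queue = nullptr;
};

// Generic pool: hands out slots and takes them back.
// Assumption: getResource is never called on an empty pool.
class ToyResourceManager {
public:
  explicit ToyResourceManager(std::size_t N) : Pool(N) {
    for (auto &S : Pool)
      Free.push_back(&S);
  }
  ToyStream *getResource() {
    ToyStream *S = Free.back();
    Free.pop_back();
    return S;
  }
  void returnResource(ToyStream *S) { Free.push_back(S); }
  std::size_t numAvailable() const { return Free.size(); }

private:
  std::vector<ToyStream> Pool;   // fixed size, so element addresses stay valid
  std::vector<ToyStream *> Free; // slots currently available
};

// Derived manager: reuses the base bookkeeping, adds only user counting.
class ToyStreamManager : public ToyResourceManager {
public:
  ToyStreamManager(std::size_t N, ToyQueue &Q)
      : ToyResourceManager(N), SharedQueue(Q) {}

  ToyStream *getResource() {
    ToyStream *S = ToyResourceManager::getResource(); // base does the pool work
    S->Queue = &SharedQueue;
    S->Queue->addUser();
    return S;
  }

  void returnResource(ToyStream *S) {
    S->Queue->removeUser();                // the only queue-specific step
    ToyResourceManager::returnResource(S); // base does the pool work
  }

private:
  ToyQueue &SharedQueue;
};
```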


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1515
+      Q = &Queues[i];
+      if (!Q->isBusy()) {
+        if (auto Err = Q->initLazy(Agent, QueueSize)) {
----------------
Use an early exit here: `if (Q->isBusy()) continue;`


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1526
+    // Wrap around and continue search from the beginning
+    for (int i = 0; i < StartIndex; ++i) {
+      Q = &Queues[i];
----------------
kevinsala wrote:
> I believe we can simplify and merge these two loops into a single one
Yeah, my bad, I suggested this. We can do something like:
```
for (int I = 0; I < MaxNumQueues; ++I) {
  int Idx = StartIndex++;
  if (StartIndex == MaxNumQueues) StartIndex = 0;
  // use Idx not I
  ...
}
```
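
For illustration only, a self-contained version of that single wrap-around loop, combined with the early `continue` suggested above, might look like the following. `ToyQueue` and `pickQueue` are hypothetical names, not the patch's code; the sketch assumes a non-empty queue vector, `StartIndex` smaller than its size, and that falling back to the first visited queue when all are busy is acceptable.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real queue type; only the busy check is modeled.
struct ToyQueue {
  int NumUsers = 0;
  bool isBusy() const { return NumUsers > 0; }
};

// Visit every queue exactly once, starting at StartIndex and wrapping around.
// Returns the index of the first idle queue found, or the index of the first
// queue visited if all of them are busy (an assumed fallback policy).
std::size_t pickQueue(const std::vector<ToyQueue> &Queues,
                      std::size_t &StartIndex) {
  const std::size_t MaxNumQueues = Queues.size();
  const std::size_t Fallback = StartIndex;
  for (std::size_t I = 0; I < MaxNumQueues; ++I) {
    const std::size_t Idx = StartIndex++;
    if (StartIndex == MaxNumQueues)
      StartIndex = 0; // wrap around
    if (Queues[Idx].isBusy())
      continue; // early exit keeps the idle path unindented
    return Idx;
  }
  return Fallback;
}
```

Because `StartIndex` advances on every visited queue, the next search resumes right after the queue that was just handed out, which spreads streams across queues.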


================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1546
+  /// The queues which are assigned to requested streams.
+  std::vector<AMDGPUQueueTy> Queues;
+
----------------
kevinsala wrote:
> I feel this patch implements a Queue manager inside a Stream manager. Wouldn't it be better to define this logic inside a new `AMDGPUQueueManagerTy` and just have a reference of it in the `AMDGPUStreamManagerTy`?
We do not really manage the queues the same way. We can do more reuse, though; see above.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154523/new/

https://reviews.llvm.org/D154523


