[Openmp-commits] [PATCH] D154523: [OpenMP][AMDGPU] Tracking of busy HSA queues
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Tue Jul 11 10:25:37 PDT 2023
jdoerfert added inline comments.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1449-1451
+ if (auto Err = Queues.front().init(Agent, QueueSize)) {
+ return Err;
+ }
----------------
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1483
+ }
+
+ // Find an ideally idle queue, for the stream
----------------
As @kevinsala said, the above, and the `auto &Resource =` part below, can be replaced with `ResourceRef Resource = GenericDeviceResourceManagerTy::getResource();`
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1501
+ (*Resource).Queue->removeUser();
+ ResourcePool[--NextAvailable] = Resource;
+ }
----------------
The above, except `removeUser`, should just be `GenericDeviceResourceManagerTy::returnResource(Resource);`, no?
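
To illustrate the delegation suggested here, the following is a minimal, self-contained sketch and not the plugin's actual code. `PoolSketch`, `StreamSketch`, `QueueSketch`, `StreamManagerSketch`, `getStream` and `returnStream` are hypothetical stand-ins. The real `GenericDeviceResourceManagerTy` has a different interface (resource references, error handling, pool growth). The sketch also assumes callers never request more resources than the pool holds.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a queue that counts the streams using it.
struct QueueSketch {
  int NumUsers = 0;
  void addUser() { ++NumUsers; }
  void removeUser() { --NumUsers; }
};

// Hypothetical stand-in for a stream bound to one queue.
struct StreamSketch {
  QueueSketch *Queue = nullptr;
};

// Hypothetical generic pool: hands out resources from the front and takes
// them back by decrementing NextAvailable. No bounds checks, no growth.
template <typename T> class PoolSketch {
public:
  explicit PoolSketch(std::size_t Size) : Storage(Size) {
    for (T &R : Storage)
      Pool.push_back(&R);
  }
  T *getResource() { return Pool[NextAvailable++]; }
  void returnResource(T *R) { Pool[--NextAvailable] = R; }

protected:
  std::vector<T> Storage; // Never resized, so the pointers in Pool stay valid.
  std::vector<T *> Pool;
  std::size_t NextAvailable = 0;
};

// The stream manager only adds the queue bookkeeping and reuses the generic
// get/return logic instead of touching the pool and NextAvailable directly.
class StreamManagerSketch : public PoolSketch<StreamSketch> {
  using Base = PoolSketch<StreamSketch>;

public:
  using Base::Base;

  StreamSketch *getStream(QueueSketch &Q) {
    StreamSketch *S = Base::getResource();
    S->Queue = &Q;
    Q.addUser();
    return S;
  }

  void returnStream(StreamSketch *S) {
    S->Queue->removeUser(); // The only stream-specific step.
    S->Queue = nullptr;
    Base::returnResource(S);
  }
};
```

The point of the design is that the derived manager no longer duplicates the pool indexing, so any fix to the generic get/return path applies to streams as well.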
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1515
+ Q = &Queues[i];
+ if (!Q->isBusy()) {
+ if (auto Err = Q->initLazy(Agent, QueueSize)) {
----------------
Use an early exit here: if the queue is busy, `continue`.
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1526
+ // Wrap around and continue search from the beginning
+ for (int i = 0; i < StartIndex; ++i) {
+ Q = &Queues[i];
----------------
kevinsala wrote:
> I believe we can simplify and merge these two loops into a single one
Yeah, my bad, I suggested this. We can do something like
```
for (int I = 0; I < MaxNumQueues; ++I) {
  int Idx = StartIndex++;
  if (StartIndex == MaxNumQueues)
    StartIndex = 0;
  // Use Idx, not I, to index into Queues.
  ...
}
```
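
For reference, a self-contained sketch of the merged wrap-around search, including the early exit on busy queues. This is an illustration under assumptions, not the patch itself: `QueueSketch` and `selectQueue` are hypothetical names, lazy queue initialization is omitted, the queue vector is assumed non-empty, and the fallback of sharing the queue at the starting index when every queue is busy is an assumption about the desired policy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for AMDGPUQueueTy; it only tracks a user count.
struct QueueSketch {
  int NumUsers = 0;
  bool isBusy() const { return NumUsers > 0; }
  void addUser() { ++NumUsers; }
  void removeUser() { --NumUsers; }
};

// Picks a queue for a new stream and returns its index. NextIndex is the
// rotating start position and is updated so the next search begins right
// after the queue chosen here. Queues must not be empty.
std::size_t selectQueue(std::vector<QueueSketch> &Queues,
                        std::size_t &NextIndex) {
  const std::size_t NumQueues = Queues.size();
  const std::size_t Fallback = NextIndex;

  // Single loop: visit every queue exactly once, starting at NextIndex and
  // wrapping around at the end of the vector.
  for (std::size_t I = 0; I < NumQueues; ++I) {
    const std::size_t Idx = NextIndex++;
    if (NextIndex == NumQueues)
      NextIndex = 0;

    // Early exit: skip busy queues.
    if (Queues[Idx].isBusy())
      continue;

    Queues[Idx].addUser();
    return Idx;
  }

  // Assumed policy: all queues are busy, so share the one we started at.
  Queues[Fallback].addUser();
  NextIndex = (Fallback + 1) % NumQueues;
  return Fallback;
}
```

With three idle queues and `NextIndex == 1`, successive calls return indices 1, 2 and 0. A fourth call finds every queue busy and shares queue 1.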
================
Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1546
+ /// The queues which are assigned to requested streams.
+ std::vector<AMDGPUQueueTy> Queues;
+
----------------
kevinsala wrote:
> I feel this patch implements a Queue manager inside a Stream manager. Wouldn't it be better to define this logic inside a new `AMDGPUQueueManagerTy` and just have a reference of it in the `AMDGPUStreamManagerTy`?
We do not really manage the queues the same way. We can do more reuse, see above.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154523/new/
https://reviews.llvm.org/D154523