[Openmp-commits] [PATCH] D145831: [OpenMP][libomptarget] Add support for critical regions in AMD GPU device offloading
Dhruva Chakrabarti via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Mar 16 10:16:39 PDT 2023
dhruvachak added inline comments.
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:269
+void unsetLock(omp_lock_t *Lock) {
+ (void)atomicExchange((uint32_t *)Lock, UNSET, atomic::acq_rel);
+}
----------------
In theory, unset could simply do an atomic write to the Lock with release memorder. Any reason that's not used here? You are not using the prev value anyways.
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:284
+ fenceKernel(atomic::release);
+ while (!atomicCAS((uint32_t *)Lock, UNSET, SET, atomic::relaxed,
+ atomic::relaxed) != UNSET) {
----------------
I am getting confused by this pattern. You have:
while (!atomicCAS(...) != UNSET)
Can't the above be written as
while (atomicCAS(...) == UNSET)
But what does atomicCAS return? Bool or the prev value? If it is the prev value, shouldn't the condition be?
while (atomicCAS(...) == SET) { sleep(); }
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:285
+ while (!atomicCAS((uint32_t *)Lock, UNSET, SET, atomic::relaxed,
+ atomic::relaxed) != UNSET) {
+ __builtin_amdgcn_s_sleep(32);
----------------
I think the CAS should take an acq_rel as the success memorder and an acq as the failure memorder.
The acq in both the cases will ensure it sees an update from another thread. The release from the other updating thread is not sufficient for this thread to see that update. And when this thread succeeds, the release will ensure other threads see the update of this thread.
With the above memorders, I would think we could get rid of the fences.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145831/new/
https://reviews.llvm.org/D145831
More information about the Openmp-commits
mailing list