[Openmp-commits] [PATCH] D145831: [OpenMP][libomptarget] Add support for critical regions in AMD GPU device offloading
Gheorghe-Teodor Bercea via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Mar 24 06:35:44 PDT 2023
doru1004 added inline comments.
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:279
+ (void)atomicExchange((uint32_t *)Lock, UNSET, atomic::acq_rel);
+}
+
----------------
JonChesterfield wrote:
> I expected unset to be a relaxed store of unset, possibly surrounded by fences. I'm not totally confident an acq_rel ordering here synchronises with the fences in setCriticalLock, though it might do.
The acq_rel semantics apply to the lock variable (not the fences) and the atomic store assumed here synchronizes with the atomic load in setCriticalLock.
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:290
+ fenceKernel(atomic::aquire);
+ }
+}
----------------
JonChesterfield wrote:
> The fences probably need to be outside the branch on active thread, assuming the intent is for the lock to cover all the threads in the warp. Though I'm not sure it'll make any difference to codegen.
>
> I think relaxed ordering on both is right here.
The fence here is to ensure all other ordinary stores happen before the fence so it is enough that the lane that gets through to execute it. Eventually all lanes will execute it.
================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:473
}
+void unsetCriticalLock(omp_lock_t *Lock) {
----------------
JonChesterfield wrote:
> These read like a copy/paste error - most functions nearby are calling impl:: with a similar name. Could you add a comment to the effect that nvptx uses the same lock implementation for critical sections as for other locks?
I can add the comment I am not sure what the copy / paste error is.
================
Comment at: openmp/libomptarget/test/offloading/target_critical_region.cpp:5
+// UNSUPPORTED: nvptx64-nvidia-cuda-LTO
+// UNSUPPORTED: x86_64-pc-linux-gnu
+// UNSUPPORTED: x86_64-pc-linux-gnu-LTO
----------------
JonChesterfield wrote:
> why not nvptx? also why not x64, but that seems less immediately worrying
The nvptx implementation is broken so we avoid enabling it for this test. x64 is disabled since this test is really meant to be for critical regions in target regions.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145831/new/
https://reviews.llvm.org/D145831
More information about the Openmp-commits
mailing list