[Openmp-commits] [PATCH] D145831: [OpenMP][libomptarget] Add support for critical regions in AMD GPU device offloading
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Mar 24 10:34:38 PDT 2023
JonChesterfield accepted this revision.
JonChesterfield added a comment.
This revision is now accepted and ready to land.
OK, talked to some more people. Fences are fine inside the branch.
We need acquire/release fencing around the critical block so that code which expects to see writes made by other threads inside the same critical block works.
We need mutual exclusion so that we actually have the critical semantics.
As written, this patch should do that. Taking the mutex and doing device-scope fencing for each lane in the wavefront is slow but should work. Better would be to take the mutex once per warp (wavefront), something like:
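// ffs() is 1-based, lane ids are 0-based
if (id == ffs(activemask) - 1) {
  // one lane takes the mutex on behalf of the whole warp
  while (atomicCAS(...)) builtin_sleep();
  fence_acquire(agent);
}
// lanes then run the region one at a time, fencing only at workgroup scope
for each lane in warp {
  fence_acquire(workgroup);
  critical-region-here
  fence_release(workgroup);
}
// released once per warp, at agent scope
drop-mutex(agent)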
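
The two requirements above (mutual exclusion, plus acquire/release fencing around the region) can be illustrated with a rough host-side analogy in standard C++. This is only a sketch and not the code from the patch: SpinLock and run_critical_counter are hypothetical names, std::thread stands in for GPU lanes, and the C++ fences carry no agent/workgroup scope. The CAS is deliberately relaxed, so all ordering comes from the explicit fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical host-side stand-in for the device mutex (not from the patch).
struct SpinLock {
  std::atomic<unsigned> state{0};

  void lock() {
    unsigned expected = 0;
    // The relaxed CAS provides mutual exclusion only.
    while (!state.compare_exchange_weak(expected, 1u,
                                        std::memory_order_relaxed)) {
      expected = 0;               // a failed CAS overwrites 'expected'
      std::this_thread::yield();  // stand-in for builtin_sleep()
    }
    // Acquire fence: writes made by the previous holder become visible.
    std::atomic_thread_fence(std::memory_order_acquire);
  }

  void unlock() {
    assert(state.load(std::memory_order_relaxed) == 1u);  // must be held
    // Release fence: publish our writes before the mutex is dropped.
    std::atomic_thread_fence(std::memory_order_release);
    state.store(0u, std::memory_order_relaxed);
  }
};

// Each thread increments a plain (non-atomic) int inside the critical region.
int run_critical_counter(int threads, int iters) {
  SpinLock lock;
  int counter = 0;
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&lock, &counter, iters] {
      for (int i = 0; i < iters; ++i) {
        lock.lock();
        ++counter;  // critical-region-here
        lock.unlock();
      }
    });
  }
  for (auto &th : pool) th.join();
  return counter;
}
```

Without the two fences the increment of the plain int would be a data race even though only one thread holds the lock at a time, which is the same reason the critical block needs fencing in addition to the mutex.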
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145831/new/
https://reviews.llvm.org/D145831