[Openmp-commits] [PATCH] D145831: [OpenMP][libomptarget] Add support for critical regions in AMD GPU device offloading

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Mar 24 10:34:38 PDT 2023


JonChesterfield accepted this revision.
JonChesterfield added a comment.
This revision is now accepted and ready to land.

OK, talked to some more people. Fences are fine inside the branch.

We need acquire/release fencing around the critical block so that code inside it sees the writes made by other threads that went through the same critical block.

We need mutual exclusion so that we actually have the critical semantics.
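
To make the two requirements concrete, here is a minimal host-side C++ sketch, not the code from the patch. It assumes ordinary CPU threads stand in for GPU lanes, and the names `DemoLock`, `demoLockAcquire`, `demoLockRelease` and `runCriticalDemo` are made up for illustration. The relaxed CAS provides only the mutual exclusion; the explicit acquire and release fences are what make the plain writes inside the critical block visible to the next thread that takes the lock.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the device mutex (0 = free, 1 = taken).
static std::atomic<unsigned> DemoLock{0};

static void demoLockAcquire() {
  unsigned Expected = 0;
  // Relaxed CAS: gives mutual exclusion only, no ordering by itself.
  while (!DemoLock.compare_exchange_weak(Expected, 1u,
                                         std::memory_order_relaxed)) {
    Expected = 0; // a failed CAS overwrites Expected with the current value
    std::this_thread::yield();
  }
  // Acquire fence: later reads see writes published by the previous owner.
  std::atomic_thread_fence(std::memory_order_acquire);
}

static void demoLockRelease() {
  // Release fence: publish our writes before the lock is seen as free.
  std::atomic_thread_fence(std::memory_order_release);
  DemoLock.store(0u, std::memory_order_relaxed);
}

// Every thread increments a plain (non-atomic) counter inside the
// critical block; the result is exact only if both properties hold.
long runCriticalDemo(int NumThreads, int ItersPerThread) {
  long Counter = 0;
  std::vector<std::thread> Threads;
  for (int T = 0; T < NumThreads; ++T)
    Threads.emplace_back([&Counter, ItersPerThread] {
      for (int I = 0; I < ItersPerThread; ++I) {
        demoLockAcquire();
        ++Counter; // critical block
        demoLockRelease();
      }
    });
  for (std::thread &Th : Threads)
    Th.join();
  assert(DemoLock.load() == 0u); // lock must be free once all threads finish
  return Counter;
}
```

Calling `runCriticalDemo(4, 10000)` should return exactly 40000. On the GPU the fences would additionally carry a scope (agent or workgroup), which standard C++ fences do not express.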

As written, this patch should do that. Taking the mutex and doing device-scope fencing for each lane in the wavefront is slow but should work. It would be better to take the mutex once per wavefront (warp), something like:

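  // only the lowest active lane takes the mutex
  if (id == ffs(activemask)) {
    while (atomicCAS(...)) builtin_sleep();
    fence_acquire(agent);
  }
  // run the critical region one lane at a time
  for each lane in warp {
    fence_acquire(workgroup);
    critical-region-here
    fence_release(workgroup);
  }
  // once per warp, after the last lane is done
  drop-mutex(agent)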
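
The point of the once-per-warp scheme is to amortise the lock traffic. A rough host-side C++ sketch of that idea follows. It assumes each wavefront can be modelled as one CPU thread that runs its lanes in order, so the workgroup-scope fences between lanes are not modelled at all. `WaveDemoResult` and `runPerWavefrontDemo` are hypothetical names, and this is not the device implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

struct WaveDemoResult {
  long Counter;   // value updated inside the critical region
  long LockTakes; // number of times the mutex was taken
};

// Each simulated wavefront takes the mutex once, lets every lane run the
// critical region in turn, then drops the mutex.
WaveDemoResult runPerWavefrontDemo(int NumWaves, int LanesPerWave) {
  std::atomic<unsigned> Mutex{0};
  std::atomic<long> LockTakes{0};
  long Counter = 0; // plain variable, protected by Mutex
  std::vector<std::thread> Waves;
  for (int W = 0; W < NumWaves; ++W)
    Waves.emplace_back([&] {
      unsigned Expected = 0;
      // Take the mutex once for the whole wavefront (acquire ordering).
      while (!Mutex.compare_exchange_weak(Expected, 1u,
                                          std::memory_order_acquire)) {
        Expected = 0;
        std::this_thread::yield();
      }
      LockTakes.fetch_add(1, std::memory_order_relaxed);
      // Serialise the lanes: one critical-region execution per lane.
      for (int Lane = 0; Lane < LanesPerWave; ++Lane)
        ++Counter;
      // Drop the mutex once (release ordering publishes the updates).
      Mutex.store(0u, std::memory_order_release);
    });
  for (std::thread &Th : Waves)
    Th.join();
  return {Counter, LockTakes.load()};
}
```

With 8 simulated wavefronts of 64 lanes (64 is just an example value), the counter ends at 512 while the mutex is taken only 8 times; taking it per lane would mean 512 acquisitions.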




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145831/new/

https://reviews.llvm.org/D145831


