[Openmp-commits] [PATCH] D102407: [libomptarget][amdgpu] Fix truncation error for partial wavefront

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu May 13 08:50:58 PDT 2021


JonChesterfield created this revision.
JonChesterfield added reviewers: jdoerfert, dhruvachak, ronlieb.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
JonChesterfield requested review of this revision.
Herald added subscribers: openmp-commits, wdng.
Herald added a project: OpenMP.

[libomptarget][amdgpu] Fix truncation error for partial wavefront

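The partial barrier implementation involves one wavefront resetting and N-1
waiting, where N is the number of wavefronts. N was computed with truncating
integer division, so a trailing partial wavefront was not counted. Round up
instead. This change future-proofs against launching with a number of threads
that is not a multiple of the wavefront size.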
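
As an illustration of the arithmetic only, here is a minimal host-side sketch
comparing the two wave-count computations. The constant kWarpSize = 64 and
both helper names are assumptions made for this example; the device runtime
uses its own WARPSIZE macro, and this is not the code in target_impl.hip.

```cpp
#include <cstdint>

// Assumed wavefront size, for illustration only. The device runtime takes
// this value from its WARPSIZE macro.
constexpr uint32_t kWarpSize = 64;

// Hypothetical helper: the old computation. Integer division truncates, so a
// trailing partial wavefront is dropped from the count.
constexpr uint32_t waves_truncating(uint32_t num_threads) {
  return num_threads / kWarpSize;
}

// Hypothetical helper: the new computation. Adding kWarpSize - 1 before
// dividing rounds up, so a partial wavefront counts as one wave.
constexpr uint32_t waves_rounding_up(uint32_t num_threads) {
  return (num_threads + kWarpSize - 1) / kWarpSize;
}
```

Under these assumptions, 100 threads give 1 wave with the truncating form and
2 with the rounding form, while exact multiples such as 128 give 2 either way.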


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D102407

Files:
  openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip


Index: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
===================================================================
--- openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
+++ openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
@@ -56,7 +56,7 @@
 {
   __atomic_thread_fence(__ATOMIC_ACQUIRE);
 
-  uint32_t num_waves = num_threads / WARPSIZE;
+  uint32_t num_waves = (num_threads + WARPSIZE - 1) / WARPSIZE;
 
   // Partial barrier implementation for amdgcn.
   // Uses two 16 bit unsigned counters. One for the number of waves to have



