[Openmp-commits] [openmp] 2240b41 - [libomptarget] [amdgpu] Fix default setting of max flat workgroup size

Dhruva Chakrabarti via Openmp-commits openmp-commits at lists.llvm.org
Tue Jun 29 13:47:54 PDT 2021


Author: Dhruva Chakrabarti
Date: 2021-06-29T13:47:24-07:00
New Revision: 2240b41ee4f30fe938975677a0a5a2c5c26d271b

URL: https://github.com/llvm/llvm-project/commit/2240b41ee4f30fe938975677a0a5a2c5c26d271b
DIFF: https://github.com/llvm/llvm-project/commit/2240b41ee4f30fe938975677a0a5a2c5c26d271b.diff

LOG: [libomptarget] [amdgpu] Fix default setting of max flat workgroup size

When the max flat workgroup size is not specified (KernDescVal.WG_Size
is 0), it was set to the default workgroup size. This prevented kernel
launch with a workgroup size larger than the default. The fix is to
treat a size of 0 as unspecified and skip the clamp in getLaunchVals.
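
For illustration only, the difference in behavior can be sketched as a
pair of standalone functions. This is not the plugin code: the function
names and the constant kAssumedDefaultWGSize (256 here) are hypothetical
stand-ins, and only the guarded comparison mirrors the condition in the
patch.

```cpp
// Hypothetical stand-in for RTLDeviceInfoTy::Default_WG_Size; the value
// is an assumption chosen for the example.
constexpr int kAssumedDefaultWGSize = 256;

// Sketch of the old behavior: an unspecified limit (0) was replaced by
// the default at load time, so the later clamp capped every launch at
// the default.
int clampOld(int threadsPerGroup, int wgSizeFromKernDesc) {
  int constWGSize =
      wgSizeFromKernDesc == 0 ? kAssumedDefaultWGSize : wgSizeFromKernDesc;
  if (threadsPerGroup > constWGSize)
    threadsPerGroup = constWGSize;
  return threadsPerGroup;
}

// Sketch of the new behavior: 0 is passed through unchanged and the
// clamp is skipped unless a positive limit was specified.
int clampNew(int threadsPerGroup, int constWGSize) {
  if (constWGSize > 0 && threadsPerGroup > constWGSize)
    threadsPerGroup = constWGSize;
  return threadsPerGroup;
}
```

Under these assumptions, a request for 512 threads with no limit
specified is reduced to 256 by clampOld but left at 512 by clampNew; a
positive limit is honored by both.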

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D105073

Added: 
    

Modified: 
    openmp/libomptarget/plugins/amdgpu/src/rtl.cpp

Removed: 
    


################################################################################
diff  --git a/openmp/libomptarget/plugins/amdgpu/src/rtl.cpp b/openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
index 9a07d26546bbc..aaa68121db105 100644
--- a/openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
+++ b/openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
@@ -1707,10 +1707,9 @@ __tgt_target_table *__tgt_rtl_load_binary_locked(int32_t device_id,
       // Get ExecMode
       ExecModeVal = KernDescVal.Mode;
       DP("ExecModeVal %d\n", ExecModeVal);
-      if (KernDescVal.WG_Size == 0) {
-        KernDescVal.WG_Size = RTLDeviceInfoTy::Default_WG_Size;
-        DP("Setting KernDescVal.WG_Size to default %d\n", KernDescVal.WG_Size);
-      }
+      // If KernDescVal.WG_Size is 0, it is equivalent to not
+      // specified. Hence, max_flat_workgroup_size is filtered out in
+      // getLaunchVals
       WGSizeVal = KernDescVal.WG_Size;
       DP("WGSizeVal %d\n", WGSizeVal);
       check("Loading KernDesc computation property", err);
@@ -1920,7 +1919,7 @@ void getLaunchVals(int &threadsPerGroup, int &num_groups, int ConstWGSize,
     }
   }
   // check flat_max_work_group_size attr here
-  if (threadsPerGroup > ConstWGSize) {
+  if (ConstWGSize > 0 && threadsPerGroup > ConstWGSize) {
     threadsPerGroup = ConstWGSize;
     DP("Reduced threadsPerGroup to flat-attr-group-size limit %d\n",
        threadsPerGroup);


        

