[PATCH] D98832: [libomptarget] Tune the number of teams and threads for kernel launch.

Tue Jun 29 06:25:51 PDT 2021

JonChesterfield added a comment.

This stuff definitely needs to be tested.

================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:99
 /// For AMDGPU GPUs
 static constexpr unsigned AMDGPUGpuGridValues[] = {
     448,       // GV_Threads
----------------
Also, this should be a struct, not an array of unsigned with enums for looking up fields

================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:113
     1024 / 64, // GV_Max_WG_Size / GV_WarpSize
-    63         // GV_Warp_Size_Log2_MaskL
+    63,        // GV_Warp_Size_Log2_MaskL
+    64 * 1024, // GV_Vector_Register_Count
----------------
I don't think these should be added to grid values.

They're not used by clang or LLVM, so don't need to be shared. I'm not convinced they're architecture independent constants (I think something has 32k LDS), and it looks like the plugin could use values discovered at runtime.

I think we're better off minimising the quantity of shared magic numbers so suggest we write the 64k / 3200 / 64 k as constants in the plugin instead.

================
Comment at: openmp/libomptarget/plugins/amdgpu/src/rtl.cpp:506
   static const int DefaultNumTeams = 128;
-  static const int Max_Teams =
-      llvm::omp::AMDGPUGpuGridValues[llvm::omp::GVIDX::GV_Max_Teams];
   static const int Warp_Size =
       llvm::omp::AMDGPUGpuGridValues[llvm::omp::GVIDX::GV_Warp_Size];
----------------
this is (usually) 32 for gfx10, and the plugin is architecture-agnostic, so this probably can't be a compile time enum

should all be unsigned too, negative numbers don't make sense for any of these

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98832/new/

https://reviews.llvm.org/D98832