[Openmp-commits] [PATCH] D99003: [libomptarget] [amdgpu] Change default number of teams per computation unit

Dhruva Chakrabarti via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Mar 19 19:00:28 PDT 2021

dhruvachak created this revision.
dhruvachak added reviewers: JonChesterfield, jdoerfert, ronlieb.
Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
dhruvachak requested review of this revision.
Herald added subscribers: openmp-commits, wdng.
Herald added a project: OpenMP.

This patch is related to https://reviews.llvm.org/D98832. Based on the discussion there, I split the change to the default number of teams out into this separate patch. It raises the default number of teams per compute unit (CU) from 1 to 4, which provides more wavefronts for hiding latency. The change improves performance for some programs, including gains of 20-50% on some Stream benchmarks.

Repository:
  rG LLVM Github Monorepo



Index: openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
--- openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
+++ openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
@@ -70,6 +70,10 @@
+// Heuristic parameters used for kernel launch
+// Number of teams per CU to allow scheduling flexibility
+static const unsigned DefaultTeamsPerCU = 4;
 int print_kernel_trace;
 // Size of the target call stack struture
@@ -790,7 +794,7 @@
   } else {
     char *TeamsPerCUEnvStr = getenv("OMP_TARGET_TEAMS_PER_PROC");
-    int TeamsPerCU = 1; // default number of teams per CU is 1
+    int TeamsPerCU = DefaultTeamsPerCU;
     if (TeamsPerCUEnvStr) {
       TeamsPerCU = std::stoi(TeamsPerCUEnvStr);
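
For readers who want to see how the new default and the environment override interact, here is a minimal standalone sketch. It is not part of the patch. The helper names resolveTeamsPerCU and defaultNumTeams are hypothetical, the fallback to the default on an unparsable or non-positive value is an assumption (the patch calls std::stoi directly, which throws on invalid input), and the linear scaling of the team count with the number of CUs is also an assumption about how the plugin uses the value.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Mirrors the constant introduced by the patch.
static const unsigned DefaultTeamsPerCU = 4;

// Hypothetical helper (not in the patch): returns the teams-per-CU value,
// taking an override string such as the value of OMP_TARGET_TEAMS_PER_PROC.
// Assumption: fall back to the default when the string is null, cannot be
// parsed, or is not positive.
int resolveTeamsPerCU(const char *EnvStr) {
  int TeamsPerCU = DefaultTeamsPerCU;
  if (!EnvStr)
    return TeamsPerCU;
  try {
    int Parsed = std::stoi(EnvStr);
    if (Parsed > 0)
      TeamsPerCU = Parsed;
  } catch (const std::exception &) {
    // std::stoi throws std::invalid_argument or std::out_of_range;
    // keep the default in either case.
  }
  return TeamsPerCU;
}

// Hypothetical helper: assumes the default number of teams scales linearly
// with the number of compute units on the device.
int defaultNumTeams(int NumCUs, int TeamsPerCU) {
  assert(NumCUs > 0 && TeamsPerCU > 0);
  return NumCUs * TeamsPerCU;
}
```

Under these assumptions, a device with 60 CUs (an illustrative figure, not taken from the patch) would launch 240 teams by default instead of 60, and passing the result of getenv("OMP_TARGET_TEAMS_PER_PROC") to resolveTeamsPerCU would still let a user restore the old behavior by setting the variable to 1.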

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D99003.332057.patch
Type: text/x-patch
Size: 798 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20210320/f3f53043/attachment.bin>
