[PATCH] D29974: [AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 14 17:02:40 PST 2017


rampitec created this revision.
Herald added subscribers: tpr, tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, arsenm.

This patch corrects the maximum number of workgroups per CU for large
workgroups (more than 128 work items). The value feeds into the
occupancy calculation with respect to LDS size.
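
As a rough illustration of the arithmetic, the sketch below compares the
old and new return values outside of LLVM. It assumes a 64-lane wavefront
and skips the FeatureGCN check. The names wavesPerWorkGroup,
maxWorkGroupsPerCUOld, maxWorkGroupsPerCUNew and checkExamples are
hypothetical stand-ins, not functions from AMDGPUBaseInfo.cpp; the
constants 40 and 16 are taken from the diff below.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for getWavesPerWorkGroup(): rounds the workgroup
// size up to whole wavefronts. The 64-lane wavefront is an assumption.
unsigned wavesPerWorkGroup(unsigned FlatWorkGroupSize) {
  const unsigned WavefrontSize = 64;
  unsigned Waves = (FlatWorkGroupSize + WavefrontSize - 1) / WavefrontSize;
  return std::max(Waves, 1u); // clamp so a size of 0 cannot divide by zero
}

// Behaviour before the patch: any multi-wave workgroup gets a limit of 16.
unsigned maxWorkGroupsPerCUOld(unsigned FlatWorkGroupSize) {
  return wavesPerWorkGroup(FlatWorkGroupSize) == 1 ? 40 : 16;
}

// Behaviour after the patch: 40 divided by the waves per workgroup,
// capped at 16 workgroups.
unsigned maxWorkGroupsPerCUNew(unsigned FlatWorkGroupSize) {
  unsigned N = wavesPerWorkGroup(FlatWorkGroupSize);
  if (N == 1)
    return 40;
  N = 40 / N;
  return std::min(N, 16u);
}

// A few spot checks showing where the two versions diverge.
void checkExamples() {
  assert(maxWorkGroupsPerCUOld(128) == 16 && maxWorkGroupsPerCUNew(128) == 16);
  assert(maxWorkGroupsPerCUOld(192) == 16 && maxWorkGroupsPerCUNew(192) == 13);
  assert(maxWorkGroupsPerCUOld(256) == 16 && maxWorkGroupsPerCUNew(256) == 10);
}
```

Under these assumptions the two versions agree up to 128 work items and
diverge above that, which matches the "more than 128" wording in the
description.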


Repository:
  rL LLVM

https://reviews.llvm.org/D29974

Files:
  lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
  test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll


Index: test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll
===================================================================
--- test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll
+++ test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll
@@ -69,7 +69,8 @@
 }
 
 ; ALL-LABEL: @occupancy_0(
-; ALL: alloca [5 x i32]
+; CI-NOT: alloca [5 x i32]
+; SI: alloca [5 x i32]
 define void @occupancy_0(i32 addrspace(1)* nocapture %out, i32 addrspace(1)* nocapture %in) #3 {
 entry:
   %stack = alloca [5 x i32], align 4
@@ -91,7 +92,8 @@
 }
 
 ; ALL-LABEL: @occupancy_max(
-; ALL: alloca [5 x i32]
+; CI-NOT: alloca [5 x i32]
+; SI: alloca [5 x i32]
 define void @occupancy_max(i32 addrspace(1)* nocapture %out, i32 addrspace(1)* nocapture %in) #4 {
 entry:
   %stack = alloca [5 x i32], align 4
Index: lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
===================================================================
--- lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -151,7 +151,11 @@
                                unsigned FlatWorkGroupSize) {
   if (!Features.test(FeatureGCN))
     return 8;
-  return getWavesPerWorkGroup(Features, FlatWorkGroupSize) == 1 ? 40 : 16;
+  unsigned N = getWavesPerWorkGroup(Features, FlatWorkGroupSize);
+  if (N == 1)
+    return 40;
+  N = 40 / N;
+  return std::min(N, 16u);
 }
 
 unsigned getMaxWavesPerCU(const FeatureBitset &Features) {


