[PATCH] D29423: [AMDGPU] Account workgroup size in LDS occupancy limits

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 1 13:58:04 PST 2017


rampitec created this revision.
Herald added a reviewer: tstellarAMD.
Herald added subscribers: tpr, tony-tye, yaxunl, nhaehnle, wdng, kzhuravl.

Functions that map LDS use to occupancy return results for a workgroup
of 64 workitems. These numbers have to be adjusted for bigger workgroups.
For example, a workgroup of size 256 already occupies 4 waves just by
itself. Given that all LDS usage numbers in the compiler are per
workgroup, occupancy must be multiplied by 4 in this case. Each group of
64 workitems is still limited by the same number, but 4 subgroups of 64
workitems each can afford 4 times more LDS to get the same occupancy.
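
The arithmetic described above can be sketched roughly as follows. This is
only an illustration, not the code from the patch: the function names and
the constants (64 KiB of LDS, at most 10 waves, wave size 64) are
assumptions chosen for the example.

```cpp
#include <cstdint>

// Illustrative constants only; these are assumptions, not values from the patch.
constexpr unsigned kWaveSize = 64;          // workitems per wave
constexpr unsigned kMaxWaves = 10;          // assumed upper bound on occupancy
constexpr std::uint32_t kLdsBytes = 65536;  // assumed total LDS size in bytes

// Hypothetical helper: number of waves one workgroup occupies by itself.
constexpr unsigned wavesPerWorkgroup(unsigned wgSize) {
  return (wgSize + kWaveSize - 1) / kWaveSize;
}

// Hypothetical helper: occupancy in waves for a kernel that uses
// ldsBytesPerGroup bytes of LDS per workgroup of wgSize workitems.
constexpr unsigned occupancyWithLds(std::uint32_t ldsBytesPerGroup,
                                    unsigned wgSize) {
  // How many workgroups fit into LDS at the same time (avoid dividing by 0).
  std::uint32_t groupsThatFit =
      kLdsBytes / (ldsBytesPerGroup ? ldsBytesPerGroup : 1u);
  // Each resident workgroup contributes wavesPerWorkgroup(wgSize) waves.
  std::uint64_t waves =
      std::uint64_t(groupsThatFit) * wavesPerWorkgroup(wgSize);
  // Clamp to the range [1, kMaxWaves].
  if (waves > kMaxWaves)
    return kMaxWaves;
  if (waves < 1)
    return 1;
  return unsigned(waves);
}
```

Under these assumed constants, a kernel using 32768 bytes of LDS per
workgroup gets an occupancy of 2 at workgroup size 64 and 8 at workgroup
size 256, which is the factor of 4 from the example above.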

In addition, this change initializes the LDS size in the subtarget to a
real value for SI+ targets. This is required since the LDS size is a
variable in these calculations.


Repository:
  rL LLVM

https://reviews.llvm.org/D29423

Files:
  lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
  lib/Target/AMDGPU/AMDGPUSubtarget.cpp
  lib/Target/AMDGPU/AMDGPUSubtarget.h
  lib/Target/AMDGPU/GCNSchedStrategy.cpp
  test/CodeGen/AMDGPU/indirect-private-64.ll
  test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D29423.86711.patch
Type: text/x-patch
Size: 10175 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170201/5f310e55/attachment.bin>

