[PATCH] D131276: AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true

Changpeng Fang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 5 11:28:12 PDT 2022


cfang created this revision.
cfang added reviewers: bcahoon, arsenm, b-sumner, AMDGPU.
Herald added subscribers: kosarev, foad, jdoerfert, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
Herald added a project: All.
cfang requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

Under code object version 5, __ockl_get_local_size returns the value computed by the expression:
 workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder
For functions with the attribute uniform-work-group-size=true, we can evaluate workgroup_id < hidden_block_count
 as true, and thus __ockl_get_local_size returns hidden_group_size.
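
As an illustration of the fold described above, here is a minimal C++ sketch for a single dimension. The struct HiddenArgs, its field widths, and both function names are hypothetical and exist only for this example; they are not taken from the patch or from the device library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the hidden kernel arguments for one dimension.
// Field widths are illustrative only.
struct HiddenArgs {
  uint32_t block_count; // number of full-size workgroups in this dimension
  uint32_t group_size;  // size of a full workgroup
  uint32_t remainder;   // size of the trailing partial workgroup, 0 if none
};

// General form: the select described for code object version 5.
uint32_t localSizeGeneral(const HiddenArgs &args, uint32_t workgroup_id) {
  return workgroup_id < args.block_count ? args.group_size : args.remainder;
}

// Folded form under uniform-work-group-size=true: the comparison is
// assumed to be true, so only the group size is read.
uint32_t localSizeUniform(const HiddenArgs &args) {
  assert(args.remainder == 0 && "uniform workgroups leave no partial group");
  return args.group_size;
}
```

For a uniform launch every workgroup id is below block_count, so both forms agree and the compare and select can be dropped.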

With uniform-work-group-size=true, this patch also sets all remainders to zero, and if
reqd_work_group_size metadata is present, it also sets the workgroup size to the required value from the metadata.
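
The two additional folds can be sketched the same way. This is a simplified model of the decision only, not the pass itself: KernelInfo, Folded, and foldImplicitArgs are hypothetical names, and the real patch operates on LLVM IR loads of the implicit arguments rather than on plain structs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the kernel properties the fold depends on.
struct KernelInfo {
  bool uniform_work_group_size;     // attribute uniform-work-group-size=true
  bool has_reqd_work_group_size;    // reqd_work_group_size metadata present
  uint32_t reqd_work_group_size[3]; // required sizes for x, y, z
};

// What becomes a compile-time constant for one dimension.
struct Folded {
  bool remainder_known;
  uint32_t remainder;
  bool group_size_known;
  uint32_t group_size;
};

Folded foldImplicitArgs(const KernelInfo &kernel, unsigned dim) {
  assert(dim < 3 && "dimension must be 0, 1, or 2");
  Folded result = {false, 0, false, 0};
  if (!kernel.uniform_work_group_size)
    return result; // nothing is folded in this sketch
  // Uniform workgroups: there is never a partial trailing group.
  result.remainder_known = true;
  result.remainder = 0;
  // With a required size, the group size is also a known constant.
  if (kernel.has_reqd_work_group_size) {
    result.group_size_known = true;
    result.group_size = kernel.reqd_work_group_size[dim];
  }
  return result;
}
```

For example, a kernel with a required size of 64 x 2 x 1 would get remainder 0 and group size 2 for dimension 1 under this model.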


https://reviews.llvm.org/D131276

Files:
  llvm/lib/Target/AMDGPU/AMDGPULowerKernelAttributes.cpp
  llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D131276.450333.patch
Type: text/x-patch
Size: 19720 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220805/e4b0bda0/attachment.bin>

