[all-commits] [llvm/llvm-project] 3ae4c3: AMDGPU: Implicit kernel arguments related optimiz...
Changpeng Fang via All-commits
all-commits at lists.llvm.org
Tue Sep 20 17:27:09 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 3ae4c3589ec7336d363fc1779c4a99360164c8f4
https://github.com/llvm/llvm-project/commit/3ae4c3589ec7336d363fc1779c4a99360164c8f4
Author: Changpeng Fang <changpeng.fang at amd.com>
Date: 2022-09-20 (Tue, 20 Sep 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPULowerKernelAttributes.cpp
A llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll
Log Message:
-----------
AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true
Summary:
Under code object version 5, ockl_get_local_size returns the value computed by the expression:
workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder
For functions with the attribute uniform-work-group-size=true, we can evaluate workgroup_id < hidden_block_count
as true, and thus hidden_group_size is returned for ockl_get_local_size.
With uniform-work-group-size=true, this change also sets all remainders to zero, and if
reqd_work_group_size metadata is present, it also sets the work-group size to the required value from the metadata.
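
The folding described above can be illustrated with a small host-side model. This is only a sketch, not the code in AMDGPULowerKernelAttributes.cpp: the struct HiddenArgs, the function names, and the use of 0 to mean "no reqd_work_group_size metadata" are all assumptions made for illustration. Only the selection expression itself comes from the log message.

```cpp
#include <cstdint>

// Hypothetical model of the hidden kernel arguments for one dimension.
// Field names mirror the terms used in the log message; the layout is
// an assumption of this sketch, not the real implicit-argument layout.
struct HiddenArgs {
  uint32_t block_count; // hidden_block_count
  uint32_t group_size;  // hidden_group_size
  uint32_t remainder;   // hidden_remainder
};

// General form: the expression quoted in the log message.
inline uint32_t localSizeGeneral(uint32_t workgroup_id, const HiddenArgs &a) {
  return workgroup_id < a.block_count ? a.group_size : a.remainder;
}

// Folded form under uniform-work-group-size=true: the comparison is
// treated as true, so the group size is returned unconditionally.
// reqd_size == 0 means "no reqd_work_group_size metadata" (an assumption
// of this sketch); otherwise the required value is used directly.
inline uint32_t localSizeUniform(const HiddenArgs &a, uint32_t reqd_size = 0) {
  return reqd_size != 0 ? reqd_size : a.group_size;
}

// Under the same attribute, every remainder is taken to be zero.
inline uint32_t remainderUniform() { return 0; }
```

For a dispatch whose grid is an exact multiple of the work-group size, both forms agree for every valid workgroup_id, which is why the comparison can be dropped.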
Reviewers:
arsenm and bcahoon
Differential Revision:
https://reviews.llvm.org/D131276