[all-commits] [llvm/llvm-project] 3ae4c3: AMDGPU: Implicit kernel arguments related optimiz...

Changpeng Fang via All-commits all-commits at lists.llvm.org
Tue Sep 20 17:27:09 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 3ae4c3589ec7336d363fc1779c4a99360164c8f4
      https://github.com/llvm/llvm-project/commit/3ae4c3589ec7336d363fc1779c4a99360164c8f4
  Author: Changpeng Fang <changpeng.fang at amd.com>
  Date:   2022-09-20 (Tue, 20 Sep 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULowerKernelAttributes.cpp
    A llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll

  Log Message:
  -----------
   AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true

 Summary:
   Under code object version 5, ockl_get_local_size returns the value of the expression:
     workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder
   For functions with the attribute uniform-work-group-size=true, we can evaluate
   workgroup_id < hidden_block_count as true, so ockl_get_local_size returns hidden_group_size.
   With uniform-work-group-size=true, this change also sets all remainders to zero, and if
   reqd_work_group_size metadata is present, it also sets the work-group size to the
   required value from that metadata.
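
The folding described in the summary can be sketched in plain C++ as follows. This is only an illustration of the selection expression for a single dimension, not the code in AMDGPULowerKernelAttributes.cpp: the struct ImplicitArgs, the function names, and the 32-bit field widths are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the hidden kernel arguments of one dimension.
// Field names follow the commit message; the widths are an assumption.
struct ImplicitArgs {
  uint32_t hidden_block_count; // number of full-size workgroups
  uint32_t hidden_group_size;  // size of a full workgroup
  uint32_t hidden_remainder;   // size of the trailing partial workgroup
};

// General form quoted in the summary: the last workgroup may be partial.
constexpr uint32_t localSize(const ImplicitArgs &Args, uint32_t WorkgroupId) {
  return WorkgroupId < Args.hidden_block_count ? Args.hidden_group_size
                                               : Args.hidden_remainder;
}

// With uniform-work-group-size=true, every workgroup is full-size, so the
// comparison is treated as always true and the select folds away.
constexpr uint32_t localSizeUniform(const ImplicitArgs &Args) {
  return Args.hidden_group_size;
}

// Hypothetical self-check: a grid of 256 work-items in groups of 64 has
// 4 full workgroups and no remainder, so both forms agree for every id.
inline void checkUniformFold() {
  const ImplicitArgs Args{4, 64, 0};
  for (uint32_t Id = 0; Id < Args.hidden_block_count; ++Id)
    assert(localSize(Args, Id) == localSizeUniform(Args));
}
```

Under the uniform assumption no valid workgroup id reaches the hidden_remainder branch, which is why the remainder can also be treated as zero.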

Reviewers:
  arsenm and bcahoon

Differential Revision:
  https://reviews.llvm.org/D131276



