[all-commits] [llvm/llvm-project] fd1f8c: [AMDGPU] Limit TID / wavefrontsize uniformness to ...

Tue Aug 30 12:22:34 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fd1f8c85f2c0b989d1ac3a05b920f5b5fa355645
      https://github.com/llvm/llvm-project/commit/fd1f8c85f2c0b989d1ac3a05b920f5b5fa355645
  Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
  Date:   2022-08-30 (Tue, 30 Aug 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
    M llvm/test/CodeGen/AMDGPU/uniform-load-from-tid.ll

  Log Message:
  -----------
  [AMDGPU] Limit TID / wavefrontsize uniformness to 1D kernels

If a kernel has uneven dimensions, the value of workitem-id-x divided by
the wavefrontsize can be non-uniform. For example, dimensions (65, 2)
will have workitems with IDs (64, 0) and (0, 1) packed into the same
wave, which gives 1 and 0 respectively after the division by 64.
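
The arithmetic in that example can be checked with a small standalone
sketch. It is not code from the patch: the helper name
isTidXDivWaveSizeUniform is hypothetical, and it assumes workitems are
packed into waves in flattened order with X varying fastest, which is
the packing the (65, 2) example implies.

```cpp
#include <cassert>

// Hypothetical helper, not part of LLVM. Returns true if
// (workitem-id-x / WaveSize) has the same value in every lane of every wave.
// Assumption: flattened workitem id = Y * SizeX + X, and consecutive
// flattened ids fill consecutive lanes of a wave.
bool isTidXDivWaveSizeUniform(unsigned SizeX, unsigned SizeY,
                              unsigned WaveSize) {
  assert(SizeX > 0 && SizeY > 0 && WaveSize > 0 && "sizes must be non-zero");
  const unsigned Total = SizeX * SizeY;
  for (unsigned WaveStart = 0; WaveStart < Total; WaveStart += WaveSize) {
    // Value computed by the first lane of this wave.
    const unsigned First = (WaveStart % SizeX) / WaveSize;
    // Compare every other active lane of the wave against it.
    for (unsigned Id = WaveStart; Id < Total && Id < WaveStart + WaveSize;
         ++Id) {
      const unsigned TidX = Id % SizeX;
      if (TidX / WaveSize != First)
        return false;
    }
  }
  return true;
}
```

With a wave size of 64, the check fails for (65, 2), because flattened
ids 64 and 65 share a wave but map to X = 64 and X = 0. It passes for
the 1D size (65, 1) and for (128, 2).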

Unfortunately, this limits the optimization to OpenCL only, and only if
the reqd_work_group_size attribute is set. This patch limits it to 1D
kernels, although it should also be possible to perform this optimization
if the size of the X dimension is a power of 2; we just do not currently
have the infrastructure to query it.

Note that the presence of the amdgpu-no-workitem-id-y attribute does not
help, as it only hints at the lack of a workitem-id-y query, not the absence
of an actual 2nd dimension, and therefore affects just the SGPR allocation.

Differential Revision: https://reviews.llvm.org/D132879