[PATCH] D103225: [AMDGPU] Replace non-kernel function uses of LDS globals by pointers.
Brian Sumner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 8 07:24:48 PDT 2021
b-sumner added a comment.
Two approaches for limiting the stores to lane 0 of each wave:
1. Write 1 to the exec mask, do the store, then write -1 back to the exec mask. This works because the exec mask is -1 (all lanes enabled) at the start of the wave, which is when this happens.
2. Check for lane == 0 and branch. The lane can be computed by:
   a) wave64: __builtin_amdgcn_mbcnt_hi(~0u, __builtin_amdgcn_mbcnt_lo(~0u, 0u))
   b) wave32: __builtin_amdgcn_mbcnt_lo(~0u, 0u)
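
To make approach 2 concrete, here is a minimal host-side sketch in C of what the two mbcnt expressions compute. It is only a model, not device code: model_mbcnt_lo, model_mbcnt_hi, lanes_below and the other helper names are hypothetical, and the sketch assumes each mbcnt instruction adds to its second operand the number of mask bits belonging to lanes below the current one (low 32 lanes for _lo, high 32 lanes for _hi). On the device you would call the builtins directly and branch on the result.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits in a 32-bit value (portable, no compiler builtins). */
static uint32_t popcount32(uint32_t x) {
    uint32_t n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* 64-bit mask with one bit set for every lane strictly below `lane`.
   Valid for lane in [0, 63]. */
static uint64_t lanes_below(unsigned lane) {
    return (1ull << lane) - 1;
}

/* Assumed model of mbcnt_lo: base + popcount(mask & lower-lane bits 31:0). */
static uint32_t model_mbcnt_lo(unsigned lane, uint32_t mask, uint32_t base) {
    return base + popcount32(mask & (uint32_t)lanes_below(lane));
}

/* Assumed model of mbcnt_hi: base + popcount(mask & lower-lane bits 63:32). */
static uint32_t model_mbcnt_hi(unsigned lane, uint32_t mask, uint32_t base) {
    return base + popcount32(mask & (uint32_t)(lanes_below(lane) >> 32));
}

/* wave64 form from the comment: mbcnt_hi(~0u, mbcnt_lo(~0u, 0u)). */
static uint32_t lane_id_wave64(unsigned lane) {
    return model_mbcnt_hi(lane, ~0u, model_mbcnt_lo(lane, ~0u, 0u));
}

/* wave32 form from the comment: mbcnt_lo(~0u, 0u). */
static uint32_t lane_id_wave32(unsigned lane) {
    return model_mbcnt_lo(lane, ~0u, 0u);
}

/* Number of lanes in one wave that would pass the `lane == 0` guard
   and therefore execute the store. wave_size must be 32 or 64. */
static unsigned count_storing_lanes(unsigned wave_size) {
    unsigned count = 0;
    for (unsigned lane = 0; lane < wave_size; lane++) {
        uint32_t id = (wave_size == 64) ? lane_id_wave64(lane)
                                        : lane_id_wave32(lane);
        if (id == 0)
            count++;
    }
    return count;
}

/* Self-check: the computed id matches the lane index for every lane. */
static void check_lane_model(void) {
    for (unsigned lane = 0; lane < 64; lane++)
        assert(lane_id_wave64(lane) == lane);
    for (unsigned lane = 0; lane < 32; lane++)
        assert(lane_id_wave32(lane) == lane);
}
```

Under this model, with an all-ones mask the expressions reduce to the lane index, so guarding the store with lane == 0 lets exactly one lane per wave through.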
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103225/new/
https://reviews.llvm.org/D103225