[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 19 11:37:13 PDT 2021
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:299
+ B.CreateCall(UpdateDPP,
+ {Identity, V, B.getInt32(DPP::ROW_XMASK0 | 1 << Idx),
+ B.getInt32(0xf), B.getInt32(0xf), B.getFalse()}));
----------------
foad wrote:
> b-sumner wrote:
> > This requires all lanes to be active. Are we guaranteed that the work group size will be an integer multiple of the wave size?
> The reduction or scan runs in whole wave mode. All lanes are active.
... and lanes that weren't active to start with are set to an appropriate identity value for the operation.
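To illustrate why the identity value makes the butterfly safe, here is a minimal host-side sketch in plain C++, not the code from the patch. It assumes a 16-lane row, an add reduction (identity 0), and that each row_xmask step makes a lane read from lane ^ (1 << Idx). The names kRowSize, Row and reduceAddRow are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one 16-lane DPP row. This is a host-side
// simulation for illustration, not the LLVM or hardware API.
constexpr std::size_t kRowSize = 16;
using Row = std::array<std::uint32_t, kRowSize>;

// Add-reduce one row with an XOR butterfly.
// Bit i of activeMask says whether lane i was active before entering
// whole wave mode. Inactive lanes are overwritten with the identity
// for add (0), so they take part in every step without changing the sum.
Row reduceAddRow(Row values, std::uint16_t activeMask) {
  constexpr std::uint32_t identity = 0;
  for (std::size_t lane = 0; lane < kRowSize; ++lane) {
    if (((activeMask >> lane) & 1u) == 0)
      values[lane] = identity;
  }

  // Four steps with masks 1, 2, 4, 8 (assumed to mirror 1 << Idx).
  for (unsigned idx = 0; idx < 4; ++idx) {
    Row partner{};
    // Every lane reads its partner's value from the previous step.
    for (std::size_t lane = 0; lane < kRowSize; ++lane)
      partner[lane] = values[lane ^ (std::size_t{1} << idx)];
    // Every lane combines its own value with the partner's value.
    for (std::size_t lane = 0; lane < kRowSize; ++lane)
      values[lane] += partner[lane];
  }
  return values;  // every lane now holds the sum over the active lanes
}
```

In this model, with only lanes 0 and 2 active and holding 1 and 3, every lane ends up holding 4. The zeros in the other lanes flow through the butterfly without affecting the result, which is the property the reply above relies on.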
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98953/new/
https://reviews.llvm.org/D98953