[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 19 11:37:13 PDT 2021


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:299
+        B.CreateCall(UpdateDPP,
+                     {Identity, V, B.getInt32(DPP::ROW_XMASK0 | 1 << Idx),
+                      B.getInt32(0xf), B.getInt32(0xf), B.getFalse()}));
----------------
foad wrote:
> b-sumner wrote:
> > This requires all lanes to be active. Are we guaranteed that the work group size will be an integer multiple of the wave size?
> The reduction or scan runs in whole wave mode. All lanes are active.
... and lanes that weren't active to start with are set to an appropriate identity value for the operation.
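
To illustrate the point, here is a rough host-side sketch. It is not the code from the patch. It models an add reduction over a single wave in which originally inactive lanes are first set to the identity (0 for add) and every lane then takes part in xor-mask exchanges. The names `WaveSize`, `waveReduceAdd` and `exampleTenActiveLanes` are made up for this example, a 64-lane wave is assumed, and the xor exchange is applied across the whole wave for simplicity, whereas the `DPP::ROW_XMASK0 | 1 << Idx` control in the diff only exchanges within a row.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption for this sketch: a 64-lane wave.
constexpr unsigned WaveSize = 64;

// Hypothetical model of an add reduction run with all lanes enabled.
// ExecMask has bit N set if lane N was active before the reduction started.
uint32_t waveReduceAdd(const std::array<uint32_t, WaveSize> &Values,
                       uint64_t ExecMask) {
  const uint32_t Identity = 0; // identity value for integer add

  // Lanes that were not active to start with contribute the identity.
  std::array<uint32_t, WaveSize> V{};
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane)
    V[Lane] = ((ExecMask >> Lane) & 1) ? Values[Lane] : Identity;

  // Butterfly: at step Idx each lane combines with lane (Lane ^ (1 << Idx)).
  for (unsigned Idx = 0; (1u << Idx) < WaveSize; ++Idx) {
    std::array<uint32_t, WaveSize> Next{};
    for (unsigned Lane = 0; Lane < WaveSize; ++Lane)
      Next[Lane] = V[Lane] + V[Lane ^ (1u << Idx)];
    V = Next;
  }

  // After the last step every lane holds the same total.
  for (unsigned Lane = 1; Lane < WaveSize; ++Lane)
    assert(V[Lane] == V[0]);
  return V[0];
}

// Usage example: only lanes 0-9 are active and hold the values 1..10.
uint32_t exampleTenActiveLanes() {
  std::array<uint32_t, WaveSize> Values{};
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane)
    Values[Lane] = Lane + 1;
  return waveReduceAdd(Values, 0x3ffull);
}
```

In this model the result depends only on the originally active lanes, so a partially filled wave gives the same answer as a full one whose extra lanes hold the identity.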


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98953/new/

https://reviews.llvm.org/D98953


