[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 19 11:36:29 PDT 2021


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:299
+        B.CreateCall(UpdateDPP,
+                     {Identity, V, B.getInt32(DPP::ROW_XMASK0 | 1 << Idx),
+                      B.getInt32(0xf), B.getInt32(0xf), B.getFalse()}));
----------------
b-sumner wrote:
> This requires all lanes to be active. Are we guaranteed that the work group size will be an integer multiple of the wave size?
The reduction or scan runs in whole wave mode, so all lanes are active.
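
To make the concern about active lanes concrete, here is a minimal host-side sketch of an XOR-butterfly reduction over one 16-lane row. It is only a model, not the optimizer's code: the names Row, xorStep and reduceRow are hypothetical, integer addition stands in for the atomic operation, and the four steps with masks 1 << Idx are assumed from the quoted snippet. Each step reads from the lane whose index differs in one bit, so the result is only a full row reduction if every lane takes part.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of one DPP row: 16 lanes, all assumed active.
constexpr unsigned RowSize = 16;
using Row = std::array<uint32_t, RowSize>;

// One butterfly step: every lane adds the value held by lane (lane ^ mask).
// This models a row_xmask-style read; it is not the hardware semantics
// for inactive lanes, which this sketch does not cover.
Row xorStep(const Row &v, unsigned mask) {
  assert(mask < RowSize && "mask must stay within the row");
  Row out{};
  for (unsigned lane = 0; lane < RowSize; ++lane)
    out[lane] = v[lane] + v[lane ^ mask];
  return out;
}

// Four steps with masks 1, 2, 4, 8 (assumed from `1 << Idx` in the patch).
// Afterwards every lane holds the sum of the whole row.
Row reduceRow(Row v) {
  for (unsigned idx = 0; idx < 4; ++idx)
    v = xorStep(v, 1u << idx);
  return v;
}
```

In this model, a row holding the values 0 through 15 ends with 120 in every lane. If a lane were skipped in any step, its partners would miss its contribution, which is why the guarantee that all lanes are active matters.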


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98953/new/

https://reviews.llvm.org/D98953


