[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer

Fri Mar 19 11:25:06 PDT 2021

b-sumner added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:299
+        B.CreateCall(UpdateDPP,
+                     {Identity, V, B.getInt32(DPP::ROW_XMASK0 | 1 << Idx),
+                      B.getInt32(0xf), B.getInt32(0xf), B.getFalse()}));
----------------
This requires all lanes to be active.  Are we guaranteed that the work group size will be an integer multiple of the wave size?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98953/new/

https://reviews.llvm.org/D98953