[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 23 01:14:13 PDT 2021


critson accepted this revision.
critson added a comment.
This revision is now accepted and ready to land.

LGTM - this seems like a good use of GFX10 row_xmask.

Please see minor comments.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:358
     // 48..63).
     Value *const PermX = B.CreateIntrinsic(
         Intrinsic::amdgcn_permlanex16, {},
----------------
Should this V_PERMLANEX16 be guarded as well?
Or at least have an assert?
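
To illustrate the kind of guard meant here, the following is a minimal standalone sketch, not the code in the patch. `MockSubtarget`, `CrossRowStrategy`, and `pickCrossRowStrategy` are hypothetical names invented for this example, and the read-lane fallback is only an assumed alternative. The one thing it is meant to show is that the V_PERMLANEX16 path is chosen only when the subtarget reports support for it.

```cpp
#include <cassert>

// Hypothetical stand-in for a subtarget feature query. This is NOT the real
// GCNSubtarget class; it only models "does the target have V_PERMLANEX16?".
struct MockSubtarget {
  bool IsGFX10Plus;
  bool hasPermLaneX16() const { return IsGFX10Plus; }
};

// Hypothetical choices for combining partial results across rows.
enum class CrossRowStrategy { PermLaneX16, ReadLane };

// Select the permlanex16 path only when the feature is available; otherwise
// fall back to an (assumed) read-lane based path.
CrossRowStrategy pickCrossRowStrategy(const MockSubtarget &ST) {
  if (ST.hasPermLaneX16())
    return CrossRowStrategy::PermLaneX16;
  return CrossRowStrategy::ReadLane;
}

// Assert-only variant: callers that have already selected the permlanex16
// path can document the requirement instead of branching.
void checkPermLaneX16Allowed(const MockSubtarget &ST) {
  assert(ST.hasPermLaneX16() && "V_PERMLANEX16 requires GFX10 or later");
  (void)ST; // Avoid an unused-parameter warning when asserts are disabled.
}
```

Either form would work; the assert is the lighter option if the surrounding code already guarantees a GFX10 target.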


================
Comment at: llvm/lib/Target/AMDGPU/SIDefines.h:674
 
 enum DppCtrl : unsigned {
   QUAD_PERM_FIRST   = 0,
----------------
Is it worth silencing linting of this enum?



================
Comment at: llvm/lib/Target/AMDGPU/SIDefines.h:710
   ROW_NEWBCAST_LAST = 0x15F,
+  ROW_SHARE0        = 0x150,
   ROW_SHARE_FIRST   = 0x150,
----------------
ROW_SHARE0 is defined but does not appear to be used?
Not that I am against it existing.
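
For reference, a small standalone sketch of what the alias amounts to. Only the three enumerators quoted in the hunk above are reproduced, with the values shown there; the rest of the enum is omitted, and `isRowShare0` is a hypothetical helper added purely to give the name a use.

```cpp
// Excerpt only: the three enumerators quoted in the review hunk.
enum DppCtrl : unsigned {
  ROW_NEWBCAST_LAST = 0x15F,
  ROW_SHARE0        = 0x150,
  ROW_SHARE_FIRST   = 0x150,
};

// ROW_SHARE0 is a second name for the same encoding as ROW_SHARE_FIRST.
static_assert(ROW_SHARE0 == ROW_SHARE_FIRST,
              "ROW_SHARE0 aliases the first row_share control value");

// Hypothetical helper, not part of the patch: shows the named constant at a
// call site instead of a bare 0x150.
constexpr bool isRowShare0(unsigned Ctrl) { return Ctrl == ROW_SHARE0; }
```

Since both names resolve to the same value, keeping ROW_SHARE0 costs nothing at runtime; the only question is whether an unused name is worth carrying.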


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98953/new/

https://reviews.llvm.org/D98953


