[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 23 01:14:13 PDT 2021
critson accepted this revision.
critson added a comment.
This revision is now accepted and ready to land.
LGTM - this seems like a good use of GFX10 row_xmask.
Please see minor comments.
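For readers unfamiliar with row_xmask, here is a minimal host-side sketch of why it suits a reduction. It assumes a 16-lane row in which lane I reads from lane (I ^ Mask), and it uses an add as the combining operation. The names Row, rowXMask and reduceRow are hypothetical and are not taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side model of one 16-lane DPP row. This is an illustration only,
// not code from AMDGPUAtomicOptimizer.cpp.
constexpr std::size_t RowSize = 16;
using Row = std::array<std::uint32_t, RowSize>;

// Models a row_xmask read: lane I receives the value held by lane (I ^ Mask)
// of the same row.
Row rowXMask(const Row &In, unsigned Mask) {
  assert(Mask < RowSize && "xor mask must stay within the row");
  Row Out{};
  for (std::size_t Lane = 0; Lane < RowSize; ++Lane)
    Out[Lane] = In[Lane ^ Mask];
  return Out;
}

// Butterfly add-reduction: after the steps with masks 1, 2, 4 and 8,
// every lane holds the sum of the whole row.
Row reduceRow(Row V) {
  for (unsigned Mask = 1; Mask < RowSize; Mask <<= 1) {
    const Row Partner = rowXMask(V, Mask);
    for (std::size_t Lane = 0; Lane < RowSize; ++Lane)
      V[Lane] += Partner[Lane];
  }
  return V;
}
```

Unlike a scan, where each lane ends up with a different prefix, the butterfly leaves the same total in every lane after four steps.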
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:358
// 48..63).
Value *const PermX = B.CreateIntrinsic(
Intrinsic::amdgcn_permlanex16, {},
----------------
Should this V_PERMLANEX16 be guarded as well?
Or at least have an assert?
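To make the two options concrete, here is a rough sketch of a guard versus an assert. MockSubtarget, its HasPermLaneX16 flag, and both functions are hypothetical stand-ins; the real subtarget query and builder calls in the patch may look different.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget feature query; this is not the
// real GCNSubtarget interface.
struct MockSubtarget {
  bool HasPermLaneX16;
};

enum class CrossRowStrategy { PermLaneX16, Fallback };

// Option 1 (guard): take the permlanex16 path only when the target
// supports it, otherwise fall back to some other cross-row sequence.
CrossRowStrategy pickCrossRowStrategy(const MockSubtarget &ST) {
  return ST.HasPermLaneX16 ? CrossRowStrategy::PermLaneX16
                           : CrossRowStrategy::Fallback;
}

// Option 2 (assert): document the precondition and fail loudly in
// assert-enabled builds if it does not hold.
CrossRowStrategy requirePermLaneX16(const MockSubtarget &ST) {
  assert(ST.HasPermLaneX16 && "permlanex16 is not available on this target");
  return CrossRowStrategy::PermLaneX16;
}
```

The guard keeps other targets working, while the assert only records that this code path is expected to be reached on targets that have the instruction.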
================
Comment at: llvm/lib/Target/AMDGPU/SIDefines.h:674
enum DppCtrl : unsigned {
QUAD_PERM_FIRST = 0,
----------------
Is it worth silencing linting of this enum?
================
Comment at: llvm/lib/Target/AMDGPU/SIDefines.h:710
ROW_NEWBCAST_LAST = 0x15F,
+ ROW_SHARE0 = 0x150,
ROW_SHARE_FIRST = 0x150,
----------------
ROW_SHARE0 is defined, but not used?
Not that I am against it existing.
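One plausible use for such a base enumerator is forming the control value for lane N as base plus N. The sketch below assumes that usage. ROW_SHARE0 and ROW_SHARE_FIRST are copied from the quoted diff; ROW_SHARE_LAST = 0x15F and the rowShare helper are assumptions made for illustration.

```cpp
#include <cassert>

namespace sketch {
enum DppCtrl : unsigned {
  ROW_SHARE0 = 0x150,      // value from the quoted diff
  ROW_SHARE_FIRST = 0x150, // value from the quoted diff
  ROW_SHARE_LAST = 0x15F,  // assumed: 16 encodings, one per lane in a row
};

// Hypothetical helper: builds the control value that selects lane `Lane`
// of the row as the shared source.
inline unsigned rowShare(unsigned Lane) {
  assert(Lane <= ROW_SHARE_LAST - ROW_SHARE_FIRST && "lane out of row range");
  return ROW_SHARE0 | Lane;
}
} // namespace sketch
```

With a helper like this, ROW_SHARE0 reads as the encoding base and ROW_SHARE_FIRST as the range bound, which would justify keeping both names.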
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98953/new/
https://reviews.llvm.org/D98953