[PATCH] D98953: [AMDGPU] Use reductions instead of scans in the atomic optimizer
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 19 07:52:17 PDT 2021
foad created this revision.
foad added reviewers: rampitec, critson, piotr.
Herald added subscribers: kerbowa, jfb, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
foad requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
If the result of an atomic operation is not used then it can be more
efficient to build a reduction across all lanes instead of a scan. Do
this for GFX10, where the permlanex16 instruction makes it viable. For
wave64 this saves a couple of dpp operations. For wave32 it saves one
readlane (readlanes are generally bad for performance) and one dpp
operation.
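
To illustrate the difference between the two patterns, here is a minimal
host-side sketch in plain C++. It is not the code from this patch and does
not model dpp, readlane or permlanex16. It simulates a 32-lane wave as an
array. All names (Wave, inclusiveScan, reduce, checkTotalsAgree) are
hypothetical and chosen for the example only. A scan leaves a different
prefix sum in each lane, which is only needed when each lane consumes the
atomic's return value. A reduction only has to produce the wave-wide total.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: one array element per lane of a wave32.
constexpr std::size_t WaveSize = 32;
using Wave = std::array<std::uint32_t, WaveSize>;

// Inclusive scan: lane i ends up holding v[0] + ... + v[i].
Wave inclusiveScan(Wave v) {
  for (std::size_t stride = 1; stride < WaveSize; stride *= 2) {
    Wave prev = v; // snapshot so every lane reads pre-step values
    for (std::size_t lane = stride; lane < WaveSize; ++lane)
      v[lane] = prev[lane] + prev[lane - stride];
  }
  return v;
}

// Butterfly reduction: every lane ends up holding the total of all lanes.
Wave reduce(Wave v) {
  for (std::size_t stride = 1; stride < WaveSize; stride *= 2) {
    Wave prev = v; // snapshot so every lane reads pre-step values
    for (std::size_t lane = 0; lane < WaveSize; ++lane)
      v[lane] = prev[lane] + prev[lane ^ stride];
  }
  return v;
}

// The last lane of the scan and any lane of the reduction carry the same
// total, which is all that is needed when the atomic's result is unused.
void checkTotalsAgree(const Wave &v) {
  assert(inclusiveScan(v)[WaveSize - 1] == reduce(v)[0]);
}
```

In this model both loops take the same number of steps. The savings
described above come from the instructions needed to implement each step
on the hardware, which the sketch does not attempt to capture.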
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D98953
Files:
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
llvm/lib/Target/AMDGPU/GCNSubtarget.h
llvm/lib/Target/AMDGPU/SIDefines.h
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D98953.331877.patch
Type: text/x-patch
Size: 13663 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210319/d1729e7c/attachment.bin>