[all-commits] [llvm/llvm-project] 1186e9: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instr...
Shilei Tian via All-commits
all-commits at lists.llvm.org
Fri Nov 4 11:11:18 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 1186e9d59fea662292cdf62fdd1544b5b27d7d37
https://github.com/llvm/llvm-project/commit/1186e9d59fea662292cdf62fdd1544b5b27d7d37
Author: Shilei Tian <i at tianshilei.me>
Date: 2022-11-04 (Fri, 04 Nov 2022)
Changed paths:
M llvm/include/llvm/CodeGen/TargetLowering.h
M llvm/lib/CodeGen/AtomicExpandPass.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.h
A llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll
A llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll
M llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll
Log Message:
-----------
[LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space
The 32-bit floating-point atomic add instructions on AMDGPU do not support the
"flat" (generic) address space. So, if the address space cannot be determined
statically, the AMDGPU backend falls back to a CAS loop, which does support
flat addressing. This patch instead emits runtime address-space checks so that
native FP atomic add instructions can be used for global and LDS memory, and a
non-atomic FP add can be used for private/scratch memory.
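
The dispatch described above can be modelled on the host as a minimal sketch.
It is not the IR the patch emits: `AddrSpace`, `FlatPointer`, `nativeFAdd`, and
`flatAtomicFAdd` are hypothetical names, the address-space tag stands in for the
runtime check, and the order of the checks is an assumption.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical tag for the address space a flat pointer resolves to at runtime.
enum class AddrSpace { Global, Shared, Private };

// Hypothetical model of a flat pointer: the tag stands in for the runtime
// address-space check, and only the slot matching the tag is used.
struct FlatPointer {
  AddrSpace space;
  std::atomic<float> *atomicSlot; // used for Global and Shared (LDS)
  float *privateSlot;             // used for Private (scratch)
};

// Stand-in for a native FP atomic add. On the host it is written as a
// compare-exchange loop so that the sketch builds without C++20.
inline float nativeFAdd(std::atomic<float> &slot, float value) {
  float old = slot.load();
  while (!slot.compare_exchange_weak(old, old + value)) {
    // 'old' is refreshed by compare_exchange_weak on failure; retry.
  }
  return old;
}

// Runtime dispatch: atomic add for shared and global memory, plain add for
// private memory, which is only visible to the owning thread.
inline float flatAtomicFAdd(const FlatPointer &p, float value) {
  if (p.space == AddrSpace::Shared) {
    assert(p.atomicSlot && "shared pointer needs an atomic slot");
    return nativeFAdd(*p.atomicSlot, value);
  }
  if (p.space == AddrSpace::Private) {
    assert(p.privateSlot && "private pointer needs a plain slot");
    float old = *p.privateSlot;
    *p.privateSlot = old + value; // non-atomic read-modify-write
    return old;
  }
  assert(p.atomicSlot && "global pointer needs an atomic slot");
  return nativeFAdd(*p.atomicSlot, value);
}
```

Like `atomicrmw`, `flatAtomicFAdd` returns the value held before the addition.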
To do that, this patch introduces a new interface function,
`emitExpandAtomicRMW`. It is expected to be called when the common atomic
expansions do not work for a specific target, as in the flat-address case
described above.
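
The shape of such a hook can be illustrated with a toy model. The sketch below
does not use the LLVM API: every `Toy*` type, the `ExpansionKind` enum, the
`expandAtomic` driver, and the returned strings are hypothetical stand-ins, and
only the hook name `emitExpandAtomicRMW` comes from this patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical expansion choices a target can request for an atomic RMW.
enum class ExpansionKind { None, CmpXChg, Expand };

// Hypothetical stand-in for an atomicrmw instruction.
struct ToyAtomicRMW {
  std::string op;     // e.g. "fadd"
  unsigned addrSpace; // in this toy, 0 means flat/generic
};

// Hypothetical stand-in for the target lowering interface.
class ToyTargetLowering {
public:
  virtual ~ToyTargetLowering() = default;

  // Which expansion, if any, the target wants for this operation.
  virtual ExpansionKind shouldExpandAtomicRMW(const ToyAtomicRMW &) const {
    return ExpansionKind::None;
  }

  // Target-specific expansion; reached only for ExpansionKind::Expand.
  virtual std::string emitExpandAtomicRMW(const ToyAtomicRMW &) const {
    return "unsupported";
  }
};

// Toy target that specializes fadd on flat pointers.
class ToyFlatFAddTarget : public ToyTargetLowering {
public:
  ExpansionKind shouldExpandAtomicRMW(const ToyAtomicRMW &rmw) const override {
    if (rmw.op == "fadd" && rmw.addrSpace == 0)
      return ExpansionKind::Expand;
    return ExpansionKind::None;
  }

  std::string emitExpandAtomicRMW(const ToyAtomicRMW &rmw) const override {
    assert(rmw.op == "fadd" && "only fadd is specialized in this toy");
    return "runtime-address-space-dispatch";
  }
};

// Toy driver: common expansions are handled here, and anything the target
// marks as Expand is delegated to the target hook.
inline std::string expandAtomic(const ToyTargetLowering &tli,
                                const ToyAtomicRMW &rmw) {
  switch (tli.shouldExpandAtomicRMW(rmw)) {
  case ExpansionKind::None:
    return "unchanged";
  case ExpansionKind::CmpXChg:
    return "cas-loop";
  case ExpansionKind::Expand:
    return tli.emitExpandAtomicRMW(rmw);
  }
  return "unchanged";
}
```

The generic driver stays unaware of target details and only delegates when the
target asks for a custom expansion.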
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129690