[llvm] [AMDGPU][AtomicExpand] Use full flat emulation if a target supports f64 global atomic add instruction (PR #142859)
Shilei Tian via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 4 17:54:58 PDT 2025
================
@@ -17541,9 +17541,11 @@ void SITargetLowering::emitExpandAtomicAddrSpacePredicate(
   // where we only insert a check for private and still use the flat instruction
   // for global and shared.
-  bool FullFlatEmulation = RMW && RMW->getOperation() == AtomicRMWInst::FAdd &&
-                           Subtarget->hasAtomicFaddInsts() &&
-                           RMW->getType()->isFloatTy();
+  bool FullFlatEmulation =
+      RMW && RMW->getOperation() == AtomicRMWInst::FAdd &&
+      ((Subtarget->hasAtomicFaddInsts() && RMW->getType()->isFloatTy()) ||
+       (Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst() &&
+        RMW->getType()->isDoubleTy()));
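
For readers skimming the hunk, the new condition can be read as a small truth table: full flat emulation is chosen for an FAdd when either the f32 case or the f64 case has matching subtarget support. The following is only a minimal standalone sketch of that boolean shape. `Op`, `Ty`, `FakeSubtarget`, and `useFullFlatEmulation` are hypothetical stand-ins invented for illustration; they are not LLVM types or APIs, and the real check lives in the function shown in the hunk.

```cpp
// Hypothetical stand-ins for illustration only; none of these are LLVM types.
enum class Op { FAdd, Add, Xchg };
enum class Ty { Float, Double, Other };

struct FakeSubtarget {
  bool AtomicFaddInsts;                   // models hasAtomicFaddInsts()
  bool FlatBufferGlobalAtomicFaddF64Inst; // models hasFlatBufferGlobalAtomicFaddF64Inst()
};

// Mirrors the shape of the condition in the added lines.
// HasRMW models the null check on the RMW pointer.
bool useFullFlatEmulation(bool HasRMW, Op Operation, Ty Type,
                          const FakeSubtarget &ST) {
  return HasRMW && Operation == Op::FAdd &&
         ((ST.AtomicFaddInsts && Type == Ty::Float) ||
          (ST.FlatBufferGlobalAtomicFaddF64Inst && Type == Ty::Double));
}
```

Under these assumptions, an f64 FAdd on a subtarget that only reports the f32 capability still evaluates to false, which matches the behavior of the removed lines for that case.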
----------------
shiltian wrote:
I thought that was fine. Updated.
https://github.com/llvm/llvm-project/pull/142859