[llvm] [AMDGPU] Re-enable atomic optimization of uniform fadd/fsub with result (PR #97604)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 15 02:39:18 PDT 2024
================
@@ -937,18 +930,25 @@ void AMDGPUAtomicOptimizerImpl::optimizeAtomic(Instruction &I,
break;
case AtomicRMWInst::FAdd:
case AtomicRMWInst::FSub: {
- // FIXME: This path is currently disabled in visitAtomicRMWInst because
- // of problems calculating the first active lane of the result (where
- // Mbcnt is 0):
- // - If V is infinity or NaN we will return NaN instead of BroadcastI.
- // - If BroadcastI is -0.0 and V is positive we will return +0.0 instead
- // of -0.0.
LaneOffset = B.CreateFMul(V, Mbcnt);
break;
}
}
}
- Value *const Result = buildNonAtomicBinOp(B, Op, BroadcastI, LaneOffset);
+ Value *Result = buildNonAtomicBinOp(B, Op, BroadcastI, LaneOffset);
+ if (isAtomicFloatingPointTy) {
----------------
arsenm wrote:
Can you skip this if the function has the no-signed-zeros attribute?
https://github.com/llvm/llvm-project/pull/97604
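
For reference, the two problems named in the removed FIXME can be reproduced with plain IEEE-754 arithmetic. The sketch below is a scalar model only, not the pass's code: `laneResult` and `laneResultFixed` are hypothetical helpers, and the fixup shown (taking BroadcastI directly when Mbcnt is 0) is an assumption, since the quoted hunk is cut off before the body of the new `if`. It also assumes default floating-point semantics (no fast-math). Only the sign-of-zero case is affected by ignoring signed zeros; the infinity/NaN case is independent of it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar model of the per-lane reconstruction for FAdd:
//   result = BroadcastI + V * Mbcnt
// where Mbcnt is the number of active lanes below this one
// (0 for the first active lane).
inline double laneResult(double broadcastI, double v, unsigned mbcnt) {
  double laneOffset = v * static_cast<double>(mbcnt);
  return broadcastI + laneOffset;
}

// Assumed fixup (not taken from the patch): the first active lane
// returns BroadcastI unchanged instead of BroadcastI + V * 0.
inline double laneResultFixed(double broadcastI, double v, unsigned mbcnt) {
  return mbcnt == 0 ? broadcastI : laneResult(broadcastI, v, mbcnt);
}

// Exercises both edge cases from the FIXME on the first active lane.
inline void demoEdgeCases() {
  const double inf = std::numeric_limits<double>::infinity();

  // inf * 0 is NaN, so the unfixed result is NaN instead of BroadcastI.
  assert(std::isnan(laneResult(1.0, inf, 0)));
  assert(laneResultFixed(1.0, inf, 0) == 1.0);

  // -0.0 + (+0.0) is +0.0, so the sign of BroadcastI is lost.
  assert(!std::signbit(laneResult(-0.0, 2.0, 0)));
  assert(std::signbit(laneResultFixed(-0.0, 2.0, 0)));
}
```

Calling `demoEdgeCases()` from a test or `main` checks both cases; lanes with a nonzero Mbcnt go through the same multiply-and-add in either helper.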