[PATCH] D156301: [WIP] Support FP global atomics in AMDGPUAtomicOptimizer.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 26 07:11:42 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:650
+ case AtomicRMWInst::FMax:
+ return APFloat::getSmallest(APFloat::IEEEsingle(), false);
+ case AtomicRMWInst::FMin:
----------------
arsenm wrote:
> This would be +infinity for fmax.
>
> For fadd there isn't really an identity value since fadd -0, 0 -> -0. You probably can't do this without nsz, which we don't have a way of representing.
>
> I have a draft patch for unsafe FP atomic metadata I don't have time to pick up.
For fadd you can use -0 as the identity value. For fsub I think 0 works:
Check instcombine:
define float @fsub_fold(float %x) {
  %add = fsub float %x, 0.0
  ret float %add
}

define float @fadd_fold_n0(float %x) {
  %add = fadd float %x, -0.0
  ret float %add
}
This is of course ignoring signaling NaN quieting and denormal flushes.
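The signed-zero behavior behind those two folds can be checked outside of LLVM. The following is a minimal standalone sketch, not part of the patch: the helper names (bitsOf, addPreserves, subPreserves, holdsForSamples, checkIdentityClaims) are hypothetical, it assumes IEEE-754 single precision with the default round-to-nearest mode, and it deliberately leaves out NaNs and denormals, matching the caveat above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Raw IEEE-754 bit pattern, so that +0.0 and -0.0 compare as different.
static std::uint32_t bitsOf(float V) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// True if X + Identity reproduces X bit-for-bit.
static bool addPreserves(float X, float Identity) {
  // volatile keeps the compiler from folding the operation away.
  volatile float L = X, R = Identity;
  float Res = L + R;
  return bitsOf(Res) == bitsOf(X);
}

// True if X - Identity reproduces X bit-for-bit.
static bool subPreserves(float X, float Identity) {
  volatile float L = X, R = Identity;
  float Res = L - R;
  return bitsOf(Res) == bitsOf(X);
}

// Check a candidate identity against a few samples, including both zeros.
// NaNs and denormals are intentionally not sampled.
static bool holdsForSamples(bool (*Op)(float, float), float Identity) {
  const float Inf = std::numeric_limits<float>::infinity();
  const float Samples[] = {0.0f, -0.0f, 1.5f, -2.25f, Inf, -Inf};
  for (float X : Samples)
    if (!Op(X, Identity))
      return false;
  return true;
}

// Hypothetical driver collecting the claims discussed in the comment.
static void checkIdentityClaims() {
  assert(holdsForSamples(addPreserves, -0.0f)); // x + (-0) == x
  assert(!holdsForSamples(addPreserves, 0.0f)); // (-0) + (+0) == +0
  assert(holdsForSamples(subPreserves, 0.0f));  // x - (+0) == x
  assert(!holdsForSamples(subPreserves, -0.0f)); // (-0) - (-0) == +0
}
```

Calling checkIdentityClaims() from a test driver exercises all four cases. The only sample that separates the candidates is x = -0.0, which is why comparisons are done on bit patterns rather than with ==.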
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156301/new/
https://reviews.llvm.org/D156301