[PATCH] D156301: [WIP] Support FP global atomics in AMDGPUAtomicOptimizer.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 26 07:11:42 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:650
+  case AtomicRMWInst::FMax:
+    return APFloat::getSmallest(APFloat::IEEEsingle(), false);
+  case AtomicRMWInst::FMin:
----------------
arsenm wrote:
> This would be -infinity for fmax.
> 
> For fadd there isn't really an identity value, since fadd -0, 0 -> +0. You probably can't do this without nsz, which we don't have a way of representing.
> 
> I have a draft patch for unsafe FP atomic metadata I don't have time to pick up.
For fadd you can use -0 as the identity value. For fsub I think +0 works:


Check instcombine, which folds both of these to %x:

define float @fsub_fold(float %x) {
  %sub = fsub float %x, 0.0
  ret float %sub
}

define float @fadd_fold_n0(float %x) {
  %add = fadd float %x, -0.0
  ret float %add
}

This is, of course, ignoring signaling NaN quieting and denormal flushing.
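The signed-zero behavior behind the two folds can be checked outside LLVM with a small C++ sketch. It is not part of the patch: the helper names `bits`, `faddKeeps`, and `fsubKeeps` are hypothetical, and it assumes IEEE-754 binary32 `float`, the default round-to-nearest mode, and a build without fast-math or flush-to-zero. Like the comment above, it ignores NaN quieting and denormal flushing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw bit pattern of a float, so that +0.0 and -0.0 can be told apart
// (operator== treats them as equal).
static std::uint32_t bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Hypothetical helper: true if x + candidate reproduces x bit for bit.
// The volatile copies keep the compiler from folding the arithmetic away.
static bool faddKeeps(float x, float candidate) {
  volatile float vx = x;
  volatile float vc = candidate;
  float r = vx + vc;
  return bits(r) == bits(x);
}

// Hypothetical helper: true if x - candidate reproduces x bit for bit.
static bool fsubKeeps(float x, float candidate) {
  volatile float vx = x;
  volatile float vc = candidate;
  float r = vx - vc;
  return bits(r) == bits(x);
}
```

Under those assumptions, `faddKeeps(x, -0.0f)` and `fsubKeeps(x, 0.0f)` hold for `x` equal to `+0.0f`, `-0.0f`, or an ordinary value such as `1.5f`, while `faddKeeps(-0.0f, 0.0f)` does not, because `-0 + +0` rounds to `+0`.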


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156301/new/

https://reviews.llvm.org/D156301


