[PATCH] D157388: [AMDGPU] Support FMin/FMax in AMDGPUAtomicOptimizer.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 8 06:09:41 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:405
+  case AtomicRMWInst::FMax:
+    return B.CreateSelect(B.CreateFCmp(FCmpInst::FCMP_UGT, LHS, RHS), LHS, RHS);
+  case AtomicRMWInst::FMin:
----------------
pravinjagtap wrote:
> arsenm wrote:
> > arsenm wrote:
> > > pravinjagtap wrote:
> > > > @arsenm you earlier suggested to use minnum/maxnum intrinsics for this. This also seems to give correct behavior. I am not sure what I am missing here 
> > > This is incorrect, you should create minnum/maxnum
> > Yes, it is wrong to use fcmp and select here. For example for fmax what you have returns the wrong result if LHS is a nan.
> > 
> >  select (ugt nan, rhs), nan, rhs -> nan
> >  maxnum(nan, rhs) -> rhs
> >you should create minnum/maxnum
> 
> Are you referring to `@llvm.amdgcn.fcmp.f32(float, float, i32)` intrinsic here right ?
No, IRBuilder.CreateMinNum and CreateMaxNum


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157388/new/

https://reviews.llvm.org/D157388



More information about the llvm-commits mailing list