[PATCH] D157388: [AMDGPU] Support FMin/FMax in AMDGPUAtomicOptimizer.
Pravin Jagtap via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 8 05:58:40 PDT 2023
pravinjagtap added a comment.
For the FMin and FMax cases, `clang` itself emits a CAS loop for both
__device__ inline float atomicMax(float* addr, float val) and
__device__ inline float unsafeAtomicMax(float* addr, float val)
I am not sure how to avoid this CAS loop before we reach the atomic optimizer pass.
CC: @b-sumner @arsenm
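For readers unfamiliar with the pattern, here is a minimal host-side sketch of the kind of CAS loop being described. It is not the HIP header implementation or what `clang` actually emits; `atomicMaxCasSketch` is a hypothetical name, and `std::atomic<float>` stands in for the device pointer. The point is only that a float max expressed this way reaches the backend as a compare-exchange loop rather than as a single `atomicrmw fmax`.

```cpp
#include <atomic>

// Hypothetical analogue of a CAS-loop float max (assumption: not the real
// HIP implementation). Returns the value observed before the update.
inline float atomicMaxCasSketch(std::atomic<float> &addr, float val) {
  float old = addr.load(std::memory_order_relaxed);
  // Retry while the stored value is smaller than val and the exchange fails.
  // On failure, compare_exchange_weak reloads `old` with the current value.
  while (old < val &&
         !addr.compare_exchange_weak(old, val, std::memory_order_relaxed)) {
  }
  return old;
}
```

Because the update is a loop around a compare-exchange, a pass that pattern-matches `atomicrmw` instructions would presumably not see an FMax operation here.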
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:405
+ case AtomicRMWInst::FMax:
+ return B.CreateSelect(B.CreateFCmp(FCmpInst::FCMP_UGT, LHS, RHS), LHS, RHS);
+ case AtomicRMWInst::FMin:
----------------
@arsenm you earlier suggested using the minnum/maxnum intrinsics for this. The select on fcmp used here also seems to give correct behavior. I am not sure what I am missing here.
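One place where a `select` on `FCMP_UGT` and `maxnum` can differ is NaN handling. The sketch below models both on the host, assuming `std::fmax` as a stand-in for `llvm.maxnum` (the LangRef describes `maxnum` as matching libm's `fmax`). The helper names `maxViaSelectUGT` and `maxViaMaxNum` are hypothetical and do not appear in the patch.

```cpp
#include <cmath>

// Models select(fcmp ugt LHS, RHS), LHS, RHS.
// ugt is true when LHS > RHS or when either operand is NaN.
inline float maxViaSelectUGT(float lhs, float rhs) {
  bool ugt = std::isunordered(lhs, rhs) || lhs > rhs;
  return ugt ? lhs : rhs;
}

// Models llvm.maxnum using std::fmax (assumption: same quiet-NaN behavior):
// if exactly one operand is NaN, the other operand is returned.
inline float maxViaMaxNum(float lhs, float rhs) {
  return std::fmax(lhs, rhs);
}
```

For ordinary inputs the two agree. With a NaN on the left and a number on the right, the `select` form returns the NaN, while the `fmax`-style form returns the number. Whether that difference matters for the reduction in this pass is a question for the reviewers.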
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D157388/new/
https://reviews.llvm.org/D157388