[PATCH] D156301: [AMDGPU] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer.
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 11 02:26:49 PDT 2023
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:78
Value *buildReduction(IRBuilder<> &B, AtomicRMWInst::BinOp Op, Value *V,
- Value *const Identity) const;
+ Value *const Identity,
+ bool isAtomicFloatingPointTy) const;
----------------
No need to pass in isAtomicFloatingPointTy to all these functions. It is just V->getType()->isFloatingPointTy().
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:317
- const unsigned ValIdx = 0;
-
+ unsigned ValIdx = 0;
const bool ValDivergent = UA->isDivergentUse(I.getOperandUse(ValIdx));
----------------
Don't need to change this
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:662
-static APInt getIdentityValueForAtomicOp(AtomicRMWInst::BinOp Op,
- unsigned BitWidth) {
+static Constant *getIdentityValueForAtomicOp(LLVMContext &C, Type *const Ty,
+ AtomicRMWInst::BinOp Op,
----------------
You can derive C from Ty, and BitWidth from Ty, so the arguments should just be: `AtomicRMWInst::BinOp Op, Type *Ty`
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:843
+ Value *const CtpopFP = B.CreateUIToFP(Ctpop, Ty);
+ NewV = B.CreateFMul(V, CtpopFP);
+ break;
----------------
In general fmul will not give the exact same answer as a sequence of fadds, so you probably need to check some fast math flags before doing this.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156301/new/
https://reviews.llvm.org/D156301
More information about the llvm-commits
mailing list