[PATCH] D156301: [WIP] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 28 11:29:42 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:219
+  // TODO: Support for double type
+  if (!isScanStrategyIterative() || I.getType()->isDoubleTy()) {
+    return;
----------------
I think this is a bad interpretation of the strategy option. Skipping the optimization just because the requested strategy isn't implemented is worse than falling back to a path that is. Also, can't you just implement this with DPP?
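
The fallback behavior suggested above could be sketched roughly as follows. This is a standalone illustration, not code from the patch: `ScanStrategy`, `ValueKind`, `isImplemented`, `Choice`, and `chooseStrategy` are all hypothetical names, and the table of which strategy handles which type is an assumption made only for the example.

```cpp
// Hypothetical stand-ins for illustration; none of these names come from the patch.
enum class ScanStrategy { DPP, Iterative };
enum class ValueKind { Int, Float, Double };

// Assumption for this sketch: integers work with both strategies, float only
// with the iterative one, and double with neither.
constexpr bool isImplemented(ScanStrategy S, ValueKind K) {
  return K == ValueKind::Int ||
         (K == ValueKind::Float && S == ScanStrategy::Iterative);
}

struct Choice {
  bool Optimize;         // false means leave the atomic untouched
  ScanStrategy Strategy; // only meaningful when Optimize is true
};

constexpr Choice chooseStrategy(ScanStrategy Requested, ValueKind K) {
  // Honor the requested strategy when it is implemented for this kind.
  if (isImplemented(Requested, K))
    return {true, Requested};
  // Otherwise fall back to the other strategy instead of doing nothing.
  const ScanStrategy Other = Requested == ScanStrategy::DPP
                                 ? ScanStrategy::Iterative
                                 : ScanStrategy::DPP;
  if (isImplemented(Other, K))
    return {true, Other};
  // Only bail out when no path is implemented at all.
  return {false, Requested};
}
```

With this shape, a DPP request on a float atomic would still be optimized through the iterative path, and the early return is reserved for types that no path supports.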
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:329
+  // TODO: Support for double type
+  if (!isScanStrategyIterative() || I.getType()->isDoubleTy()) {
+    return;
----------------
This doesn't consider half.
It should also handle <2 x half>, but atomicrmw doesn't support vectors right now (you need the intrinsics for those).
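
To illustrate why a check that only rejects double lets half slip through, here is a small standalone sketch. The `FPType` enum and both predicates are hypothetical, and the assumption that float is the only type with an implemented path is made purely for the example.

```cpp
// Hypothetical type tags for illustration; the real pass inspects llvm::Type.
enum class FPType { Half, Float, Double, V2Half };

// Deny-list shape, mirroring a test that only rejects double:
// every other floating-point type is let through.
constexpr bool passesDenyList(FPType T) { return T != FPType::Double; }

// Allow-list shape: only types assumed to have an implemented path pass.
// Assumption for this sketch: float is the only such type.
constexpr bool passesAllowList(FPType T) { return T == FPType::Float; }
```

Under these assumptions the deny-list form accepts `Half` and `V2Half` even though nothing handles them, while the allow-list form rejects them until support is added.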
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:610
+  if (isAtomicFloatingPointTy) {
+    AccumulatorBitCasted = B.CreateBitCast(Accumulator, B.getInt32Ty());
+    OldValuePhiBitCasted = B.CreateBitCast(OldValuePhi, B.getInt32Ty());
----------------
You shouldn't need a cast after D147732
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156301/new/
https://reviews.llvm.org/D156301