[PATCH] D156301: [WIP] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer.

Fri Jul 28 21:13:44 PDT 2023

pravinjagtap added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:219
+    // TODO: Support for double type
+    if (!isScanStrategyIterative() || I.getType()->isDoubleTy()) {
+      return;
----------------
pravinjagtap wrote:
> arsenm wrote:
> > I think this is a bad interpretation of the strategy option. Doing nothing just because you wanted something else is worse than just using an implemented path. Also you can just implement this with dpp?
> > Also you can just implement this with dpp?
> 
> If I understand correctly, the current dpp intrinsic that we need for reduction & scan (`llvm.amdgcn.update.dpp`) can return only `integer` types (it accepts inputs of any type). @foad Is it possible to extend the current dpp implementation to float types as well?
> > Also you can just implement this with dpp?
> 
> If I understand correctly, the current dpp intrinsic that we need for reduction & scan (`llvm.amdgcn.update.dpp`) can return only `integer` types (it accepts inputs of any type).

I was wrong: this intrinsic is lowered to V_MOV_B32_dpp when matched with i32 types. I think we should be able to implement dpp for floats, with some bitcast noise.
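To illustrate the bitcast idea, here is a rough host-side sketch in plain C++ (not LLVM IR, and not the code in this patch). It assumes a 32-bit float, models the cross-lane move with a hypothetical integer-only helper `shiftLanesRight`, and builds a float inclusive scan on top of it. All names here are made up for illustration; the real pass would emit bitcasts around `llvm.amdgcn.update.dpp` instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Reinterpret the bits of a float as i32 (analogous to an IR bitcast float -> i32).
inline std::uint32_t floatToBits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Reinterpret i32 bits as a float (analogous to an IR bitcast i32 -> float).
inline float bitsToFloat(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Hypothetical stand-in for an integer-only cross-lane move: lane I reads
// lane I - D, and lanes with no source lane receive Identity.
template <std::size_t N>
std::array<std::uint32_t, N>
shiftLanesRight(const std::array<std::uint32_t, N> &Lanes, std::size_t D,
                std::uint32_t Identity) {
  assert(D > 0 && "shift distance must be positive");
  std::array<std::uint32_t, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = (I < D) ? Identity : Lanes[I - D];
  return Out;
}

// Float inclusive scan using only integer-typed lane moves plus bitcasts.
// -0.0f is used as the fadd identity, since x + (-0.0f) == x for every x.
template <std::size_t N>
std::array<float, N> inclusiveScanFAdd(std::array<float, N> Vals) {
  const std::uint32_t Identity = floatToBits(-0.0f);
  for (std::size_t D = 1; D < N; D *= 2) {
    std::array<std::uint32_t, N> Bits{};
    for (std::size_t I = 0; I < N; ++I)
      Bits[I] = floatToBits(Vals[I]); // float -> i32 before the move
    const auto Shifted = shiftLanesRight(Bits, D, Identity);
    for (std::size_t I = 0; I < N; ++I)
      Vals[I] += bitsToFloat(Shifted[I]); // i32 -> float after the move
  }
  return Vals;
}
```

The move itself never interprets the payload, so the only cost of the float path in this model is the pair of bitcasts around each lane shift.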

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156301/new/

https://reviews.llvm.org/D156301