[PATCH] D129690: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space

Wed Jul 13 14:06:17 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13014
+  FunctionCallee AddressShared = M->getOrInsertFunction(
+      "llvm.amdgcn.is.shared", Builder.getInt1Ty(), Builder.getInt8PtrTy());
+  Value *IsShared = Builder.CreateCall(AddressShared, {AddrInt8Ptr});
----------------
tianshilei1992 wrote:
> arsenm wrote:
> > This should use Intrinsic::getDeclaration with the enum (or CreateIntrinsic), not refer to the intrinsic by name
> Well, I agree, but that intrinsic is not in LLVM yet. Clang directly lowers the compiler built-in to this. As a result, directly using the name is a workaround.
Yes it is; the intrinsic wouldn't work at all if it weren't already defined in LLVM.
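
For illustration, a minimal sketch of the enum-based form follows. The helper name emitIsShared is hypothetical and not part of the patch; it assumes the Intrinsic::amdgcn_is_shared enumerator from llvm/IR/IntrinsicsAMDGPU.h and the IRBuilder API as it stood around the time of this review.

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsAMDGPU.h"

using namespace llvm;

// Hypothetical helper (not from the patch): emits a call to
// llvm.amdgcn.is.shared through the intrinsic enum rather than by name.
static Value *emitIsShared(IRBuilder<> &Builder, Value *AddrInt8Ptr) {
  // The intrinsic is not overloaded, so the overload type list is empty.
  // CreateIntrinsic looks up (or inserts) the declaration in the module
  // that contains the builder's insertion point, then emits the call.
  return Builder.CreateIntrinsic(Intrinsic::amdgcn_is_shared, {},
                                 {AddrInt8Ptr});
}
```

The two-step equivalent would be Intrinsic::getDeclaration(M, Intrinsic::amdgcn_is_shared) followed by Builder.CreateCall. Either way the declaration picks up the attributes from the intrinsic definition, which a by-name getOrInsertFunction call does not guarantee.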

================
Comment at: llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll:4-8
+ ; CHECK: global_atomic_add_f32
+ ; CHECK: flat_load_dword
+ ; CHECK: v_add_f32_e32
+ ; CHECK: flat_store_dword
+ ; CHECK: ds_add_f32
----------------
tianshilei1992 wrote:
> arsenm wrote:
> > This doesn't demonstrate any of the looping structure
> There is no loop.
I mean the branching structure.
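
To make the point concrete: the CHECK lines name three instruction sequences, which presumably belong to three different branches selected at run time by address space. A toy model in plain C++ (no LLVM involved) is sketched below. The enum and function names are hypothetical, and the mapping of the load/add/store sequence to private memory is an assumption, not something stated in this review.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the address spaces a flat pointer may resolve to.
enum class AddrSpace { Global, Shared, Private };

// Returns the instruction sequence expected on each branch of the expansion.
// Assumption: the non-atomic load/add/store path is the private-memory case.
std::vector<std::string> expandFlatAtomicFAdd(AddrSpace AS) {
  if (AS == AddrSpace::Shared)
    return {"ds_add_f32"};
  if (AS == AddrSpace::Private)
    return {"flat_load_dword", "v_add_f32_e32", "flat_store_dword"};
  // Everything else falls through to the global atomic.
  return {"global_atomic_add_f32"};
}
```

A test that only matches the instruction names in order would also pass if the control flow between these paths were wrong, so the branch instructions and block labels would need CHECK lines of their own.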

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D129690