[PATCH] D129690: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space

Mon Aug 1 13:49:37 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13055-13056
+                                AI->getOrdering(), AI->getSyncScopeID());
+    if (MDNode *Node = AI->getMetadata(LLVMContext::MD_tbaa))
+      OldVal->setMetadata(LLVMContext::MD_tbaa, Node);
+    return OldVal;
----------------
There are other metadata kinds besides TBAA that should be preserved. Maybe there is a helper for copying them?

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13062-13063
+  Builder.SetInsertPoint(BB);
+  if (!PtrTy->isOpaquePointerTy())
+    Int8Ptr = Builder.CreatePointerCast(Addr, Builder.getInt8PtrTy());
+  Builder.CreateBr(CheckSharedBB);
----------------
You should be able to call CreateBitCast unconditionally; it returns the value unchanged when the source and destination types already match.

================
Comment at: llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll:119
+
+attributes #0 = { "amdgpu-unsafe-fp-atomics"="true" }
----------------
This should also be tested with the attribute off, to make sure the atomicrmw is still expanded appropriately. The pass may need some way to revisit the newly emitted atomicrmw.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D129690