[PATCH] D129690: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space

Fri Sep 23 04:43:46 PDT 2022

Petar.Avramovic added a comment.

I am not sure about changes in SIISelLowering.cpp, it looks correct for gfx90a but not for gfx908. Can you rebase on top of D131560 <https://reviews.llvm.org/D131560>?
There are some additions to when rmw fadd atomics are expanded.
If I am reading this correctly, flat f32 fadd that is non-system scope and function has "amdgpu-unsafe-fp-atomics"="true" will use expand from this patch on gfx908 no-rtn fadd and gfx90a?
Remaining two targets (gfx940 and gfx11) that have global fadd f32 also have flat fadd f32 instructions.
Can you also update summary, there are a few targets that have flat/global fadd.
You are changing way of expansion on targets that have global fadd but does not have flat fadd instruction (if atomic is non-system scope and function has "amdgpu-unsafe-fp-atomics"="true" attribute)?
Also there is no check if target has hasLDSFPAtomicAdd before using  AtomicExpansionKind::Expand (targets affected by this change have it but should probably add feature check before expanding)

================
Comment at: llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll:27
+; GFX908-NEXT:    [[TMP5:%.*]] = addrspacecast float* [[ADDR]] to float addrspace(1)*
+; GFX908-NEXT:    [[TMP6:%.*]] = atomicrmw fadd float addrspace(1)* [[TMP5]], float [[VAL]] seq_cst, align 4
+; GFX908-NEXT:    br label [[ATOMICRMW_PHI]]
----------------
There are some changes in D131560, this will have to be expanded for gfx908.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129690/new/

https://reviews.llvm.org/D129690