[all-commits] [llvm/llvm-project] 1186e9: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instr...

Fri Nov 4 11:11:18 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 1186e9d59fea662292cdf62fdd1544b5b27d7d37
      https://github.com/llvm/llvm-project/commit/1186e9d59fea662292cdf62fdd1544b5b27d7d37
  Author: Shilei Tian <i at tianshilei.me>
  Date:   2022-11-04 (Fri, 04 Nov 2022)

  Changed paths:
    M llvm/include/llvm/CodeGen/TargetLowering.h
    M llvm/lib/CodeGen/AtomicExpandPass.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.h
    A llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll
    A llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll
    M llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll

  Log Message:
  -----------
  [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space

The 32-bit floating-point atomic add instructions on AMDGPUs does not support a
"flat" or "generic" address space. So, if the address space cannot be determined
statically, the AMDGPU backend will fall back to a CAS loop (which does support
"flat" addressing). Instead, this patch emits runtime address-space checks to
allow native FP atomic add instructions for global and LDS memory (and non-atomic
FP add instructions for private/scratch memory).

In order to do that, this patch introduces a new interface function
`emitExpandAtomicRMW`. It is expected to be called when a common atomic expand
doesn't work for a specific target, such as the case we discussed here.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129690