[all-commits] [llvm/llvm-project] e03d36: [AMDGPU] Add FeatureFlatAtomicFaddF32Inst

Fri Sep 23 09:01:35 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e03d36d4aec2727fffeecc5c02ced2dc71a7283e
      https://github.com/llvm/llvm-project/commit/e03d36d4aec2727fffeecc5c02ced2dc71a7283e
  Author: Petar Avramovic <Petar.Avramovic at amd.com>
  Date:   2022-09-23 (Fri, 23 Sep 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/FLATInstructions.td
    M llvm/lib/Target/AMDGPU/GCNSubtarget.h
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.fadd.ll

  Log Message:
  -----------
  [AMDGPU] Add FeatureFlatAtomicFaddF32Inst

Add a feature for targets that have the flat_atomic_add_f32 instruction
(gfx940 and gfx11). Remove isGFX940GFX11Plus.
Add a hasFlatAtomicFaddF32Inst subtarget check for codegen.

Differential Revision: https://reviews.llvm.org/D134532

  Commit: 48968c47b0a15f8c21d54043100f3ee6bf4847e5
      https://github.com/llvm/llvm-project/commit/48968c47b0a15f8c21d54043100f3ee6bf4847e5
  Author: Petar Avramovic <Petar.Avramovic at amd.com>
  Date:   2022-09-23 (Fri, 23 Sep 2022)

  Changed paths:
    A llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.f32.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.v2f16.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-rtn.ll
    A llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-rtn.ll
    A llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-rtn.ll
    A llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.f32.ll
    A llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.v2f16.ll
    A llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f32-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f32-rtn.ll
    A llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f64.ll
    A llvm/test/CodeGen/AMDGPU/global-atomic-fadd.v2f16-no-rtn.ll
    A llvm/test/CodeGen/AMDGPU/global-atomic-fadd.v2f16-rtn.ll
    R llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.fadd.rtn_no-rtn.ll
    M llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll

  Log Message:
  -----------
  AMDGPU: Add detailed buffer, global and flat atomic fadd tests

Precommit for D130579, which will remove manual selection and use
patterns from td files. Tests are grouped based on target features.

All patterns have rtn and no-rtn versions.

Buffer atomic patterns are selected based on the intrinsic used
(raw or struct) and the offset operand (imm or vgpr):
  _offset  raw with imm offset
  _offen   raw with vgpr offset (or large imm offset)
  _idxen   struct with imm offset
  _bothen  struct with vgpr offset (or large imm offset)
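
The suffix table above can be read as a two-input decision. The following is
a minimal C++ sketch of that reading, not code from the patch: the names
BufferIntrinsicKind, BufferOffset and bufferAtomicSuffix are hypothetical, and
the cutoff for a "large" immediate (4095 here) is an assumption that is not
stated in the log message.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of the two intrinsic families named in the log message.
enum class BufferIntrinsicKind { Raw, Struct };

// Assumption: immediates above this value count as "large" and must be
// materialized in a VGPR. The log message does not give the actual limit.
constexpr uint32_t kAssumedMaxImmOffset = 4095;

// Hypothetical description of the offset operand.
struct BufferOffset {
  bool IsVGPR;   // true if the offset is already a VGPR value
  uint32_t Imm;  // immediate value, only meaningful when IsVGPR is false
};

// Returns the addressing-mode suffix from the table in the log message.
std::string bufferAtomicSuffix(BufferIntrinsicKind Kind,
                               const BufferOffset &Off) {
  // A VGPR offset and a too-large immediate are treated the same way.
  const bool NeedsVOffset = Off.IsVGPR || Off.Imm > kAssumedMaxImmOffset;
  if (Kind == BufferIntrinsicKind::Raw)
    return NeedsVOffset ? "_offen" : "_offset";
  return NeedsVOffset ? "_bothen" : "_idxen";
}
```

For example, under these assumptions a struct intrinsic with a VGPR offset
maps to _bothen, while a raw intrinsic with a small immediate maps to _offset.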

Global and flat atomics are selected via an intrinsic or atomicrmw fadd.
The atomicrmw tests use amdgpu-unsafe-fp-atomics=true and a non-system scope,
since the instruction gets expanded otherwise. atomicrmw fadd does not support
vector types, so only float and double are tested.
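
The two conditions the atomicrmw tests rely on can be sketched as a pair of
predicates. This is only an illustration of what the paragraph above states;
the actual expansion rules in the backend also depend on the target, and
every name below (SyncScope, FaddType, AtomicRMWFaddQuery, the two functions)
is hypothetical.

```cpp
// Hypothetical model of the synchronization scope of an atomicrmw.
enum class SyncScope { System, Agent, Workgroup, Wavefront, SingleThread };

// Hypothetical model of the operand types discussed in the log message.
enum class FaddType { F32, F64, V2F16 };

struct AtomicRMWFaddQuery {
  bool UnsafeFPAtomics;  // models amdgpu-unsafe-fp-atomics=true
  SyncScope Scope;
};

// Per the log message, atomicrmw fadd is expanded unless unsafe FP atomics
// are enabled and the scope is not system. Target checks are omitted here.
bool avoidsExpansionInTests(const AtomicRMWFaddQuery &Q) {
  return Q.UnsafeFPAtomics && Q.Scope != SyncScope::System;
}

// atomicrmw fadd does not accept vector types, so v2f16 is only reachable
// through the intrinsics.
bool atomicRMWFaddSupportsType(FaddType T) {
  return T != FaddType::V2F16;
}
```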

Global atomic patterns are selected based on address type, via a (global or
flat) intrinsic or atomicrmw fadd with a global address (addrspace(1)*):
  'no suffix'  vgpr addrspace(1)* address
  _saddr       sgpr addrspace(1)* address

Flat atomic patterns are selected via a flat intrinsic or atomicrmw fadd
with a flat address (plain *, i.e. address space 0).
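
As a rough illustration of the global and flat rules above, the choice can be
modeled as a function of the pointer's address space and the register kind
holding the address. The names AddrSpace, AddrRegKind and
atomicFaddPatternFamily, and the returned strings, are hypothetical labels
for this sketch and are not identifiers from the td files.

```cpp
#include <string>

// Address spaces as named in the log message: 0 is flat, 1 is global.
enum class AddrSpace : unsigned { Flat = 0, Global = 1 };

// Register kind that holds the address.
enum class AddrRegKind { VGPR, SGPR };

// Returns a label for the pattern family that would be tried.
// "global" stands for the 'no suffix' form and "global_saddr" for _saddr.
std::string atomicFaddPatternFamily(AddrSpace AS, AddrRegKind Reg) {
  // Flat addresses have a single form in the log message, so the register
  // kind is ignored for them in this sketch.
  if (AS == AddrSpace::Flat)
    return "flat";
  return Reg == AddrRegKind::SGPR ? "global_saddr" : "global";
}
```

Note that the key is the address space of the pointer, not the intrinsic: a
flat intrinsic called with an addrspace(1) pointer falls under the global case.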

Differential Revision: https://reviews.llvm.org/D131561

  Commit: 5cee9047d5ffa6a5d5b9e045b570c32ae4444f53
      https://github.com/llvm/llvm-project/commit/5cee9047d5ffa6a5d5b9e045b570c32ae4444f53
  Author: Petar Avramovic <Petar.Avramovic at amd.com>
  Date:   2022-09-23 (Fri, 23 Sep 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.f32.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/fp-atomics-gfx940.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.fadd-with-ret.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.f32.ll
    M llvm/test/CodeGen/AMDGPU/fp-atomics-gfx940.ll
    M llvm/test/CodeGen/AMDGPU/global-atomics-fp.ll
    M llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll

  Log Message:
  -----------
  AMDGPU: Improve atomicrmw fadd selection

Use the same atomicrmw fadd expansion rules for gfx908, gfx940 and gfx11
as for gfx90a. Add missing globalisel legalizer support for flat
atomicrmw fadd f32 on gfx940 and gfx11.
Isel support for gfx11 will be added in D130579.

Differential Revision: https://reviews.llvm.org/D131560

  Commit: 6db7921b65d961f9561878f6d468f0177d657edb
      https://github.com/llvm/llvm-project/commit/6db7921b65d961f9561878f6d468f0177d657edb
  Author: Petar Avramovic <Petar.Avramovic at amd.com>
  Date:   2022-09-23 (Fri, 23 Sep 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
    M llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/BUFInstructions.td
    M llvm/lib/Target/AMDGPU/FLATInstructions.td
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-no-rtn.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-rtn.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f64.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-no-rtn.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-rtn.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.fadd.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd-with-ret.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd-with-ret.ll
    M llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f32-rtn.ll
    M llvm/test/CodeGen/AMDGPU/global-atomics-fp.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.fadd.gfx90a.ll

  Log Message:
  -----------
  AMDGPU: Use tablegen patterns for buffer global and flat atomic fadd

Remove manual selection for atomic fadd from global-isel.
Stop pre-isel translation to AtomicLoadFAdd/G_ATOMICRMW_FADD
which corresponds to llvm-ir's atomicrmw fadd instruction.

global and flat atomic fadd pattern changes:
Split rtn/no-rtn patterns.
Add missing patterns or fix predicates.
Remove atomicrmw patterns for v2f16 (atomicrmw doesn't support vectors).
Patterns now check the addrspace of the pointer; added patterns for the flat
intrinsic with a global addrspace pointer, which selects into a global atomic
instruction.

buffer atomic fadd pattern changes:
Edit patterns so they can be imported into global-isel.
Remove gfx6/gfx7 _addr64 and _offset patterns.
Remove patterns that can't be reached (same pattern but different feature).

Differential Revision: https://reviews.llvm.org/D130579

Compare: https://github.com/llvm/llvm-project/compare/42ef5720493e...6db7921b65d9