[all-commits] [llvm/llvm-project] 038d88: [AMDGPU] Use flat scratch instructions where avail...

Mon Oct 26 14:41:04 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 038d884a50a4e72d3f65a315d28d35bda024cb4a
      https://github.com/llvm/llvm-project/commit/038d884a50a4e72d3f65a315d28d35bda024cb4a
  Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
  Date:   2020-10-26 (Mon, 26 Oct 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/lib/Target/AMDGPU/BUFInstructions.td
    M llvm/lib/Target/AMDGPU/FLATInstructions.td
    M llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
    M llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.h
    M llvm/lib/Target/AMDGPU/SIInstrInfo.td
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.h
    M llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
    M llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
    M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
    M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
    A llvm/test/CodeGen/AMDGPU/flat-scratch.ll
    M llvm/test/CodeGen/AMDGPU/frame-index-elimination.ll
    M llvm/test/CodeGen/AMDGPU/load-hi16.ll
    M llvm/test/CodeGen/AMDGPU/load-lo16.ll
    M llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
    M llvm/test/CodeGen/AMDGPU/memcpy-fixed-align.ll
    M llvm/test/CodeGen/AMDGPU/multi-dword-vgpr-spill.ll
    M llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-gfx9.mir
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir
    M llvm/test/CodeGen/AMDGPU/scratch-simple.ll
    M llvm/test/CodeGen/AMDGPU/sgpr-spill.mir
    M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
    M llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
    M llvm/test/CodeGen/AMDGPU/store-hi16.ll

  Log Message:
  -----------
  [AMDGPU] Use flat scratch instructions where available

The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.

At the very least missing:

- GlobalISel;
- Some optimizations in frame elimination in between vector
  and scalar ALU;
- It shall finally allow to always materialize frame index
  as an SGPR, but that is not implemented and frame elimination
  cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
  is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
  expression to recover the unswizzled scratch address;

Differential Revision: https://reviews.llvm.org/D89170