[PATCH] D89170: [AMDGPU] Select flat scratch instructions where available

Fri Oct 9 16:20:44 PDT 2020

rampitec created this revision.
rampitec added a reviewer: arsenm.
Herald added subscribers: kerbowa, arphaman, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
Herald added a project: LLVM.
rampitec requested review of this revision.
Herald added a subscriber: wdng.

The support is incomplete and disabled by default. So far
only instruction selection and frame index elimination are
implemented. The patch also changes SP from an unswizzled to
a swizzled offset, as expected by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.
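
To illustrate the SP change, here is a minimal sketch, assuming that an
unswizzled SP counts bytes for the whole wave while a swizzled SP counts
bytes per lane. The helper names below are hypothetical and are not taken
from the patch; they only show why the two kinds of stack access cannot
share one SP value.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM API): factor by which a per-lane frame
// size is scaled when it is added to SP.
// Assumption: with MUBUF the SP is unswizzled (whole-wave bytes), so the
// factor is the wavefront size; with flat scratch the SP is swizzled
// (per-lane bytes), so the factor is 1.
constexpr uint32_t scratchScaleFactor(bool UseFlatScratch,
                                      uint32_t WavefrontSize) {
  return UseFlatScratch ? 1u : WavefrontSize;
}

// Hypothetical helper: amount by which SP advances for one frame.
constexpr uint32_t spIncrement(uint32_t PerLaneFrameBytes,
                               bool UseFlatScratch,
                               uint32_t WavefrontSize) {
  return PerLaneFrameBytes * scratchScaleFactor(UseFlatScratch, WavefrontSize);
}

// A 16-byte per-lane frame on a wave64 target:
static_assert(spIncrement(16, /*UseFlatScratch=*/false, 64) == 1024,
              "unswizzled SP advances by whole-wave bytes");
static_assert(spIncrement(16, /*UseFlatScratch=*/true, 64) == 16,
              "swizzled SP advances by per-lane bytes");
```

Under these assumptions the same frame moves SP by different amounts in
the two modes, so code using one convention would misread an SP set up
by the other.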

At the very least, the following is still missing:

- Spilling using scratch opcodes;
- GlobalISel;
- Some optimizations in frame index elimination between the vector and scalar ALU;
- It should eventually be possible to always materialize a frame index in an SGPR, but that is not implemented and frame index elimination cannot handle it yet;
- Unaligned and/or multi-dword flat scratch accesses should work, but they are currently legalized as for MUBUF;
- Operand folding cannot yet optimize FI as it does with MUBUF.

However, I want to verify the general idea and get it working,
even if not yet optimally, and then run the tests to make sure
it is functionally correct.

https://reviews.llvm.org/D89170

Files:
  llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
  llvm/lib/Target/AMDGPU/FLATInstructions.td
  llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
  llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
  llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
  llvm/lib/Target/AMDGPU/SIRegisterInfo.h
  llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
  llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
  llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
  llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
  llvm/test/CodeGen/AMDGPU/flat-scratch.ll
  llvm/test/CodeGen/AMDGPU/frame-index-elimination.ll
  llvm/test/CodeGen/AMDGPU/load-hi16.ll
  llvm/test/CodeGen/AMDGPU/load-lo16.ll
  llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
  llvm/test/CodeGen/AMDGPU/memcpy-fixed-align.ll
  llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
  llvm/test/CodeGen/AMDGPU/scratch-simple.ll
  llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
  llvm/test/CodeGen/AMDGPU/store-hi16.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D89170.297359.patch
Type: text/x-patch
Size: 250517 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20201009/63e98332/attachment-0001.bin>