[PATCH] D30227: AMDGPU: Change m0 initialization handling to help LDS

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 21 14:43:49 PST 2017


arsenm created this revision.
Herald added subscribers: tpr, dstuttard, tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, qcolombet, MatzeB.

Initialize m0 to the default value for LDS in the entry block, 
and remove the initialization around DS instruction uses.

Treat the LDS value as the default, and insert writes of the default around other uses.
Spills still need to save and restore m0, since we don't know the point where a spill will be inserted (it could land in the middle of a sequence involving inline asm).
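
As a rough illustration of the "LDS value is the default" scheme, here is a toy model over a single straight-line block. It is not code from the patch: `ToyInst`, `insertM0Writes`, and the `write_m0` pseudo-instruction are hypothetical names, and the default value of -1 is an assumption made for this sketch. Instructions that want the default (or don't read m0 at all) are left alone; any other m0 use gets a write of its value before it and a write of the default after it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
struct ToyInst {
  std::string Name;
  // Value the instruction needs in m0, or nullopt if it does not read m0.
  std::optional<int32_t> NeedsM0;
};

// Assumed LDS default for m0 in this sketch.
constexpr int32_t kLdsDefaultM0 = -1;

// Assumes m0 already holds kLdsDefaultM0 on entry to the block (as if it
// had been initialized in the entry block). Returns a copy of Block in
// which every use of a non-default m0 value is bracketed by a write of
// that value and a write that puts the default back.
std::vector<ToyInst> insertM0Writes(const std::vector<ToyInst> &Block) {
  std::vector<ToyInst> Out;
  for (const ToyInst &I : Block) {
    const bool NonDefault = I.NeedsM0 && *I.NeedsM0 != kLdsDefaultM0;
    if (NonDefault)
      Out.push_back({"write_m0 " + std::to_string(*I.NeedsM0), std::nullopt});
    Out.push_back(I);
    if (NonDefault)
      Out.push_back(
          {"write_m0 " + std::to_string(kLdsDefaultM0), std::nullopt});
  }
  // Every original instruction is kept, so the output never shrinks.
  assert(Out.size() >= Block.size());
  return Out;
}
```

In this model a block made up only of DS-style instructions comes out unchanged, which is the case the change is meant to help. The sketch does not attempt to merge back-to-back restore/write pairs or to handle control flow.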

This isn't an ideal solution. For now, it has to add m0 as a physreg live-in to every block right after instruction selection, which is discouraged. Inserting a copy from the initial value to m0 in each block works, but misses many of the cases where we want to eliminate m0 usage. The live-ins are added too aggressively, making more defs appear live than they really are. It would be better to always use save/restore, but the optimizations needed to eliminate redundant ones are missing. Also missing are optimizations to generally hoist the same m0 def into predecessor blocks: MachineLICM handles some cases, but not all loops, and not diamonds or other simple control flow. The worst code quality regressions are around SGPR spills at -O0 when using scalar stores, but I'm not sure how much of a concern that is.


https://reviews.llvm.org/D30227

Files:
  lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
  lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  lib/Target/AMDGPU/AMDGPUISelLowering.h
  lib/Target/AMDGPU/AMDGPUInstrInfo.td
  lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
  lib/Target/AMDGPU/SIFixSGPRCopies.cpp
  lib/Target/AMDGPU/SIISelLowering.cpp
  lib/Target/AMDGPU/SIISelLowering.h
  lib/Target/AMDGPU/SIInstrInfo.cpp
  lib/Target/AMDGPU/SIInstrInfo.h
  lib/Target/AMDGPU/SIInstructions.td
  lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
  lib/Target/AMDGPU/SIMachineFunctionInfo.h
  lib/Target/AMDGPU/SIRegisterInfo.td
  test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
  test/CodeGen/AMDGPU/indirect-addressing-si-noopt.ll
  test/CodeGen/AMDGPU/lds-m0-init-in-loop.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.interp.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.sendmsg.ll
  test/CodeGen/AMDGPU/regcoalesce-dbg.mir
  test/CodeGen/AMDGPU/shl_add_ptr.ll
  test/CodeGen/AMDGPU/shrink-vop3-carry-out.mir
  test/CodeGen/AMDGPU/spill-m0.ll
  test/CodeGen/MIR/AMDGPU/fold-imm-f16-f32.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D30227.89282.patch
Type: text/x-patch
Size: 56838 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170221/f3bee90e/attachment-0001.bin>