[PATCH] D26104: AMDGPU: Use wider scalar spills for SGPR spilling

Fri Oct 28 16:11:37 PDT 2016

arsenm created this revision.
arsenm added a subscriber: llvm-commits.
Herald added subscribers: tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, qcolombet.
Herald added a reviewer: tstellarAMD.

Since a scalar spill is for the whole wave, it doesn't have the
swizzling problems that vector stores do, and a single 4-byte
allocation is enough to spill a 64 element register. This should
reduce the number of spill instructions and put all the spills for
a register in the same cacheline.
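
As a rough illustration of the instruction-count claim (a sketch only,
not code from the patch), assume scalar stores come in 4-, 8-, and
16-byte widths and that the widest one that evenly divides the register
size is used. The helper names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the patch: pick the widest scalar
// store width (16, 8, or 4 bytes) that evenly divides the register size.
inline unsigned pickSpillEltSize(unsigned RegSizeBytes) {
  assert(RegSizeBytes != 0 && RegSizeBytes % 4 == 0 &&
         "SGPR tuples are a whole number of dwords");
  if (RegSizeBytes % 16 == 0)
    return 16;
  if (RegSizeBytes % 8 == 0)
    return 8;
  return 4;
}

// Number of stores needed when using the widest width that fits.
inline unsigned numWideStores(unsigned RegSizeBytes) {
  return RegSizeBytes / pickSpillEltSize(RegSizeBytes);
}

// Number of stores needed when every 32-bit component is stored separately.
inline unsigned numDwordStores(unsigned RegSizeBytes) {
  return RegSizeBytes / 4;
}
```

Under these assumptions a 64-byte (16-dword) SGPR tuple needs 4 stores
instead of 16, while a 12-byte tuple still needs 3 because neither 16
nor 8 divides it.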

This should save allocated private size, but for now it doesn't.
The extra slots are still allocated for each component but never
used, because the frame layout is essentially finalized before frame
indices are replaced. To always use the scalar store path, this
should probably be moved into processFunctionBeforeFrameFinalized.

https://reviews.llvm.org/D26104

Files:
  lib/Target/AMDGPU/SIRegisterInfo.cpp
  test/CodeGen/AMDGPU/si-spill-sgpr-stack.ll
  test/CodeGen/AMDGPU/spill-wide-sgpr.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26104.76269.patch
Type: text/x-patch
Size: 14587 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161028/cd751da0/attachment.bin>