[PATCH] D124195: [AMDGPU] Separate out custom SGPR spills to VGPR during PEI

Mon Apr 25 13:44:03 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/assert-align.ll:12
 ; CHECK-NEXT:    buffer_store_dword v40, off, s[0:3], s32 ; 4-byte Folded Spill
+; CHECK-NEXT:    buffer_store_dword v41, off, s[0:3], s32 offset:4 ; 4-byte Folded Spill
 ; CHECK-NEXT:    s_mov_b64 exec, s[16:17]
----------------
cdevadas wrote:
> foad wrote:
> > Seems like a regression. Does this get fixed by a later patch?
> Yes, it is.
> With SGPRs spilled into virtual VGPR lanes, it won't be directly possible to track the unused lanes of the physical VGPR allocated for the last virtual register created during the `SILowerSGPRSpills` pass.
> I'm going to insert a custom pass in the VGPR regalloc pipeline to map the physReg from virtRegMap. That way, we can reuse the VGPR for any custom SGPR spills during PEI if free lanes are available.
> However, this regression can only be avoided at higher optimization levels. `regallocfast` doesn't provide a way to correctly map a virtual register to a physical register, so we can't avoid this extra VGPR usage when compiling at -O0.
I'm not sure a separate pass using VirtRegMap is the right solution for merging the spill VGPRs of different SGPRs, but either way this is a separate optimization that needs to be re-implemented.
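
For illustration only, the lane-reuse question above can be sketched as a toy model. This is not LLVM code: `LanePacker`, `SpillSlot`, `vgprsShared`, and `vgprsSeparate` are hypothetical names, and the model assumes only that each spilled SGPR takes one lane of a spill VGPR and that a VGPR has one lane per wavefront thread. It shows why two groups of SGPR spills that cannot share lane bookkeeping end up needing an extra VGPR (as with `v41` in the test above), even when the first VGPR still has free lanes.

```cpp
// Hypothetical model, not the AMDGPU backend: one spill VGPR provides
// WaveSize lanes, and each spilled SGPR occupies exactly one lane.
struct SpillSlot {
  unsigned VGPRIndex; // which spill VGPR holds the value
  unsigned Lane;      // lane within that VGPR
};

class LanePacker {
  unsigned WaveSize;
  unsigned NumVGPRs = 0; // spill VGPRs allocated so far
  unsigned NextLane = 0; // next free lane in the most recent VGPR

public:
  explicit LanePacker(unsigned WaveSize) : WaveSize(WaveSize) {}

  // Hand out the next free lane, starting a new VGPR only when the
  // current one is full (or none has been allocated yet).
  SpillSlot allocate() {
    if (NumVGPRs == 0 || NextLane == WaveSize) {
      ++NumVGPRs;
      NextLane = 0;
    }
    return {NumVGPRs - 1, NextLane++};
  }

  unsigned numVGPRs() const { return NumVGPRs; }
};

// Both spill groups share one packer, so the second group can reuse
// free lanes left over by the first.
unsigned vgprsShared(unsigned GroupA, unsigned GroupB, unsigned WaveSize) {
  LanePacker P(WaveSize);
  for (unsigned I = 0; I < GroupA + GroupB; ++I)
    P.allocate();
  return P.numVGPRs();
}

// Each group has its own packer and cannot see the other's free lanes.
unsigned vgprsSeparate(unsigned GroupA, unsigned GroupB, unsigned WaveSize) {
  LanePacker A(WaveSize), B(WaveSize);
  for (unsigned I = 0; I < GroupA; ++I)
    A.allocate();
  for (unsigned I = 0; I < GroupB; ++I)
    B.allocate();
  return A.numVGPRs() + B.numVGPRs();
}
```

In this model, two SGPRs spilled in one phase and three in another fit in a single VGPR when the bookkeeping is shared, but take two VGPRs when it is not. Whether the sharing should be done through VirtRegMap or some other mechanism is exactly the open question in this thread.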

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124195/new/

https://reviews.llvm.org/D124195