[PATCH] D87704: [AMDGPU] Reduce stack pointer alignment

Tue Sep 15 08:48:04 PDT 2020

Flakebi added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h:944-946
+  // the pointer. Set this to something above the minimum for HSA to avoid
+  // needing dynamic realignment in common cases.
+  Align getStackAlignment() const {
----------------
arsenm wrote:
> This isn't a real ABI requirement and I don't think should vary based on the triple. Why do you want to re-reduce this?
As far as I understand it, an alignment of 16 is needed for OpenCL but is otherwise not required (the real memory addresses for individual lanes are only aligned to 4 bytes anyway).
Forcing an alignment of 16 means that spilling one VGPR reserves 1 KiB of scratch memory where only 256 bytes are needed.
If a single SGPR needs to be spilled to a VGPR (which is then in turn spilled to scratch), we need 1 KiB of memory for a 4-byte value, which seems like quite a lot to me.
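
To make the numbers concrete, here is a rough sketch of the arithmetic. It assumes a wave size of 64 lanes and that each lane's spill slot is padded up to the stack alignment. `roundUp` and `scratchPerWave` are made-up names for illustration, not LLVM functions.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): round Size up to a multiple of Align.
constexpr uint64_t roundUp(uint64_t Size, uint64_t Align) {
  return (Size + Align - 1) / Align * Align;
}

// Scratch reserved per wave for one spilled value, assuming every lane's
// slot is padded to StackAlign bytes and the wave has WaveSize lanes.
// The default of 64 lanes is an assumption of this sketch.
constexpr uint64_t scratchPerWave(uint64_t BytesPerLane, uint64_t StackAlign,
                                  uint64_t WaveSize = 64) {
  return roundUp(BytesPerLane, StackAlign) * WaveSize;
}

// One 4-byte VGPR lane value with a stack alignment of 16: 1 KiB per wave.
static_assert(scratchPerWave(4, 16) == 1024, "16-byte alignment case");
// The same spill with a stack alignment of 4: 256 bytes per wave.
static_assert(scratchPerWave(4, 4) == 256, "4-byte alignment case");
```

Under these assumptions the 16-byte alignment costs four times the scratch of the 4-byte alignment for a single spilled register.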

Is there a downside to requiring an alignment of only 4? Are there cases outside OpenCL where a higher alignment is required and the stack would need to be realigned?

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D87704