[llvm] [WIP][AMDGPU] Change `CC_AMDGPU_Func` to only use SGPR0 to SGPR27 for `inreg` argument passing (PR #115753)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 11 10:36:03 PST 2024
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Shilei Tian (shiltian)
<details>
<summary>Changes</summary>
In `emitCSRSpillStores`, a caller-saved SGPR is required to save `exec`, which
limits us to SGPR0 through SGPR29.
Currently, we assume that one is always available; however, this isn’t always
the case, because SGPR0 to SGPR29 are also used for `inreg` argument passing.
This PR attempts to fix the issue by not using all caller-saved SGPRs for
`inreg` argument passing. Capping `inreg` arguments at SGPR27 ensures that at
least two SGPRs (SGPR28 and SGPR29) are always available when CSR spilling is
needed.
Fixes #113782.
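
The register budget described above can be sketched with a small model. This is only an illustration, not LLVM code: the helper `free_caller_saved` and its parameters are hypothetical, and it assumes every `inreg` argument occupies exactly one 32-bit SGPR, assigned consecutively from SGPR0.

```python
# Hypothetical model of the SGPR budget; none of these names exist in LLVM.
CALLER_SAVED_SGPRS = range(0, 30)  # SGPR0-SGPR29, per the description above


def free_caller_saved(inreg_limit, num_inreg_args):
    """Return the caller-saved SGPRs left over after inreg assignment.

    inreg_limit mirrors the upper bound N of !range(0, N) in CC_AMDGPU_Func.
    Assumption: one SGPR per argument, assigned in order from SGPR0;
    arguments beyond the limit are not placed in SGPRs.
    """
    used = set(range(min(num_inreg_args, inreg_limit)))
    return [r for r in CALLER_SAVED_SGPRS if r not in used]


# Worst case: more inreg arguments than available registers.
before = free_caller_saved(inreg_limit=30, num_inreg_args=64)
after = free_caller_saved(inreg_limit=28, num_inreg_args=64)
print(before)  # []
print(after)   # [28, 29]
```

Under these assumptions, the old limit of 30 can leave no caller-saved SGPR for saving `exec`, while the new limit of 28 always leaves SGPR28 and SGPR29 free.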
---
Full diff: https://github.com/llvm/llvm-project/pull/115753.diff
2 Files Affected:
- (modified) llvm/docs/AMDGPUUsage.rst (+2-2)
- (modified) llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td (+1-1)
``````````diff
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 5b83ea428c0bff..b17226ccde712b 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -16545,7 +16545,7 @@ On entry to a function:
:ref:`amdgpu-amdhsa-kernel-prolog-m0`.
4. The EXEC register is set to the lanes active on entry to the function.
5. MODE register: *TBD*
-6. VGPR0-31 and SGPR4-29 are used to pass function input arguments as described
+6. VGPR0-31 and SGPR4-27 are used to pass function input arguments as described
below.
7. SGPR30-31 return address (RA). The code address that the function must
return to when it completes. The value is undefined if the function is *no
@@ -16796,7 +16796,7 @@ The input and result arguments are assigned in order in the following manner:
How are overly aligned structures allocated on the stack?
* SGPR arguments are assigned to consecutive SGPRs starting at SGPR0 up to
- SGPR29.
+ SGPR27.
If there are more arguments than will fit in these registers, the remaining
arguments are allocated on the stack in order on naturally aligned
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td b/llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td
index 80969fce3d77fb..1787fe76a31ea5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td
@@ -127,7 +127,7 @@ def CC_AMDGPU_Func : CallingConv<[
CCIfType<[i8, i16], CCIfExtend<CCPromoteToType<i32>>>,
CCIfInReg<CCIfType<[f32, i32, f16, i16, v2i16, v2f16, bf16, v2bf16] , CCAssignToReg<
- !foreach(i, !range(0, 30), !cast<Register>("SGPR"#i)) // SGPR0-29
+ !foreach(i, !range(0, 28), !cast<Register>("SGPR"#i)) // SGPR0-27
>>>,
CCIfType<[i32, f32, i16, f16, v2i16, v2f16, i1, bf16, v2bf16], CCAssignToReg<
``````````
</details>
https://github.com/llvm/llvm-project/pull/115753