[llvm] [AMDGPU] Fix for 131386 by reducing implicit definitions on register restoration (PR #133986)
Ryan Buchner via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 30 09:29:58 PDT 2025
================
@@ -0,0 +1,43 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx942 -verify-machineinstrs -run-pass prologepilog,machine-cp -o - %s | FileCheck -check-prefix=GFX942 %s
+
+--- |
+ define amdgpu_kernel void @agpr_spill_copy() #0 { ret void }
+
+ attributes #0 = { "amdgpu-num-vgpr"="32" }
----------------
bababuck wrote:
Without a budget, the registers spill from VGPRs into available AGPRs rather than onto the stack. The bug only occurs when register pressure is high enough that some of the registers in a register group spill onto the stack while others spill into AGPRs. Technically, I could remove the budget restriction, but the test would then need extra code to put all of the AGPRs into use, which seems much messier to me.
I switched to `amdgpu-waves-per-eu` in the latest push, but I'm curious why it is preferred. It seems like a more indirect way of getting the behavior I want, which is limiting the number of VGPRs to `32`.
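
To make the trigger condition above concrete, here is a toy Python model of the mixed-spill case. It is only a sketch under simplifying assumptions: `spill_tuple` and `is_mixed` are hypothetical names, each register in the group is treated as one spill slot, and free AGPRs are consumed first. It does not reflect how the AMDGPU backend actually assigns spill locations.

```python
def spill_tuple(num_regs, free_agprs):
    """Toy model: choose a spill location for each register in a group.

    Assumption: AGPRs are used while any remain free, and the rest of the
    group falls back to the stack. This is not the backend's real logic.
    """
    placements = []
    for index in range(num_regs):
        if index < free_agprs:
            placements.append("agpr")
        else:
            placements.append("stack")
    return placements


def is_mixed(placements):
    """True when one register group is split across AGPRs and the stack."""
    return len(set(placements)) > 1


# Enough free AGPRs: the whole group lands in AGPRs, so no mixed spill.
all_agpr = spill_tuple(4, 4)

# Only two free AGPRs: the group is split, which is the case the test targets.
split = spill_tuple(4, 2)

print(all_agpr, is_mixed(all_agpr))
print(split, is_mixed(split))
```

In this model the split only appears when the number of free AGPRs is greater than zero but smaller than the group size, which is why the test needs a register budget (or some other way of exhausting the AGPRs) to reach it.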
https://github.com/llvm/llvm-project/pull/133986