[llvm] [AMDGPU] Enable GCNRewritePartialRegUses pass by default. (PR #72975)

Carl Ritson via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 22 01:00:57 PST 2023


================
@@ -136,13 +136,13 @@ define amdgpu_kernel void @max_256_vgprs_spill_9x32(ptr addrspace(1) %p) #1 {
 ; GFX908-DAG: v_accvgpr_read_b32
 
 ; GFX900: NumVgprs: 256
-; GFX908: NumVgprs: 254
-; GFX900: ScratchSize: 1796
+; GFX908: NumVgprs: 252
+; GFX900: ScratchSize: 132
----------------
perlfu wrote:

I think this is probably fine.
It matches the test without a branch above.

More concerning is why the original spill behaviour is so bad: rename-independent-subregs creates a new vreg_1024 for each disconnected dwordx4 load (from the 32 x float loads).
The register allocator then spills each of these as a full vreg_1024, even though only 4 of its 32 dword lanes are used.
It looks like rename-independent-subregs should be shrinking/narrowing these registers, but that will need a target hook to determine whether narrowing is possible and which register class to use.
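
To put rough numbers on the point above, here is a minimal sketch of the arithmetic. It assumes each spilled 32-bit lane costs 4 bytes of scratch per work-item; the helper names are hypothetical, not LLVM APIs, and it does not try to reproduce the exact ScratchSize values in the test.

```python
# Illustrative arithmetic only. The helper names are hypothetical and the
# 4-bytes-per-dword scratch cost is an assumption, not taken from LLVM.
DWORD_BITS = 32
BYTES_PER_DWORD = 4


def dwords_in(reg_bits):
    """Number of 32-bit lanes in a register tuple of the given width."""
    return reg_bits // DWORD_BITS


def spill_bytes(reg_bits):
    """Scratch bytes needed to spill the whole register tuple."""
    return dwords_in(reg_bits) * BYTES_PER_DWORD


def wasted_spill_bytes(reg_bits, live_dwords):
    """Scratch bytes spent on lanes that hold no live value."""
    return (dwords_in(reg_bits) - live_dwords) * BYTES_PER_DWORD


if __name__ == "__main__":
    print(spill_bytes(1024))            # spilling a full vreg_1024
    print(spill_bytes(128))             # spilling a narrowed 128-bit tuple
    print(wasted_spill_bytes(1024, 4))  # dead lanes in a vreg_1024 with 4 live
```

Under these assumptions, each dwordx4 result spilled as a vreg_1024 takes 128 bytes where 16 would do, which is why narrowing to a 128-bit class looks worthwhile.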

https://github.com/llvm/llvm-project/pull/72975

