[llvm] [AMDGPU] Architected SGPRs for GFX12 (PR #76140)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 4 03:30:17 PST 2024
================
@@ -0,0 +1,34 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx900 -mattr=+architected-sgprs --verify-machineinstrs < %s | FileCheck -check-prefixes=GFX9,GFX9-SDAG %s
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx900 -mattr=+architected-sgprs -global-isel --verify-machineinstrs < %s | FileCheck -check-prefixes=GFX9,GFX9-GISEL %s
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx1200 --verify-machineinstrs < %s | FileCheck -check-prefixes=GFX12,GFX12-SDAG %s
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx1200 -global-isel --verify-machineinstrs < %s | FileCheck -check-prefixes=GFX12,GFX12-GISEL %s
+
+define amdgpu_cs void @test_wave_id(ptr addrspace(1) %out) {
+; GFX9-LABEL: test_wave_id:
+; GFX9: ; %bb.0:
+; GFX9-NEXT: s_bfe_u32 s0, ttmp8, 0x50019
+; GFX9-NEXT: v_mov_b32_e32 v2, s0
+; GFX9-NEXT: global_store_dword v[0:1], v2, off
+; GFX9-NEXT: s_endpgm
+;
+; GFX12-LABEL: test_wave_id:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_bfe_u32 s0, ttmp8, 0x50019
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: v_mov_b32_e32 v2, s0
+; GFX12-NEXT: global_store_b32 v[0:1], v2, off
+; GFX12-NEXT: s_nop 0
+; GFX12-NEXT: s_sendmsg sendmsg(MSG_DEALLOC_VGPRS)
+; GFX12-NEXT: s_endpgm
+ %waveid = call i32 @llvm.amdgcn.wave.id()
+ store i32 %waveid, ptr addrspace(1) %out
+ ret void
+}
+
----------------
arsenm wrote:
OK, I see what you mean. But that means this routes through a lot of complexity just to support function ABI handling. This case is much simpler and can bypass all of it: the intrinsic lowering only needs to copy directly from the TTMP register. The existing inputs are a poor analogy when we have one dedicated, constant register.
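
For readers unfamiliar with the check lines above: `s_bfe_u32 s0, ttmp8, 0x50019` is an unsigned bitfield extract whose immediate packs the field offset in bits [4:0] and the field width in bits [22:16], so `0x50019` selects 5 bits starting at bit 25 of `ttmp8`. A minimal C++ sketch of that arithmetic follows. The helper names `bfe_u32` and `wave_id_from_ttmp8` are hypothetical and exist only for illustration; this is not the lowering code in the PR.

```cpp
#include <cstdint>

// Hypothetical model of S_BFE_U32: src1[4:0] is the field offset,
// src1[22:16] is the field width. Not taken from the LLVM sources.
constexpr uint32_t bfe_u32(uint32_t src0, uint32_t src1) {
  const uint32_t offset = src1 & 0x1fu;
  const uint32_t width = (src1 >> 16) & 0x7fu;
  if (width == 0)
    return 0;
  if (width >= 32)
    return src0 >> offset;  // avoid an undefined 32-bit shift when building the mask
  return (src0 >> offset) & ((1u << width) - 1u);
}

// Mirrors the test's `s_bfe_u32 s0, ttmp8, 0x50019`:
// width 5 (0x5 << 16), offset 25 (0x19).
constexpr uint32_t wave_id_from_ttmp8(uint32_t ttmp8) {
  return bfe_u32(ttmp8, 0x50019u);
}

// Compile-time sanity checks on the decoding.
static_assert(wave_id_from_ttmp8(0x3E000000u) == 31u, "bits 29:25 all set");
static_assert(wave_id_from_ttmp8(0x01FFFFFFu) == 0u, "bits below 25 are ignored");
```

Since the source is one fixed register and the field position is constant, the whole operation reduces to a single shift-and-mask, which is the point of the comment above about bypassing the ABI input machinery.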
https://github.com/llvm/llvm-project/pull/76140