[llvm] [AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index (PR #125086)
Jon Chesterfield via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 30 11:40:31 PST 2025
================
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s
+
+define amdgpu_kernel void @copy_to_reg_frameindex(ptr addrspace(1) %out, i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: copy_to_reg_frameindex:
+; CHECK: ; %bb.0: ; %entry
+; CHECK-NEXT: ; implicit-def: $vgpr0
+; CHECK-NEXT: .LBB0_1: ; %loop
+; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: s_cmp_lt_u32 0, 16
+; CHECK-NEXT: s_set_gpr_idx_on 0, gpr_idx(DST)
+; CHECK-NEXT: v_mov_b32_e32 v0, 0
+; CHECK-NEXT: s_set_gpr_idx_off
+; CHECK-NEXT: s_cbranch_scc1 .LBB0_1
+; CHECK-NEXT: ; %bb.2: ; %done
+; CHECK-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x24
+; CHECK-NEXT: v_mov_b32_e32 v1, 0
+; CHECK-NEXT: s_waitcnt lgkmcnt(0)
+; CHECK-NEXT: global_store_dword v1, v0, s[0:1]
+; CHECK-NEXT: s_endpgm
+entry:
+ %B = srem i32 %c, -1
+ %alloca = alloca [16 x i32], align 4, addrspace(5)
----------------
JonChesterfield wrote:
Yeah, that seems to be the case. Promote alloca gives a <16 x i32>, which then turns into an unrolled loop and makes a bit of a mess of the test case. Agreed on revising it to use the vector input.
https://github.com/llvm/llvm-project/pull/125086