[llvm] [AMDGPU] Add backward compatibility layer for kernarg preloading (PR #119167)

Janek van Oirschot via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 12 03:59:54 PST 2024


================
@@ -1,23 +1,52 @@
-; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -amdgpu-kernarg-preload-count=1 -asm-verbose=0 < %s | FileCheck -check-prefixes=GCN,HSA,ASM %s
-; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -amdgpu-kernarg-preload-count=1 -filetype=obj < %s | llvm-objdump --arch=amdgcn --mcpu=gfx940 --disassemble - | FileCheck -check-prefixes=GCN,HSA,OBJ %s
-; RUN: llc -mtriple=amdgcn -mcpu=gfx940 -amdgpu-kernarg-preload-count=1 -filetype=obj < %s | llvm-objdump --arch=amdgcn --mcpu=gfx940 --disassemble - | FileCheck -check-prefixes=GCN,NON-HSA,OBJ %s
-; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -amdgpu-kernarg-preload-count=1 -asm-verbose=0 < %s | llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx940 -filetype=obj | llvm-objdump --arch=amdgcn --mcpu=gfx940 --disassemble - | FileCheck -check-prefixes=GCN,HSA,OBJ %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -asm-verbose=0 < %s | FileCheck -check-prefixes=ASM %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -filetype=obj < %s | llvm-objdump --arch=amdgcn --mcpu=gfx942 --disassemble - | FileCheck -check-prefixes=OBJ %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -amdgpu-kernarg-preload-count=1 -asm-verbose=0 < %s | llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx942 -filetype=obj | llvm-objdump --arch=amdgcn --mcpu=gfx942 --disassemble - | FileCheck -check-prefixes=OBJ %s
 
-; GCN: preload_kernarg_header
-; HSA: s_trap 2
-; NON-HSA: s_endpgm
-; ASM: .fill 63, 4, 0xbf800000 ; s_nop 0
-; OBJ-COUNT-63: s_nop 0
-define amdgpu_kernel void @preload_kernarg_header(ptr inreg %arg) {
+; OBJ: preload_ptr_kernarg_header
+; OBJ-COUNT-60: s_nop 0
+define amdgpu_kernel void @preload_ptr_kernarg_header(ptr inreg %arg) {
+; ASM-LABEL: preload_ptr_kernarg_header:
+; ASM:         s_load_dwordx2 s[8:9], s[4:5], 0x0
+; ASM-NEXT:    s_waitcnt lgkmcnt(0)
+; ASM-NEXT:    s_branch .LBB0_0
+; ASM-NEXT:    .p2align 8
----------------
JanekvO wrote:

Does the alignment padding give the intended behavior? I thought the padding was supposed to be a constant 256 bytes, rather than however many bytes are needed to reach the next alignment boundary.
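
To make the two readings concrete, here is a minimal arithmetic sketch of the difference between padding up to a 256-byte boundary (`.p2align 8`) and emitting a fixed 256 bytes of padding. It is not taken from the patch. The instruction sizes (8 bytes for `s_load_dwordx2`, 4 bytes each for `s_waitcnt`, `s_branch`, and `s_nop 0`) and the premise that the kernel entry is itself 256-byte aligned are assumptions, and `pad_to_alignment` is a hypothetical helper name.

```python
# Sketch: compare ".p2align 8" padding with a fixed 256-byte pad.
# All instruction sizes here are assumptions, not values from the patch.

NOP_SIZE = 4  # assumed: "s_nop 0" encodes in 4 bytes


def pad_to_alignment(offset: int, log2_align: int) -> int:
    """Bytes needed to advance `offset` to the next 2**log2_align boundary.

    `offset` is measured from the kernel entry, which is assumed to be
    aligned to at least 2**log2_align bytes.
    """
    align = 1 << log2_align
    return (-offset) % align


# Assumed header code: s_load_dwordx2 (8) + s_waitcnt (4) + s_branch (4).
header_code_bytes = 8 + 4 + 4

# Reading 1: ".p2align 8" pads only up to the next 256-byte boundary.
align_pad = pad_to_alignment(header_code_bytes, 8)
print(align_pad, align_pad // NOP_SIZE)   # 240 60
print(header_code_bytes + align_pad)      # 256

# Reading 2: a constant 256 bytes of padding after the header code.
print(header_code_bytes + 256)            # 272
```

Under these assumptions the alignment-based padding comes to 60 `s_nop 0` instructions, which agrees with the `OBJ-COUNT-60` check in the hunk, and the kernel body starts exactly 256 bytes after the entry. A constant 256-byte pad would instead put the body at offset 272, so the two approaches only coincide when no code precedes the padding.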

https://github.com/llvm/llvm-project/pull/119167

