[llvm] [AMDGPU] ISel & PEI for whole wave functions (PR #145858)
Carl Ritson via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 9 02:00:34 PDT 2025
================
@@ -0,0 +1,448 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -run-pass=prologepilog -o - %s | FileCheck %s
+
+---
+name: save_inactive_lanes_non_csr_vgpr
+alignment: 1
+tracksRegLiveness: true
+noPhis: true
+isSSA: false
+noVRegs: true
+hasFakeUses: false
+tracksDebugUserValues: true
+frameInfo:
+ maxAlignment: 1
+ isCalleeSavedInfoValid: true
+machineFunctionInfo:
+ maxKernArgAlign: 1
+ frameOffsetReg: '$sgpr33'
+ stackPtrOffsetReg: '$sgpr32'
+ returnsVoid: false
+ occupancy: 16
+ sgprForEXECCopy: '$sgpr105'
+ isWholeWaveFunction: true
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: save_inactive_lanes_non_csr_vgpr
+ ; CHECK: liveins: $vgpr0
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $sgpr0 = S_XOR_SAVEEXEC_B32 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
+ ; CHECK-NEXT: SCRATCH_STORE_DWORD_SADDR $vgpr0, $sgpr32, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into %stack.0, addrspace 5)
+ ; CHECK-NEXT: $exec_lo = S_MOV_B32 -1
+ ; CHECK-NEXT: $vgpr0 = V_MOV_B32_e32 14, implicit $exec
+ ; CHECK-NEXT: $exec_lo = S_XOR_B32 $sgpr0, -1, implicit-def $scc
+ ; CHECK-NEXT: $vgpr0 = SCRATCH_LOAD_DWORD_SADDR $sgpr32, 0, 0, implicit $exec, implicit $flat_scr, implicit $vgpr0(tied-def 0) :: (load (s32) from %stack.0, addrspace 5)
+ ; CHECK-NEXT: $exec_lo = S_MOV_B32 $sgpr0
----------------
perlfu wrote:
It seems the CSR epilog has already restored exec to its entry value here.
Presumably this means `SI_WHOLE_WAVE_FUNC_RETURN` will insert a second, redundant restore?
However, I don't see a duplicate restore in the ISA of the other test below.
Is it just being cleaned up by a later pass?
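
To make the observation concrete, here is a minimal sketch that traces the wave32 exec mask through the CHECK lines quoted above. It assumes the usual ISA semantics (`S_XOR_SAVEEXEC_B32` copies exec to the destination, then sets exec to the source XOR the old exec). The function name `trace_exec` and the sample entry mask are hypothetical and only serve the illustration; this is not code from the PR.

```python
MASK32 = 0xFFFFFFFF  # wave32: exec_lo is a 32-bit lane mask


def trace_exec(entry_exec):
    """Return (instruction, exec value after it) for the quoted sequence."""
    trace = []
    exec_lo = entry_exec & MASK32

    # $sgpr0 = S_XOR_SAVEEXEC_B32 -1
    # Assumed semantics: save old exec, then exec = -1 ^ old exec,
    # which enables only the lanes that were inactive on entry.
    sgpr0 = exec_lo
    exec_lo = (MASK32 ^ exec_lo) & MASK32
    trace.append(("S_XOR_SAVEEXEC_B32 -1", exec_lo))

    # $exec_lo = S_MOV_B32 -1: the function body runs with all lanes on.
    exec_lo = MASK32
    trace.append(("S_MOV_B32 -1", exec_lo))

    # $exec_lo = S_XOR_B32 $sgpr0, -1: back to the inactive lanes only,
    # so the scratch load restores $vgpr0 just for those lanes.
    exec_lo = (sgpr0 ^ MASK32) & MASK32
    trace.append(("S_XOR_B32 $sgpr0, -1", exec_lo))

    # $exec_lo = S_MOV_B32 $sgpr0: exec is the entry value again.
    exec_lo = sgpr0
    trace.append(("S_MOV_B32 $sgpr0", exec_lo))
    return trace


if __name__ == "__main__":
    for name, value in trace_exec(0x0000FFFF):
        print(f"{name:<24} exec_lo = {value:#010x}")
```

Under these assumptions exec already equals its entry value after the final `S_MOV_B32 $sgpr0`, so any further restore emitted for the return would write the same value again.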
https://github.com/llvm/llvm-project/pull/145858