[llvm-branch-commits] [llvm] [AMDGPU] Add machine-level inliner pass (PR #169476)

Fri Mar 6 04:15:59 PST 2026

================
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple=amdgcn--amdpal -mcpu=gfx1200 -amdgpu-enable-machine-level-inliner=0 < %s | FileCheck %s --check-prefixes=CHECK-ASM,NOINLINE-ASM
+; RUN: llc -mtriple=amdgcn--amdpal -mcpu=gfx1200 -amdgpu-enable-machine-level-inliner=1 < %s | FileCheck %s --check-prefixes=CHECK-ASM,INLINE-ASM
+; RUN: llc -mtriple=amdgcn--amdpal -mcpu=gfx1200 -amdgpu-enable-machine-level-inliner=0 -stop-after=amdgpu-machine-level-inliner < %s | FileCheck %s --check-prefixes=CHECK-MIR,NOINLINE-MIR
+; RUN: llc -mtriple=amdgcn--amdpal -mcpu=gfx1200 -amdgpu-enable-machine-level-inliner=1 -stop-after=amdgpu-machine-level-inliner < %s | FileCheck %s --check-prefixes=CHECK-MIR,INLINE-MIR
+
+define amdgpu_cs void @caller(i32 %input, ptr addrspace(1) %output) {
+  %result = call i32 (ptr, ...) @llvm.amdgcn.call.whole.wave(ptr @inlined_callee, i32 %input)
----------------
rovka wrote:

That would kind of defeat the whole purpose of having all the WWM code in one place. We're trying to avoid the WWM intrinsics because they're unclear about where the WWM region starts. With whole wave functions, all the WWM code is inside the function, so the region boundaries are explicit.

https://github.com/llvm/llvm-project/pull/169476