[llvm] [AMDGPU] Add backward compatibility layer for kernarg preloading (PR #119167)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 12 19:57:56 PST 2024
================
@@ -0,0 +1,217 @@
+//===- AMDGPUPreloadKernArgProlog.cpp - Preload KernArg Prolog ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file This pass creates a backward compatibility layer for kernel argument
+/// preloading in situations where code is compiled with kernel argument
+/// preloading enabled but executed on hardware without firmware support for it.
+///
+/// To avoid recompilation, the pass inserts a block at the beginning of the
+/// program that loads the kernel arguments into SGPRs using s_load
+/// instructions. This sets up the registers exactly as they would be on systems
+/// with compatible firmware.
+///
+/// This effectively creates two entry points for the kernel. Firmware that
+/// supports the feature will automatically jump past the first 256 bytes of the
+/// program, skipping the compatibility layer and directly starting execution on
+/// the optimized code path.
+///
+/// This pass should be run as late as possible to prevent any optimizations
+/// that might assume the padding is dead code or that the added prologue is a
+/// true predecessor of the kernel entry block.
+//
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPUPreloadKernArgProlog.h"
+#include "AMDGPU.h"
+#include "GCNSubtarget.h"
+#include "SIMachineFunctionInfo.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/TargetParser/TargetParser.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "amdgpu-preload-kern-arg-prolog"
+
+namespace {
+
+// Used to build s_loads mapping user SGPRs to kernel arguments
+struct LoadConfig {
+ unsigned Size;
+ const TargetRegisterClass *RegClass;
+ unsigned Opcode;
+ Register LoadReg;
+
+ // Constructor for the static config array
+ constexpr LoadConfig(unsigned S, const TargetRegisterClass *RC, unsigned Op)
+ : Size(S), RegClass(RC), Opcode(Op), LoadReg(AMDGPU::NoRegister) {}
+
+ // Constructor for the return value
+ constexpr LoadConfig(unsigned S, const TargetRegisterClass *RC, unsigned Op,
+ Register Reg)
+ : Size(S), RegClass(RC), Opcode(Op), LoadReg(Reg) {}
----------------
arsenm wrote:
These both amount to the default constructor
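
One reading of this comment is that both hand-written constructors can be dropped in favor of default member initializers plus aggregate initialization. The sketch below illustrates that idea only; it is not the change that landed in the PR. `TargetRegisterClass` and `Register` here are hypothetical stand-ins so the snippet compiles without LLVM headers, and `DummyRC`, `Table`, `withReg`, and the opcode value 100 are made up for illustration. It assumes a default-constructed register is the "no register" value, as the original `AMDGPU::NoRegister` initializer suggests.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM types (assumptions, not the real API).
struct TargetRegisterClass {};

struct Register {
  unsigned Id = 0; // 0 plays the role of "no register" in this sketch
  constexpr Register() = default;
  constexpr Register(unsigned Id) : Id(Id) {}
  constexpr bool isValid() const { return Id != 0; }
};

// No user-declared constructors: the struct stays an aggregate, and every
// field has a default, so omitted trailing fields fall back to it.
struct LoadConfig {
  unsigned Size = 0;
  const TargetRegisterClass *RegClass = nullptr;
  unsigned Opcode = 0;
  Register LoadReg = Register();
};

// Made-up register class and opcode, used only to populate the table.
constexpr TargetRegisterClass DummyRC{};

// Static config array: three fields given, LoadReg left at its default.
constexpr LoadConfig Table[] = {{4, &DummyRC, 100}};

// Return-value use: copy a table entry and fill in the chosen register.
inline LoadConfig withReg(LoadConfig Config, Register Reg) {
  Config.LoadReg = Reg;
  return Config;
}

static_assert(!Table[0].LoadReg.isValid(),
              "table entries start without a register");
```

With this shape, the table entries and the returned value use the same type without any explicit constructor, for example `withReg(Table[0], Register(5))`.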
https://github.com/llvm/llvm-project/pull/119167