[llvm] [AMDGPU] Add backward compatibility layer for kernarg preloading (PR #119167)

Janek van Oirschot via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 12 03:59:54 PST 2024


================
@@ -0,0 +1,229 @@
+//===- AMDGPUPreloadKernargHeader.cpp - Preload Kernarg Header ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file This pass handles the creation of the backwards compatibility layer
+/// for kernarg preloading. Code may be compiled with the feature enabled, while
+/// the kernel is executed on hardware without firmware support.
+///
+/// To avoid the need for recompilation, we insert a block at the beginning of
+/// the kernel that is responsible for loading the kernel arguments into SGPRs
+/// using s_load instructions which set up the registers exactly as they would be
+/// by firmware if the code were executed on a system that supported kernarg
+/// preloading.
+///
+/// This essentially allows for two entry points for the kernel. Firmware that
+/// supports the feature will automatically jump past the first 256 bytes of the
+/// program, skipping the backwards compatibility layer and directly beginning
+/// execution on the fast code path.
+///
+/// This pass should be run as late as possible, to avoid any optimization that
+/// may assume that padding is dead code or that the prologue added here is a
+/// true predecessor of the kernel entry block.
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPUPreloadKernargHeader.h"
+#include "AMDGPU.h"
+#include "GCNSubtarget.h"
+#include "SIMachineFunctionInfo.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/TargetParser/TargetParser.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "amdgpu-preload-kernarg-header"
+
+namespace {
+
+struct LoadConfig {
+  unsigned Size;
+  const TargetRegisterClass *RegClass;
+  unsigned Opcode;
+  Register LoadReg;
+
+  // Constructor for the static config array
----------------
JanekvO wrote:

I would have expected the default constructor (plus an explicitly initialized `LoadReg`) to play nicely with the way you currently initialize the static array. You might just have to be more deliberate about calling the copy constructor at times: e.g., instead of `return LoadConfig(Config.Size, Config.RegClass, Config.Opcode, LoadReg);`, something along the lines of `LoadConfig RetConfig(Config); RetConfig.LoadReg = LoadReg; return RetConfig;`.

But as I said, there may just be some C++ reason I'm missing that prevents this. I'm at peace with it if the default constructors aren't sufficient in this context.
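
To make the suggestion concrete, here is a rough sketch of the shape I have in mind. It is not taken from the patch: `RegClassStub` and `RegisterStub` are hypothetical stand-ins for `TargetRegisterClass` and `Register`, the opcode numbers are made up, and `withLoadReg` is a name invented for illustration. It assumes C++14 or later, where a struct with a default member initializer is still an aggregate.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM types; not the real classes.
struct RegClassStub { int NumDwords; };
using RegisterStub = unsigned;

struct LoadConfig {
  unsigned Size;
  const RegClassStub *RegClass;
  unsigned Opcode;
  RegisterStub LoadReg = 0; // explicitly initialized, no user-written ctor
};

// Placeholder register classes for the sketch.
static const RegClassStub SGPR128{4}, SGPR64{2}, SGPR32{1};

// Aggregate initialization still works for the static array; LoadReg
// falls back to its default member initializer.
static const LoadConfig Configs[] = {
    {4, &SGPR128, /*Opcode=*/100},
    {2, &SGPR64, /*Opcode=*/101},
    {1, &SGPR32, /*Opcode=*/102},
};

// Copy the table entry with the implicit copy constructor, then set the
// one field that differs.
LoadConfig withLoadReg(const LoadConfig &Config, RegisterStub LoadReg) {
  LoadConfig RetConfig(Config);
  RetConfig.LoadReg = LoadReg;
  return RetConfig;
}
```

With this shape the table entries stay untouched and only the returned copy carries the chosen register, at the cost of one extra statement per call site compared to the four-argument constructor.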

https://github.com/llvm/llvm-project/pull/119167

