[llvm] [CodeGen] Refactor and document ThunkInserter (PR #97468)
Kristof Beyls via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 3 05:44:06 PDT 2024
================
@@ -7,34 +7,104 @@
//===----------------------------------------------------------------------===//
///
/// \file
-/// Contains a base class for Passes that inject an MI thunk.
+/// Contains a base ThunkInserter class that simplifies injection of MI thunks
+/// as well as a default implementation of MachineFunctionPass wrapping
+/// several `ThunkInserter`s for targets to extend.
///
//===----------------------------------------------------------------------===//
#ifndef LLVM_CODEGEN_INDIRECTTHUNKS_H
#define LLVM_CODEGEN_INDIRECTTHUNKS_H
#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
namespace llvm {
+/// This class assists in inserting MI thunk functions into the module and
+/// rewriting the existing machine functions to call these thunks.
+///
+/// One of the common cases is implementing security mitigations that involve
+/// replacing some machine code patterns with calls to special thunk functions.
+///
+/// Inserting a module pass late in the codegen pipeline may increase memory
+/// usage, as it serializes the transformations and forces preceding passes to
+/// produce machine code for all functions before running the module pass.
+/// For that reason, ThunkInserter can be driven by a MachineFunctionPass by
+/// passing one MachineFunction at a time to its `run(MMI, MF)` method.
+/// Then, the derived class should
+/// * call createThunkFunction from its insertThunks method exactly once for
+/// each of the thunk functions to be inserted
+/// * populate the thunk in its populateThunk method
+///
+/// Note that if some other pass is responsible for rewriting the functions,
+/// insertThunks method can simply create all possible thunks at once, probably
+/// postponed until the first occurrence of possibly affected MF.
+///
+/// Alternatively, insertThunks method can rewrite MF by itself and only insert
+/// the thunks being called. In that case InsertedThunks variable can be used
+/// to track which thunks were already inserted.
+///
+/// In any case, the thunk function has to be inserted on behalf of some other
+/// function and then populated on its own "iteration" later - this is because
+/// MachineFunctionPass will see the newly created functions, but they first
+/// have to go through the preceding passes from the same pass manager,
+/// possibly even through the instruction selector.
+//
+// FIXME Maybe implement a documented and less surprising way of modifying
+// the module from a MachineFunctionPass that is restricted to inserting
+// completely new functions to the module.
template <typename Derived, typename InsertedThunksTy = bool>
class ThunkInserter {
Derived &getDerived() { return *static_cast<Derived *>(this); }
-protected:
// A variable used to track whether (and possible which) thunks have been
// inserted so far. InsertedThunksTy is usually a bool, but can be other types
// to represent more than one type of thunk. Requires an |= operator to
// accumulate results.
InsertedThunksTy InsertedThunks;
- void doInitialization(Module &M) {}
+
+protected:
+ // Interface for subclasses to use.
+
+ /// Create an empty thunk function.
+ ///
+ /// The new function will eventually be passed to populateThunk. If multiple
+ /// thunks are created, populateThunk can distinguish them by their names.
void createThunkFunction(MachineModuleInfo &MMI, StringRef Name,
bool Comdat = true, StringRef TargetAttrs = "");
+protected:
+ // Interface for subclasses to implement.
+ //
+ // Note: all functions are non-virtual and are called via getDerived().
+ // Note: only doInitialization() has an implementation.
+
+ /// Initializes thunk inserter.
+ void doInitialization(Module &M) {}
+
+ /// Returns common prefix for thunk function's names.
+ const char *getThunkPrefix(); // undefined
+
+ /// Checks if MF may use thunks (true - maybe, false - definitely not).
+ bool mayUseThunk(const MachineFunction &MF); // undefined
+
+ /// Rewrites the function if necessary, returns the set of thunks added.
+ InsertedThunksTy insertThunks(MachineModuleInfo &MMI, MachineFunction &MF,
+ InsertedThunksTy ExistingThunks); // undefined
+
+ /// Populate the thunk function with instructions.
+ ///
+ /// If multiple thunks are created, inspect the thunk's name.
----------------
kbeyls wrote:
I find this comment a bit cryptic. I guess the intent is to say something like:
"The instructions to insert into the thunk function body should be derived from the name of `MF`."?
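
To make the name-based dispatch concrete, here is a minimal standalone sketch of the pattern as I understand it. It does not use LLVM at all: `ToyFunction`, `ToyThunkInserter`, `ToyInserter`, `runToyPipeline`, the `__toy_thunk_` prefix and the instruction strings are all hypothetical stand-ins, and the signatures are simplified compared to the ones in this patch. It only shows a thunk being created on behalf of another function and then populated on its own later iteration, based on its name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for MachineFunction: just a name and a body.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Body;
};

// CRTP base loosely modeled on the pattern in the diff (not LLVM's code).
template <typename Derived> class ToyThunkInserter {
  Derived &getDerived() { return *static_cast<Derived *>(this); }
  bool InsertedThunks = false;

protected:
  // Creates an empty thunk; it gets populated on a later run() call.
  void createThunkFunction(std::vector<ToyFunction> &Module,
                           const std::string &Name) {
    assert(Name.rfind(getDerived().getThunkPrefix(), 0) == 0 &&
           "thunk name must start with the thunk prefix");
    Module.push_back(ToyFunction{Name, {}});
  }

public:
  // Processes one function at a time, like a function pass would.
  bool run(std::vector<ToyFunction> &Module, std::size_t Index) {
    const std::string Prefix = getDerived().getThunkPrefix();
    if (Module[Index].Name.rfind(Prefix, 0) == 0) {
      // A thunk created on behalf of an earlier function: fill it in now.
      getDerived().populateThunk(Module[Index]);
      return true;
    }
    if (InsertedThunks)
      return false;
    InsertedThunks = getDerived().insertThunks(Module);
    return InsertedThunks;
  }
};

// Hypothetical derived inserter that creates two thunks, one per register.
struct ToyInserter : ToyThunkInserter<ToyInserter> {
  const char *getThunkPrefix() { return "__toy_thunk_"; }

  bool insertThunks(std::vector<ToyFunction> &Module) {
    createThunkFunction(Module, "__toy_thunk_x0");
    createThunkFunction(Module, "__toy_thunk_x1");
    return true;
  }

  void populateThunk(ToyFunction &Thunk) {
    // The only thing that tells the thunks apart is the name suffix.
    std::string Reg = Thunk.Name.substr(std::string(getThunkPrefix()).size());
    Thunk.Body = {"barrier", "br " + Reg};
  }
};

// Visits every function, including thunks appended while iterating.
inline std::vector<ToyFunction> runToyPipeline() {
  std::vector<ToyFunction> Module;
  Module.push_back(ToyFunction{"main", {"blr x0"}});
  ToyInserter Inserter;
  for (std::size_t I = 0; I < Module.size(); ++I)
    Inserter.run(Module, I);
  return Module;
}
```

In this toy version, `populateThunk` has nothing but the function name to decide what to emit, which is the point I think the doc comment should state explicitly.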
https://github.com/llvm/llvm-project/pull/97468