[llvm] [AllocToken] Introduce AllocToken instrumentation pass (PR #156838)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 9 06:02:51 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-llvm-transforms
Author: Marco Elver (melver)
<details>
<summary>Changes</summary>
Introduce `AllocToken`, an instrumentation pass that provides tokens to
memory allocators, enabling heap organization strategies such as heap
partitioning.
Initially, the pass instruments functions marked with a new attribute
`sanitize_alloc_token` by rewriting allocation calls to include a token
ID, appended as a function argument with the default ABI.
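To illustrate what an allocator might do with the appended token, here is a minimal allocator-side sketch. It is an assumption-heavy illustration, not part of this patch: the name `example_alloc_token_malloc`, the `uint64_t` token type, the partition count, and the byte-counting bookkeeping are all hypothetical. A real token-aware allocator would export the prefixed entry point (e.g. `__alloc_token_malloc`) and route the request to a per-partition heap.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical number of heap partitions; not taken from the patch.
constexpr std::size_t NumPartitions = 4;

// Bookkeeping only: bytes requested per partition.
static std::array<std::size_t, NumPartitions> PartitionBytes{};

// Map a token ID to a partition index.
inline std::size_t partitionFor(std::uint64_t Token) {
  return static_cast<std::size_t>(Token % NumPartitions);
}

// Stand-in for a token-aware malloc entry point. The token arrives as an
// extra trailing argument, as in the default ABI described above. This
// sketch records the request and defers to the system allocator; a real
// implementation would allocate from the selected partition instead.
void *example_alloc_token_malloc(std::size_t Size, std::uint64_t Token) {
  PartitionBytes[partitionFor(Token)] += Size;
  return std::malloc(Size);
}
```

With this sketch, a call such as `example_alloc_token_malloc(16, 5)` is accounted to partition `5 % 4 == 1`.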
The design aims to provide a flexible framework for implementing
different token generation schemes. It currently supports the following
token modes:
- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU
For the `TypeHash` mode, introduce support for `!alloc_token_hint`
metadata: the metadata can be attached to allocation calls to provide
richer semantic information to be consumed by the AllocToken pass.
Optimization remarks can be enabled to show where no metadata was
available.
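The `Increment` and `TypeHash` modes can be sketched outside of LLVM as follows. This is a simplified standalone illustration, not the pass's code: the names `IncrementSketch`, `fnv1a64`, and `typeHashToken` are hypothetical, and FNV-1a stands in for the xxHash64 function the patch uses, so the resulting token values differ from what the pass would produce. `Random` mode is omitted because it only draws the value from a random number generator before applying the same bound.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Clamp a raw value into [0, MaxTokens); MaxTokens == 0 means "no maximum".
inline std::uint64_t boundedToken(std::uint64_t Val, std::uint64_t MaxTokens) {
  return MaxTokens ? Val % MaxTokens : Val;
}

// Increment mode: one counter per translation unit.
struct IncrementSketch {
  std::uint64_t MaxTokens = 0;
  std::uint64_t Counter = 0;
  std::uint64_t next() { return boundedToken(Counter++, MaxTokens); }
};

// Stand-in hash (64-bit FNV-1a). The patch itself uses xxHash64.
inline std::uint64_t fnv1a64(std::string_view S) {
  std::uint64_t H = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL; // FNV prime
  }
  return H;
}

// TypeHash mode: hash the type name carried by the hint. When no hint is
// present, return the fallback token unchanged.
inline std::uint64_t typeHashToken(std::optional<std::string_view> TypeName,
                                   std::uint64_t MaxTokens,
                                   std::uint64_t Fallback = 0) {
  if (!TypeName)
    return Fallback;
  return boundedToken(fnv1a64(*TypeName), MaxTokens);
}
```

Because the hash depends only on the type name, two allocation sites that carry the same hint receive the same token, which is what lets an allocator group allocations by type.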
An alternative "fast ABI" is provided, where instead of passing the
token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the
token ID is directly encoded into the name of the called function (e.g.,
`__alloc_token_0_malloc(size)`). Where the maximum number of tokens is small, this
offers more efficient instrumentation by avoiding the overhead of
passing an additional argument at each allocation site.
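The naming difference between the two ABIs can be sketched with a small helper. The function `tokenAllocName` is hypothetical and only reproduces the two examples given above; how the pass itself builds names for other allocation functions is not shown here.

```cpp
#include <cstdint>
#include <string>

// Build the name of the token-aware replacement for an allocation function.
// Default ABI: prefix + original name (the token travels as an argument).
// Fast ABI:    prefix + token ID + "_" + original name (no extra argument).
inline std::string tokenAllocName(const std::string &Original,
                                  std::uint64_t TokenID, bool FastABI,
                                  const std::string &Prefix = "__alloc_token_") {
  if (FastABI)
    return Prefix + std::to_string(TokenID) + "_" + Original;
  return Prefix + Original;
}
```

For `malloc` with token 0, this yields `__alloc_token_malloc` under the default ABI and `__alloc_token_0_malloc` under the fast ABI. The fast ABI needs one entry point per token value, which is why it suits a small maximum number of tokens.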
Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]
---
This change is part of the following series:
1. https://github.com/llvm/llvm-project/pull/156838
2. https://github.com/llvm/llvm-project/pull/156839
3. https://github.com/llvm/llvm-project/pull/156840
4. https://github.com/llvm/llvm-project/pull/156841
5. https://github.com/llvm/llvm-project/pull/156842
---
Patch is 47.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156838.diff
29 Files Affected:
- (modified) llvm/docs/LangRef.rst (+3)
- (modified) llvm/docs/ReleaseNotes.md (+4)
- (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+1)
- (modified) llvm/include/llvm/IR/Attributes.td (+3)
- (modified) llvm/include/llvm/IR/FixedMetadataKinds.def (+1)
- (added) llvm/include/llvm/Transforms/Instrumentation/AllocToken.h (+46)
- (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+2)
- (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+2)
- (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
- (modified) llvm/lib/Passes/PassRegistry.def (+1)
- (added) llvm/lib/Transforms/Instrumentation/AllocToken.cpp (+484)
- (modified) llvm/lib/Transforms/Instrumentation/CMakeLists.txt (+1)
- (modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+1)
- (modified) llvm/lib/Transforms/Utils/Local.cpp (+4)
- (modified) llvm/test/Bitcode/attributes.ll (+6)
- (modified) llvm/test/Bitcode/compatibility.ll (+6-2)
- (added) llvm/test/Instrumentation/AllocToken/basic.ll (+84)
- (added) llvm/test/Instrumentation/AllocToken/extralibfuncs.ll (+32)
- (added) llvm/test/Instrumentation/AllocToken/fast.ll (+39)
- (added) llvm/test/Instrumentation/AllocToken/ignore.ll (+30)
- (added) llvm/test/Instrumentation/AllocToken/invoke.ll (+86)
- (added) llvm/test/Instrumentation/AllocToken/nonlibcalls.ll (+63)
- (added) llvm/test/Instrumentation/AllocToken/remark.ll (+27)
- (modified) llvm/test/Transforms/Inline/attributes.ll (+42)
- (modified) llvm/utils/emacs/llvm-mode.el (+1-1)
- (modified) llvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/BUILD.gn (+1)
- (modified) llvm/utils/llvm.grm (+1)
- (modified) llvm/utils/vim/syntax/llvm.vim (+1)
- (modified) llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml (+1)
``````````diff
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index e64b9343b7622..4791527a4b86b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -2427,6 +2427,9 @@ For example:
if the attributed function is called during invocation of a function
attributed with ``sanitize_realtime``.
This attribute is incompatible with the ``sanitize_realtime`` attribute.
+``sanitize_alloc_token``
+ This attributes indicates that implicit allocation token instrumentation
+ is enabled for this function.
``speculative_load_hardening``
This attribute indicates that
`Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index ff92d7390ecfd..eae9a73bedc34 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -166,6 +166,10 @@ Changes to Sanitizers
Other Changes
-------------
+* Introduces the `AllocToken` pass, an instrumentation pass designed to provide
+ tokens to memory allocators enabling various heap organization strategies,
+ such as heap partitioning.
+
External Open Source Projects Using LLVM {{env.config.release}}
===============================================================
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 1c7d3462b6bae..464f475098ec5 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -800,6 +800,7 @@ enum AttributeKindCodes {
ATTR_KIND_SANITIZE_TYPE = 101,
ATTR_KIND_CAPTURES = 102,
ATTR_KIND_DEAD_ON_RETURN = 103,
+ ATTR_KIND_SANITIZE_ALLOC_TOKEN = 104,
};
enum ComdatSelectionKindCodes {
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index ef816fb86ed1d..8e7d9dcebfe2a 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -342,6 +342,9 @@ def SanitizeRealtime : EnumAttr<"sanitize_realtime", IntersectPreserve, [FnAttr]
/// during a real-time sanitized function (see `sanitize_realtime`).
def SanitizeRealtimeBlocking : EnumAttr<"sanitize_realtime_blocking", IntersectPreserve, [FnAttr]>;
+/// Allocation token instrumentation is on.
+def SanitizeAllocToken : EnumAttr<"sanitize_alloc_token", IntersectPreserve, [FnAttr]>;
+
/// Speculative Load Hardening is enabled.
///
/// Note that this uses the default compatibility (always compatible during
diff --git a/llvm/include/llvm/IR/FixedMetadataKinds.def b/llvm/include/llvm/IR/FixedMetadataKinds.def
index d09cc15d65ff6..a5a8a5663df06 100644
--- a/llvm/include/llvm/IR/FixedMetadataKinds.def
+++ b/llvm/include/llvm/IR/FixedMetadataKinds.def
@@ -55,3 +55,4 @@ LLVM_FIXED_MD_KIND(MD_mmra, "mmra", 40)
LLVM_FIXED_MD_KIND(MD_noalias_addrspace, "noalias.addrspace", 41)
LLVM_FIXED_MD_KIND(MD_callee_type, "callee_type", 42)
LLVM_FIXED_MD_KIND(MD_nofree, "nofree", 43)
+LLVM_FIXED_MD_KIND(MD_alloc_token_hint, "alloc_token_hint", 44)
diff --git a/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h b/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h
new file mode 100644
index 0000000000000..b1391cb04302c
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h
@@ -0,0 +1,46 @@
+//===- AllocToken.h - Allocation token instrumentation --------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares the AllocTokenPass, an instrumentation pass that
+// replaces allocation calls with ones including an allocation token.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
+#define LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
+
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/PassManager.h"
+#include <optional>
+
+namespace llvm {
+
+class Module;
+
+struct AllocTokenOptions {
+ std::optional<uint64_t> MaxTokens;
+ bool FastABI = false;
+ bool Extended = false;
+ AllocTokenOptions() = default;
+};
+
+/// A module pass that rewrites heap allocations to use token-enabled
+/// allocation functions based on various source-level properties.
+class AllocTokenPass : public PassInfoMixin<AllocTokenPass> {
+public:
+ LLVM_ABI explicit AllocTokenPass(AllocTokenOptions Opts = {});
+ LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM);
+ static bool isRequired() { return true; }
+
+private:
+ const AllocTokenOptions Options;
+};
+
+} // namespace llvm
+
+#endif // LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index 22a0d0ffdbaab..67ad4a2655ecd 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -2203,6 +2203,8 @@ static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
return Attribute::SanitizeRealtime;
case bitc::ATTR_KIND_SANITIZE_REALTIME_BLOCKING:
return Attribute::SanitizeRealtimeBlocking;
+ case bitc::ATTR_KIND_SANITIZE_ALLOC_TOKEN:
+ return Attribute::SanitizeAllocToken;
case bitc::ATTR_KIND_SPECULATIVE_LOAD_HARDENING:
return Attribute::SpeculativeLoadHardening;
case bitc::ATTR_KIND_SWIFT_ERROR:
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index a1d5b36bde64d..742c1836c32cd 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -883,6 +883,8 @@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
return bitc::ATTR_KIND_STRUCT_RET;
case Attribute::SanitizeAddress:
return bitc::ATTR_KIND_SANITIZE_ADDRESS;
+ case Attribute::SanitizeAllocToken:
+ return bitc::ATTR_KIND_SANITIZE_ALLOC_TOKEN;
case Attribute::SanitizeHWAddress:
return bitc::ATTR_KIND_SANITIZE_HWADDRESS;
case Attribute::SanitizeThread:
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index d75304b5e11f6..221c6a6363613 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -237,6 +237,7 @@
#include "llvm/Transforms/IPO/WholeProgramDevirt.h"
#include "llvm/Transforms/InstCombine/InstCombine.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
#include "llvm/Transforms/Instrumentation/BoundsChecking.h"
#include "llvm/Transforms/Instrumentation/CGProfile.h"
#include "llvm/Transforms/Instrumentation/ControlHeightReduction.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 4b462b9c6845c..00896b2a8f3fc 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -124,6 +124,7 @@ MODULE_PASS("openmp-opt", OpenMPOptPass())
MODULE_PASS("openmp-opt-postlink",
OpenMPOptPass(ThinOrFullLTOPhase::FullLTOPostLink))
MODULE_PASS("partial-inliner", PartialInlinerPass())
+MODULE_PASS("alloc-token", AllocTokenPass())
MODULE_PASS("pgo-icall-prom", PGOIndirectCallPromotion())
MODULE_PASS("pgo-instr-gen", PGOInstrumentationGen())
MODULE_PASS("pgo-instr-use", PGOInstrumentationUse())
diff --git a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
new file mode 100644
index 0000000000000..4ea68470ca684
--- /dev/null
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -0,0 +1,484 @@
+//===- AllocToken.cpp - Allocation token instrumentation ------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements AllocToken, an instrumentation pass that
+// replaces allocation calls with token-enabled versions.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Analysis/MemoryBuiltins.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/Attributes.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/DerivedTypes.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalValue.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/InstrTypes.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Metadata.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/IR/Type.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/RandomNumberGenerator.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Support/xxhash.h"
+#include <cassert>
+#include <cstddef>
+#include <cstdint>
+#include <limits>
+#include <memory>
+#include <optional>
+#include <string>
+#include <utility>
+#include <variant>
+
+using namespace llvm;
+
+#define DEBUG_TYPE "alloc-token"
+
+namespace {
+
+//===--- Constants --------------------------------------------------------===//
+
+enum class TokenMode : unsigned {
+ /// Incrementally increasing token ID.
+ Increment = 0,
+
+ /// Simple mode that returns a statically-assigned random token ID.
+ Random = 1,
+
+ /// Token ID based on allocated type hash.
+ TypeHash = 2,
+
+ // Mode count - keep last
+ ModeCount
+};
+
+//===--- Command-line options ---------------------------------------------===//
+
+struct ModeParser : public cl::parser<unsigned> {
+ ModeParser(cl::Option &O) : cl::parser<unsigned>(O) {}
+ bool parse(cl::Option &O, StringRef ArgName, StringRef Arg, unsigned &Value) {
+ if (cl::parser<unsigned>::parse(O, ArgName, Arg, Value))
+ return true;
+ if (Value >= static_cast<unsigned>(TokenMode::ModeCount))
+ return O.error("'" + Arg + "' value invalid");
+ return false;
+ }
+};
+
+cl::opt<unsigned, false, ModeParser>
+ ClMode("alloc-token-mode", cl::desc("Token assignment mode"), cl::Hidden,
+ cl::init(static_cast<unsigned>(TokenMode::TypeHash)));
+
+cl::opt<std::string> ClFuncPrefix("alloc-token-prefix",
+ cl::desc("The allocation function prefix"),
+ cl::Hidden, cl::init("__alloc_token_"));
+
+cl::opt<uint64_t> ClMaxTokens("alloc-token-max",
+ cl::desc("Maximum number of tokens (0 = no max)"),
+ cl::Hidden, cl::init(0));
+
+cl::opt<bool>
+ ClFastABI("alloc-token-fast-abi",
+ cl::desc("The token ID is encoded in the function name"),
+ cl::Hidden, cl::init(false));
+
+// Instrument libcalls only by default - compatible allocators only need to take
+// care of providing standard allocation functions. With extended coverage, also
+// instrument non-libcall allocation function calls with !alloc_token_hint
+// metadata.
+cl::opt<bool>
+ ClExtended("alloc-token-extended",
+ cl::desc("Extend coverage to custom allocation functions"),
+ cl::Hidden, cl::init(false));
+
+// C++ defines ::operator new (and variants) as replaceable (vs. standard
+// library versions), which are nobuiltin, and are therefore not covered by
+// isAllocationFn(). Cover by default, as users of AllocToken are already
+// required to provide token-aware allocation functions (no defaults).
+cl::opt<bool> ClCoverReplaceableNew("alloc-token-cover-replaceable-new",
+ cl::desc("Cover replaceable operator new"),
+ cl::Hidden, cl::init(true));
+
+// strdup-family functions only operate on strings, covering them does not make
+// sense in most cases.
+cl::opt<bool>
+ ClCoverStrdup("alloc-token-cover-strdup",
+ cl::desc("Cover strdup-family allocation functions"),
+ cl::Hidden, cl::init(false));
+
+cl::opt<uint64_t> ClFallbackToken(
+ "alloc-token-fallback",
+ cl::desc("The default fallback token where none could be determined"),
+ cl::Hidden, cl::init(0));
+
+//===--- Statistics -------------------------------------------------------===//
+
+STATISTIC(NumFunctionsInstrumented, "Functions instrumented");
+STATISTIC(NumAllocations, "Allocations found");
+
+//===----------------------------------------------------------------------===//
+
+/// Returns the !alloc_token_hint metadata if available.
+///
+/// Expected format is: !{<type-name>}
+MDNode *getAllocTokenHintMetadata(const CallBase &CB) {
+ MDNode *Ret = CB.getMetadata(LLVMContext::MD_alloc_token_hint);
+ if (!Ret)
+ return nullptr;
+ assert(Ret->getNumOperands() == 1 && "bad !alloc_token_hint");
+ assert(isa<MDString>(Ret->getOperand(0)));
+ return Ret;
+}
+
+class ModeBase {
+public:
+ explicit ModeBase(uint64_t MaxTokens) : MaxTokens(MaxTokens) {}
+
+protected:
+ uint64_t boundedToken(uint64_t Val) const {
+ return MaxTokens ? Val % MaxTokens : Val;
+ }
+
+ const uint64_t MaxTokens;
+};
+
+/// Implementation for TokenMode::Increment.
+class IncrementMode : public ModeBase {
+public:
+ using ModeBase::ModeBase;
+
+ uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &) {
+ return boundedToken(Counter++);
+ }
+
+private:
+ uint64_t Counter = 0;
+};
+
+/// Implementation for TokenMode::Random.
+class RandomMode : public ModeBase {
+public:
+ RandomMode(uint64_t MaxTokens, std::unique_ptr<RandomNumberGenerator> RNG)
+ : ModeBase(MaxTokens), RNG(std::move(RNG)) {}
+ uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &) {
+ return boundedToken((*RNG)());
+ }
+
+private:
+ std::unique_ptr<RandomNumberGenerator> RNG;
+};
+
+/// Implementation for TokenMode::TypeHash. The implementation ensures
+/// hashes are stable across different compiler invocations. Uses xxHash as the
+/// hash function.
+class TypeHashMode : public ModeBase {
+public:
+ using ModeBase::ModeBase;
+
+ uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+ if (MDNode *N = getAllocTokenHintMetadata(CB)) {
+ MDString *S = cast<MDString>(N->getOperand(0));
+ return boundedToken(xxHash64(S->getString()));
+ }
+ remarkNoHint(CB, ORE);
+ return ClFallbackToken;
+ }
+
+ /// Remark that there was no precise type information.
+ void remarkNoHint(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+ ORE.emit([&] {
+ ore::NV FuncNV("Function", CB.getParent()->getParent());
+ const Function *Callee = CB.getCalledFunction();
+ ore::NV CalleeNV("Callee", Callee ? Callee->getName() : "<unknown>");
+ return OptimizationRemark(DEBUG_TYPE, "NoAllocTokenHint", &CB)
+ << "Call to '" << CalleeNV << "' in '" << FuncNV
+ << "' without source-level type token";
+ });
+ }
+};
+
+// Apply opt overrides.
+AllocTokenOptions &&transformOptionsFromCl(AllocTokenOptions &&Opts) {
+ if (!Opts.MaxTokens.has_value())
+ Opts.MaxTokens = ClMaxTokens;
+ Opts.FastABI |= ClFastABI;
+ Opts.Extended |= ClExtended;
+ return std::move(Opts);
+}
+
+class AllocToken {
+public:
+ explicit AllocToken(AllocTokenOptions Opts, Module &M,
+ ModuleAnalysisManager &MAM)
+ : Options(transformOptionsFromCl(std::move(Opts))), Mod(M),
+ FAM(MAM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager()),
+ Mode(IncrementMode(*Options.MaxTokens)) {
+ switch (static_cast<TokenMode>(ClMode.getValue())) {
+ case TokenMode::Increment:
+ break;
+ case TokenMode::Random:
+ Mode.emplace<RandomMode>(*Options.MaxTokens, M.createRNG(DEBUG_TYPE));
+ break;
+ case TokenMode::TypeHash:
+ Mode.emplace<TypeHashMode>(*Options.MaxTokens);
+ break;
+ case TokenMode::ModeCount:
+ llvm_unreachable("");
+ break;
+ }
+ }
+
+ bool instrumentFunction(Function &F);
+
+private:
+ /// Returns true for !isAllocationFn() functions that are also eligible for
+ /// instrumentation.
+ bool isInstrumentableLibFunc(LibFunc Func) const;
+
+ /// Returns true for isAllocationFn() functions that we should ignore.
+ bool ignoreInstrumentableLibFunc(LibFunc Func) const;
+
+ /// Replace a call/invoke with a call/invoke to the allocation function
+ /// with token ID.
+ void replaceAllocationCall(CallBase *CB, LibFunc Func,
+ OptimizationRemarkEmitter &ORE,
+ const TargetLibraryInfo &TLI);
+
+ /// Return replacement function for a LibFunc that takes a token ID.
+ FunctionCallee getTokenAllocFunction(const CallBase &CB, uint64_t TokenID,
+ LibFunc OriginalFunc);
+
+ /// Return the token ID from metadata in the call.
+ uint64_t getToken(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+ return std::visit([&](auto &&Mode) { return Mode(CB, ORE); }, Mode);
+ }
+
+ const AllocTokenOptions Options;
+ Module &Mod;
+ FunctionAnalysisManager &FAM;
+ // Cache for replacement functions.
+ DenseMap<std::pair<LibFunc, uint64_t>, FunctionCallee> TokenAllocFunctions;
+ // Selected mode.
+ std::variant<IncrementMode, RandomMode, TypeHashMode> Mode;
+};
+
+bool AllocToken::instrumentFunction(Function &F) {
+ // Do not apply any instrumentation for naked functions.
+ if (F.hasFnAttribute(Attribute::Naked))
+ return false;
+ if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
+ return false;
+ // Don't touch available_externally functions, their actual body is elsewhere.
+ if (F.getLinkage() == GlobalValue::AvailableExternallyLinkage)
+ return false;
+ // Only instrument functions that have the sanitize_alloc_token attribute.
+ if (!F.hasFnAttribute(Attribute::SanitizeAllocToken))
+ return false;
+
+ auto &ORE = FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);
+ auto &TLI = FAM.getResult<TargetLibraryAnalysis>(F);
+ SmallVector<std::pair<CallBase *, LibFunc>, 4> AllocCalls;
+
+ // Collect all allocation calls to avoid iterator invalidation.
+ for (Instruction &I : instructions(F)) {
+ auto *CB = dyn_cast<CallBase>(&I);
+ if (!CB)
+ continue;
+ const Function *Callee = CB->getCalledFunction();
+ if (!Callee)
+ continue;
+ // Ignore nobuiltin of the CallBase, so that we can cover nobuiltin libcalls
+ // if requested via isInstrumentableLibFunc(). Note that isAllocationFn() is
+ // returning false for nobuiltin calls.
+ LibFunc Func;
+ if (TLI.getLibFunc(*Callee, Func)) {
+ if (ignoreInstrumentableLibFunc(Func))
+ continue;
+ if (isInstrumentableLibFunc(Func) || isAllocationFn(CB, &TLI))
+ AllocCalls.emplace_back(CB, Func);
+ } else if (Options.Extended && getAllocTokenHintMetadata(*CB)) {
+ AllocCalls.emplace_back(CB, NotLibFunc);
+ }
+ }
+
+ bool Modified = false;
+
+ if (!AllocCalls.empty()) {
+ for (auto &[CB, Func] : AllocCalls) {
+ replaceAllocationCall(CB, Func, ORE, TLI);
+ }
+ NumAllocations += AllocCalls.size();
+ NumFunctionsInstrumented++;
+ Modified = true;
+ }
+
+ return Modified;
+}
+
+bool AllocToken::isInstrumentableLibFunc(LibFunc Func) const {
+ switch (Func) {
+ case LibFunc_posix_memalign:
+ case LibFunc_size_returning_new:
+ case LibFunc_size_returning_new_hot_cold:
+ case LibFunc_size_returning_new_aligned:
+ case LibFunc_size_returning_new_aligned_hot_cold:
+ return true;
+ case LibFunc_Znwj:
+ case LibFunc_ZnwjRKSt9nothrow_t:
+ case LibFunc_ZnwjSt11align_val_t:
+ case LibFunc_ZnwjSt11align_val_tRKSt9nothrow_t:
+ case LibFunc_Znwm:
+ case LibFunc_Znwm12__hot_cold_t:
+ case LibFunc_ZnwmRKSt9nothrow_t:
+ case LibFunc_ZnwmRKSt9nothrow_t12__hot_cold_t:
+ case LibFunc_ZnwmSt11align_val_t:
+ case LibFunc_ZnwmSt11align_val_t12__hot_cold_t:
+ case LibFunc_ZnwmSt11align_val_tRKSt9nothrow_t:
+ case LibFunc_ZnwmSt...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/156838