[llvm] [AllocToken] Introduce AllocToken instrumentation pass (PR #156843)

Marco Elver via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 4 02:54:51 PDT 2025


https://github.com/melver created https://github.com/llvm/llvm-project/pull/156843

Introduce `AllocToken`, an instrumentation pass designed to provide
tokens to memory allocators enabling various heap organization
strategies, such as heap partitioning.

Initially, the pass instruments functions marked with a new attribute
`sanitize_alloc_token` by rewriting allocation calls to include a token
ID, appended as a function argument with the default ABI.
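
As an illustration of the allocator side of this contract, here is a minimal sketch of a token-aware `malloc` counterpart. Only the function name (default prefix plus original name) and the trailing 64-bit token argument come from the patch; `kNumPartitions`, `PartitionAllocs`, and the partitioning policy are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical allocator-side state: the number of heap partitions and a
// per-partition allocation counter. Neither is defined by the patch.
constexpr uint64_t kNumPartitions = 4;
static uint64_t PartitionAllocs[kNumPartitions] = {};

// Token-aware counterpart of malloc(size). The name follows the pass's
// default prefix; the body is purely illustrative.
extern "C" void *__alloc_token_malloc(size_t Size, uint64_t TokenID) {
  // Map the token onto one of the (hypothetical) partitions.
  uint64_t Partition = TokenID % kNumPartitions;
  assert(Partition < kNumPartitions && "partition index out of range");
  ++PartitionAllocs[Partition];
  // A real allocator would serve the request from a partition-specific
  // arena; this sketch simply forwards to the system allocator.
  return std::malloc(Size);
}
```

With such a function available, an instrumented call site that originally read `malloc(64)` would call `__alloc_token_malloc(64, <token>)` instead.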

The design aims to provide a flexible framework for implementing
different token generation schemes. It currently supports the following
token modes:

- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU
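
The three modes can be sketched outside of LLVM roughly as follows. This is a simplified model under stated assumptions, not the pass's code: the bounding rule (`0` meaning "no maximum") mirrors the patch, but the `*Sketch` types are hypothetical, `std::mt19937_64` stands in for LLVM's `RandomNumberGenerator`, and 64-bit FNV-1a stands in for xxHash64, so the resulting token values differ from what the pass would produce.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>

// Mirrors the patch's bounding rule: MaxTokens == 0 means "no maximum".
inline uint64_t boundedToken(uint64_t Val, uint64_t MaxTokens) {
  return MaxTokens ? Val % MaxTokens : Val;
}

// Stand-in hash (64-bit FNV-1a). The patch itself uses xxHash64.
inline uint64_t fnv1a64(const std::string &S) {
  uint64_t H = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL; // FNV prime
  }
  return H;
}

// Increment: one counter per translation unit.
struct IncrementSketch {
  explicit IncrementSketch(uint64_t Max) : MaxTokens(Max) {}
  uint64_t next() { return boundedToken(Counter++, MaxTokens); }
  uint64_t MaxTokens;
  uint64_t Counter = 0;
};

// Random: pseudo-random IDs, fixed at compile time for each call site.
struct RandomSketch {
  RandomSketch(uint64_t Max, uint64_t Seed) : MaxTokens(Max), RNG(Seed) {}
  uint64_t next() { return boundedToken(RNG(), MaxTokens); }
  uint64_t MaxTokens;
  std::mt19937_64 RNG;
};

// TypeHash: the ID is a stable function of the allocated type's name.
struct TypeHashSketch {
  explicit TypeHashSketch(uint64_t Max) : MaxTokens(Max) {}
  uint64_t token(const std::string &TypeName) const {
    assert(!TypeName.empty() && "expected a type name");
    return boundedToken(fnv1a64(TypeName), MaxTokens);
  }
  uint64_t MaxTokens;
};
```

For example, `IncrementSketch(3)` yields 0, 1, 2, 0, and so on, while `TypeHashSketch` returns the same ID for the same type name on every invocation. In the patch, a `TypeHash` call site without type information receives the configurable fallback token instead.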

For the `TypeHash` mode, introduce support for `!alloc_token_hint`
metadata: the metadata can be attached to allocation calls to provide
richer semantic information to be consumed by the AllocToken pass.
Optimization remarks can be enabled to show where no metadata was
available.

An alternative "fast ABI" is provided, where instead of passing the
token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the
token ID is directly encoded into the name of the called function (e.g.,
`__alloc_token_0_malloc(size)`). Where the maximum number of tokens is
small, this offers more efficient instrumentation by avoiding the overhead
of passing an additional argument at each allocation site.
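
The naming scheme for both ABIs can be sketched as a small standalone function. It follows the rule used in the patch (prefix, then `<token>_` for the fast ABI only, then the original callee name with leading underscores dropped), but `tokenAllocName` itself is a hypothetical helper written for this example and uses `std::string` rather than LLVM's string types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the replacement callee name: the prefix, then "<token>_" for the
// fast ABI only, then the original name with leading underscores dropped.
inline std::string
tokenAllocName(const std::string &Callee, uint64_t TokenID, bool FastABI,
               const std::string &Prefix = "__alloc_token_") {
  assert(!Callee.empty() && "callee name expected");
  std::string Name = Prefix;
  if (FastABI)
    Name += std::to_string(TokenID) + "_";
  // Drop all leading underscores; the prefix already ends with one.
  std::string::size_type FirstNonUnderscore = Callee.find_first_not_of('_');
  if (FirstNonUnderscore != std::string::npos)
    Name += Callee.substr(FirstNonUnderscore);
  return Name;
}
```

Under these assumptions, `tokenAllocName("malloc", 0, true)` gives `__alloc_token_0_malloc`, and `tokenAllocName("_Znwm", 3, false)` gives `__alloc_token_Znwm`, with the token passed as an extra argument instead.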

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]


From d2259967d82e5a987d7b2a7a662f2d3ddbdc80a4 Mon Sep 17 00:00:00 2001
From: Marco Elver <elver at google.com>
Date: Thu, 4 Sep 2025 11:54:34 +0200
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.8-beta.1
---
 llvm/docs/LangRef.rst                         |   3 +
 llvm/docs/ReleaseNotes.md                     |   4 +
 llvm/include/llvm/Bitcode/LLVMBitCodes.h      |   1 +
 llvm/include/llvm/IR/Attributes.td            |   3 +
 .../Transforms/Instrumentation/AllocToken.h   |  46 ++
 llvm/lib/Bitcode/Reader/BitcodeReader.cpp     |   2 +
 llvm/lib/Bitcode/Writer/BitcodeWriter.cpp     |   2 +
 llvm/lib/Passes/PassBuilder.cpp               |   1 +
 llvm/lib/Passes/PassRegistry.def              |   1 +
 .../Transforms/Instrumentation/AllocToken.cpp | 484 ++++++++++++++++++
 .../Transforms/Instrumentation/CMakeLists.txt |   1 +
 llvm/lib/Transforms/Utils/CodeExtractor.cpp   |   1 +
 llvm/test/Bitcode/attributes.ll               |   6 +
 llvm/test/Bitcode/compatibility.ll            |   8 +-
 llvm/test/Instrumentation/AllocToken/basic.ll |  84 +++
 .../AllocToken/extralibfuncs.ll               |  32 ++
 llvm/test/Instrumentation/AllocToken/fast.ll  |  39 ++
 .../test/Instrumentation/AllocToken/ignore.ll |  30 ++
 .../test/Instrumentation/AllocToken/invoke.ll |  86 ++++
 .../Instrumentation/AllocToken/nonlibcalls.ll |  63 +++
 .../test/Instrumentation/AllocToken/remark.ll |  27 +
 llvm/test/Transforms/Inline/attributes.ll     |  42 ++
 llvm/utils/emacs/llvm-mode.el                 |   2 +-
 .../lib/Transforms/Instrumentation/BUILD.gn   |   1 +
 llvm/utils/llvm.grm                           |   1 +
 llvm/utils/vim/syntax/llvm.vim                |   1 +
 .../vscode/llvm/syntaxes/ll.tmLanguage.yaml   |   1 +
 27 files changed, 969 insertions(+), 3 deletions(-)
 create mode 100644 llvm/include/llvm/Transforms/Instrumentation/AllocToken.h
 create mode 100644 llvm/lib/Transforms/Instrumentation/AllocToken.cpp
 create mode 100644 llvm/test/Instrumentation/AllocToken/basic.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/extralibfuncs.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/fast.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/ignore.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/invoke.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/nonlibcalls.ll
 create mode 100644 llvm/test/Instrumentation/AllocToken/remark.ll

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index e64b9343b7622..4791527a4b86b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -2427,6 +2427,9 @@ For example:
     if the attributed function is called during invocation of a function
     attributed with ``sanitize_realtime``.
     This attribute is incompatible with the ``sanitize_realtime`` attribute.
+``sanitize_alloc_token``
+    This attribute indicates that implicit allocation token instrumentation
+    is enabled for this function.
 ``speculative_load_hardening``
     This attribute indicates that
     `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index ff92d7390ecfd..eae9a73bedc34 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -166,6 +166,10 @@ Changes to Sanitizers
 Other Changes
 -------------
 
+* Introduces the `AllocToken` pass, an instrumentation pass designed to provide
+  tokens to memory allocators enabling various heap organization strategies,
+  such as heap partitioning.
+
 External Open Source Projects Using LLVM {{env.config.release}}
 ===============================================================
 
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 1c7d3462b6bae..464f475098ec5 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -800,6 +800,7 @@ enum AttributeKindCodes {
   ATTR_KIND_SANITIZE_TYPE = 101,
   ATTR_KIND_CAPTURES = 102,
   ATTR_KIND_DEAD_ON_RETURN = 103,
+  ATTR_KIND_SANITIZE_ALLOC_TOKEN = 104,
 };
 
 enum ComdatSelectionKindCodes {
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index ef816fb86ed1d..8e7d9dcebfe2a 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -342,6 +342,9 @@ def SanitizeRealtime : EnumAttr<"sanitize_realtime", IntersectPreserve, [FnAttr]
 /// during a real-time sanitized function (see `sanitize_realtime`).
 def SanitizeRealtimeBlocking : EnumAttr<"sanitize_realtime_blocking", IntersectPreserve, [FnAttr]>;
 
+/// Allocation token instrumentation is on.
+def SanitizeAllocToken : EnumAttr<"sanitize_alloc_token", IntersectPreserve, [FnAttr]>;
+
 /// Speculative Load Hardening is enabled.
 ///
 /// Note that this uses the default compatibility (always compatible during
diff --git a/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h b/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h
new file mode 100644
index 0000000000000..b1391cb04302c
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Instrumentation/AllocToken.h
@@ -0,0 +1,46 @@
+//===- AllocToken.h - Allocation token instrumentation --------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares the AllocTokenPass, an instrumentation pass that
+// replaces allocation calls with ones including an allocation token.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
+#define LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
+
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/PassManager.h"
+#include <optional>
+
+namespace llvm {
+
+class Module;
+
+struct AllocTokenOptions {
+  std::optional<uint64_t> MaxTokens;
+  bool FastABI = false;
+  bool Extended = false;
+  AllocTokenOptions() = default;
+};
+
+/// A module pass that rewrites heap allocations to use token-enabled
+/// allocation functions based on various source-level properties.
+class AllocTokenPass : public PassInfoMixin<AllocTokenPass> {
+public:
+  LLVM_ABI explicit AllocTokenPass(AllocTokenOptions Opts = {});
+  LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM);
+  static bool isRequired() { return true; }
+
+private:
+  const AllocTokenOptions Options;
+};
+
+} // namespace llvm
+
+#endif // LLVM_TRANSFORMS_INSTRUMENTATION_ALLOCTOKEN_H
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index 22a0d0ffdbaab..67ad4a2655ecd 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -2203,6 +2203,8 @@ static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
     return Attribute::SanitizeRealtime;
   case bitc::ATTR_KIND_SANITIZE_REALTIME_BLOCKING:
     return Attribute::SanitizeRealtimeBlocking;
+  case bitc::ATTR_KIND_SANITIZE_ALLOC_TOKEN:
+    return Attribute::SanitizeAllocToken;
   case bitc::ATTR_KIND_SPECULATIVE_LOAD_HARDENING:
     return Attribute::SpeculativeLoadHardening;
   case bitc::ATTR_KIND_SWIFT_ERROR:
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index a1d5b36bde64d..742c1836c32cd 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -883,6 +883,8 @@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
     return bitc::ATTR_KIND_STRUCT_RET;
   case Attribute::SanitizeAddress:
     return bitc::ATTR_KIND_SANITIZE_ADDRESS;
+  case Attribute::SanitizeAllocToken:
+    return bitc::ATTR_KIND_SANITIZE_ALLOC_TOKEN;
   case Attribute::SanitizeHWAddress:
     return bitc::ATTR_KIND_SANITIZE_HWADDRESS;
   case Attribute::SanitizeThread:
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index d75304b5e11f6..221c6a6363613 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -237,6 +237,7 @@
 #include "llvm/Transforms/IPO/WholeProgramDevirt.h"
 #include "llvm/Transforms/InstCombine/InstCombine.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
 #include "llvm/Transforms/Instrumentation/BoundsChecking.h"
 #include "llvm/Transforms/Instrumentation/CGProfile.h"
 #include "llvm/Transforms/Instrumentation/ControlHeightReduction.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 4b462b9c6845c..00896b2a8f3fc 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -124,6 +124,7 @@ MODULE_PASS("openmp-opt", OpenMPOptPass())
 MODULE_PASS("openmp-opt-postlink",
             OpenMPOptPass(ThinOrFullLTOPhase::FullLTOPostLink))
 MODULE_PASS("partial-inliner", PartialInlinerPass())
+MODULE_PASS("alloc-token", AllocTokenPass())
 MODULE_PASS("pgo-icall-prom", PGOIndirectCallPromotion())
 MODULE_PASS("pgo-instr-gen", PGOInstrumentationGen())
 MODULE_PASS("pgo-instr-use", PGOInstrumentationUse())
diff --git a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
new file mode 100644
index 0000000000000..f0f7f14448dc8
--- /dev/null
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -0,0 +1,484 @@
+//===- AllocToken.cpp - Allocation token instrumentation ------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements AllocToken, an instrumentation pass that
+// replaces allocation calls with token-enabled versions.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Analysis/MemoryBuiltins.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/Attributes.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/DerivedTypes.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalValue.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/InstrTypes.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Metadata.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/IR/Type.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/RandomNumberGenerator.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Support/xxhash.h"
+#include <cassert>
+#include <cstddef>
+#include <cstdint>
+#include <limits>
+#include <memory>
+#include <optional>
+#include <string>
+#include <utility>
+#include <variant>
+
+using namespace llvm;
+
+#define DEBUG_TYPE "alloc-token"
+
+namespace {
+
+//===--- Constants --------------------------------------------------------===//
+
+enum class TokenMode : unsigned {
+  /// Incrementally increasing token ID.
+  Increment = 0,
+
+  /// Simple mode that returns a statically-assigned random token ID.
+  Random = 1,
+
+  /// Token ID based on allocated type hash.
+  TypeHash = 2,
+
+  // Mode count - keep last
+  ModeCount
+};
+
+//===--- Command-line options ---------------------------------------------===//
+
+struct ModeParser : public cl::parser<unsigned> {
+  ModeParser(cl::Option &O) : cl::parser<unsigned>(O) {}
+  bool parse(cl::Option &O, StringRef ArgName, StringRef Arg, unsigned &Value) {
+    if (cl::parser<unsigned>::parse(O, ArgName, Arg, Value))
+      return true;
+    if (Value >= static_cast<unsigned>(TokenMode::ModeCount))
+      return O.error("'" + Arg + "' value invalid");
+    return false;
+  }
+};
+
+cl::opt<unsigned, false, ModeParser>
+    ClMode("alloc-token-mode", cl::desc("Token assignment mode"), cl::Hidden,
+           cl::init(static_cast<unsigned>(TokenMode::TypeHash)));
+
+cl::opt<std::string> ClFuncPrefix("alloc-token-prefix",
+                                  cl::desc("The allocation function prefix"),
+                                  cl::Hidden, cl::init("__alloc_token_"));
+
+cl::opt<uint64_t> ClMaxTokens("alloc-token-max",
+                              cl::desc("Maximum number of tokens (0 = no max)"),
+                              cl::Hidden, cl::init(0));
+
+cl::opt<bool>
+    ClFastABI("alloc-token-fast-abi",
+              cl::desc("The token ID is encoded in the function name"),
+              cl::Hidden, cl::init(false));
+
+// Instrument libcalls only by default - compatible allocators only need to take
+// care of providing standard allocation functions. With extended coverage, also
+// instrument non-libcall allocation function calls with !alloc_token_hint
+// metadata.
+cl::opt<bool>
+    ClExtended("alloc-token-extended",
+               cl::desc("Extend coverage to custom allocation functions"),
+               cl::Hidden, cl::init(false));
+
+// C++ defines ::operator new (and variants) as replaceable (vs. standard
+// library versions), which are nobuiltin, and are therefore not covered by
+// isAllocationFn(). Cover by default, as users of AllocToken are already
+// required to provide token-aware allocation functions (no defaults).
+cl::opt<bool> ClCoverReplaceableNew("alloc-token-cover-replaceable-new",
+                                    cl::desc("Cover replaceable operator new"),
+                                    cl::Hidden, cl::init(true));
+
+// strdup-family functions only operate on strings; covering them does not make
+// sense in most cases.
+cl::opt<bool>
+    ClCoverStrdup("alloc-token-cover-strdup",
+                  cl::desc("Cover strdup-family allocation functions"),
+                  cl::Hidden, cl::init(false));
+
+cl::opt<uint64_t> ClFallbackToken(
+    "alloc-token-fallback",
+    cl::desc("The default fallback token where none could be determined"),
+    cl::Hidden, cl::init(0));
+
+//===--- Statistics -------------------------------------------------------===//
+
+STATISTIC(NumFunctionsInstrumented, "Functions instrumented");
+STATISTIC(NumAllocations, "Allocations found");
+
+//===----------------------------------------------------------------------===//
+
+/// Returns the !alloc_token_hint metadata if available.
+///
+/// Expected format is: !{<type-name>}
+MDNode *getAllocTokenHintMetadata(const CallBase &CB) {
+  MDNode *Ret = CB.getMetadata("alloc_token_hint");
+  if (!Ret)
+    return nullptr;
+  assert(Ret->getNumOperands() == 1 && "bad !alloc_token_hint");
+  assert(isa<MDString>(Ret->getOperand(0)));
+  return Ret;
+}
+
+class ModeBase {
+public:
+  explicit ModeBase(uint64_t MaxTokens) : MaxTokens(MaxTokens) {}
+
+protected:
+  uint64_t boundedToken(uint64_t Val) const {
+    return MaxTokens ? Val % MaxTokens : Val;
+  }
+
+  const uint64_t MaxTokens;
+};
+
+/// Implementation for TokenMode::Increment.
+class IncrementMode : public ModeBase {
+public:
+  using ModeBase::ModeBase;
+
+  uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &) {
+    return boundedToken(Counter++);
+  }
+
+private:
+  uint64_t Counter = 0;
+};
+
+/// Implementation for TokenMode::Random.
+class RandomMode : public ModeBase {
+public:
+  RandomMode(uint64_t MaxTokens, std::unique_ptr<RandomNumberGenerator> RNG)
+      : ModeBase(MaxTokens), RNG(std::move(RNG)) {}
+  uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &) {
+    return boundedToken((*RNG)());
+  }
+
+private:
+  std::unique_ptr<RandomNumberGenerator> RNG;
+};
+
+/// Implementation for TokenMode::TypeHash. The implementation ensures
+/// hashes are stable across different compiler invocations. Uses xxHash as the
+/// hash function.
+class TypeHashMode : public ModeBase {
+public:
+  using ModeBase::ModeBase;
+
+  uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+    if (MDNode *N = getAllocTokenHintMetadata(CB)) {
+      MDString *S = cast<MDString>(N->getOperand(0));
+      return boundedToken(xxHash64(S->getString()));
+    }
+    remarkNoHint(CB, ORE);
+    return ClFallbackToken;
+  }
+
+  /// Remark that there was no precise type information.
+  void remarkNoHint(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+    ORE.emit([&] {
+      ore::NV FuncNV("Function", CB.getParent()->getParent());
+      const Function *Callee = CB.getCalledFunction();
+      ore::NV CalleeNV("Callee", Callee ? Callee->getName() : "<unknown>");
+      return OptimizationRemark(DEBUG_TYPE, "NoAllocTokenHint", &CB)
+             << "Call to '" << CalleeNV << "' in '" << FuncNV
+             << "' without source-level type token";
+    });
+  }
+};
+
+// Apply opt overrides.
+AllocTokenOptions &&transformOptionsFromCl(AllocTokenOptions &&Opts) {
+  if (!Opts.MaxTokens.has_value())
+    Opts.MaxTokens = ClMaxTokens;
+  Opts.FastABI |= ClFastABI;
+  Opts.Extended |= ClExtended;
+  return std::move(Opts);
+}
+
+class AllocToken {
+public:
+  explicit AllocToken(AllocTokenOptions Opts, Module &M,
+                      ModuleAnalysisManager &MAM)
+      : Options(transformOptionsFromCl(std::move(Opts))), Mod(M),
+        FAM(MAM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager()),
+        Mode(IncrementMode(*Options.MaxTokens)) {
+    switch (static_cast<TokenMode>(ClMode.getValue())) {
+    case TokenMode::Increment:
+      break;
+    case TokenMode::Random:
+      Mode.emplace<RandomMode>(*Options.MaxTokens, M.createRNG(DEBUG_TYPE));
+      break;
+    case TokenMode::TypeHash:
+      Mode.emplace<TypeHashMode>(*Options.MaxTokens);
+      break;
+    case TokenMode::ModeCount:
+      llvm_unreachable("invalid token mode");
+      break;
+    }
+  }
+
+  bool instrumentFunction(Function &F);
+
+private:
+  /// Returns true for !isAllocationFn() functions that are also eligible for
+  /// instrumentation.
+  bool isInstrumentableLibFunc(LibFunc Func) const;
+
+  /// Returns true for isAllocationFn() functions that we should ignore.
+  bool ignoreInstrumentableLibFunc(LibFunc Func) const;
+
+  /// Replace a call/invoke with a call/invoke to the allocation function
+  /// with token ID.
+  void replaceAllocationCall(CallBase *CB, LibFunc Func,
+                             OptimizationRemarkEmitter &ORE,
+                             const TargetLibraryInfo &TLI);
+
+  /// Return replacement function for a LibFunc that takes a token ID.
+  FunctionCallee getTokenAllocFunction(const CallBase &CB, uint64_t TokenID,
+                                       LibFunc OriginalFunc);
+
+  /// Return the token ID for the call, as computed by the selected mode.
+  uint64_t getToken(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+    return std::visit([&](auto &&Mode) { return Mode(CB, ORE); }, Mode);
+  }
+
+  const AllocTokenOptions Options;
+  Module &Mod;
+  FunctionAnalysisManager &FAM;
+  // Cache for replacement functions.
+  DenseMap<std::pair<LibFunc, uint64_t>, FunctionCallee> TokenAllocFunctions;
+  // Selected mode.
+  std::variant<IncrementMode, RandomMode, TypeHashMode> Mode;
+};
+
+bool AllocToken::instrumentFunction(Function &F) {
+  // Do not apply any instrumentation for naked functions.
+  if (F.hasFnAttribute(Attribute::Naked))
+    return false;
+  if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
+    return false;
+  // Don't touch available_externally functions, their actual body is elsewhere.
+  if (F.getLinkage() == GlobalValue::AvailableExternallyLinkage)
+    return false;
+  // Only instrument functions that have the sanitize_alloc_token attribute.
+  if (!F.hasFnAttribute(Attribute::SanitizeAllocToken))
+    return false;
+
+  auto &ORE = FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);
+  auto &TLI = FAM.getResult<TargetLibraryAnalysis>(F);
+  SmallVector<std::pair<CallBase *, LibFunc>, 4> AllocCalls;
+
+  // Collect all allocation calls to avoid iterator invalidation.
+  for (Instruction &I : instructions(F)) {
+    auto *CB = dyn_cast<CallBase>(&I);
+    if (!CB)
+      continue;
+    const Function *Callee = CB->getCalledFunction();
+    if (!Callee)
+      continue;
+    // Ignore nobuiltin of the CallBase, so that we can cover nobuiltin libcalls
+    // if requested via isInstrumentableLibFunc(). Note that isAllocationFn()
+    // returns false for nobuiltin calls.
+    LibFunc Func;
+    if (TLI.getLibFunc(*Callee, Func)) {
+      if (ignoreInstrumentableLibFunc(Func))
+        continue;
+      if (isInstrumentableLibFunc(Func) || isAllocationFn(CB, &TLI))
+        AllocCalls.emplace_back(CB, Func);
+    } else if (Options.Extended && getAllocTokenHintMetadata(*CB)) {
+      AllocCalls.emplace_back(CB, NotLibFunc);
+    }
+  }
+
+  bool Modified = false;
+
+  if (!AllocCalls.empty()) {
+    for (auto &[CB, Func] : AllocCalls) {
+      replaceAllocationCall(CB, Func, ORE, TLI);
+    }
+    NumAllocations += AllocCalls.size();
+    NumFunctionsInstrumented++;
+    Modified = true;
+  }
+
+  return Modified;
+}
+
+bool AllocToken::isInstrumentableLibFunc(LibFunc Func) const {
+  switch (Func) {
+  case LibFunc_posix_memalign:
+  case LibFunc_size_returning_new:
+  case LibFunc_size_returning_new_hot_cold:
+  case LibFunc_size_returning_new_aligned:
+  case LibFunc_size_returning_new_aligned_hot_cold:
+    return true;
+  case LibFunc_Znwj:
+  case LibFunc_ZnwjRKSt9nothrow_t:
+  case LibFunc_ZnwjSt11align_val_t:
+  case LibFunc_ZnwjSt11align_val_tRKSt9nothrow_t:
+  case LibFunc_Znwm:
+  case LibFunc_Znwm12__hot_cold_t:
+  case LibFunc_ZnwmRKSt9nothrow_t:
+  case LibFunc_ZnwmRKSt9nothrow_t12__hot_cold_t:
+  case LibFunc_ZnwmSt11align_val_t:
+  case LibFunc_ZnwmSt11align_val_t12__hot_cold_t:
+  case LibFunc_ZnwmSt11align_val_tRKSt9nothrow_t:
+  case LibFunc_ZnwmSt11align_val_tRKSt9nothrow_t12__hot_cold_t:
+  case LibFunc_Znaj:
+  case LibFunc_ZnajRKSt9nothrow_t:
+  case LibFunc_ZnajSt11align_val_t:
+  case LibFunc_ZnajSt11align_val_tRKSt9nothrow_t:
+  case LibFunc_Znam:
+  case LibFunc_Znam12__hot_cold_t:
+  case LibFunc_ZnamRKSt9nothrow_t:
+  case LibFunc_ZnamRKSt9nothrow_t12__hot_cold_t:
+  case LibFunc_ZnamSt11align_val_t:
+  case LibFunc_ZnamSt11align_val_t12__hot_cold_t:
+  case LibFunc_ZnamSt11align_val_tRKSt9nothrow_t:
+  case LibFunc_ZnamSt11align_val_tRKSt9nothrow_t12__hot_cold_t:
+    return ClCoverReplaceableNew;
+  default:
+    return false;
+  }
+}
+
+bool AllocToken::ignoreInstrumentableLibFunc(LibFunc Func) const {
+  switch (Func) {
+  case LibFunc_strdup:
+  case LibFunc_dunder_strdup:
+  case LibFunc_strndup:
+  case LibFunc_dunder_strndup:
+    return !ClCoverStrdup;
+  default:
+    return false;
+  }
+}
+
+void AllocToken::replaceAllocationCall(CallBase *CB, LibFunc Func,
+                                       OptimizationRemarkEmitter &ORE,
+                                       const TargetLibraryInfo &TLI) {
+  uint64_t TokenID = getToken(*CB, ORE);
+
+  FunctionCallee TokenAlloc = getTokenAllocFunction(*CB, TokenID, Func);
+  if (!TokenAlloc)
+    return;
+
+  IRBuilder<> IRB(CB);
+
+  // Original args.
+  SmallVector<Value *, 4> NewArgs{CB->args()};
+  if (!Options.FastABI) {
+    // Add token ID.
+    NewArgs.push_back(
+        ConstantInt::get(Type::getInt64Ty(Mod.getContext()), TokenID));
+  }
+  assert(TokenAlloc.getFunctionType()->getNumParams() == NewArgs.size());
+
+  // Preserve invoke vs call semantics for exception handling.
+  CallBase *NewCall;
+  if (auto *II = dyn_cast<InvokeInst>(CB)) {
+    NewCall = IRB.CreateInvoke(TokenAlloc, II->getNormalDest(),
+                               II->getUnwindDest(), NewArgs);
+  } else {
+    NewCall = IRB.CreateCall(TokenAlloc, NewArgs);
+    cast<CallInst>(NewCall)->setTailCall(CB->isTailCall());
+  }
+  NewCall->setCallingConv(CB->getCallingConv());
+  NewCall->copyMetadata(*CB);
+  NewCall->setAttributes(CB->getAttributes());
+
+  // Replace all uses and delete the old call.
+  CB->replaceAllUsesWith(NewCall);
+  CB->eraseFromParent();
+}
+
+FunctionCallee AllocToken::getTokenAllocFunction(const CallBase &CB,
+                                                 uint64_t TokenID,
+                                                 LibFunc OriginalFunc) {
+  std::optional<std::pair<LibFunc, uint64_t>> Key;
+  if (OriginalFunc != NotLibFunc) {
+    Key = std::make_pair(OriginalFunc, Options.FastABI ? TokenID : 0);
+    auto It = TokenAllocFunctions.find(*Key);
+    if (LLVM_LIKELY(It != TokenAllocFunctions.end()))
+      return It->second;
+  }
+
+  const Function *Callee = CB.getCalledFunction();
+  if (!Callee)
+    return FunctionCallee();
+  const FunctionType *OldFTy = Callee->getFunctionType();
+  if (OldFTy->isVarArg())
+    return FunctionCallee();
+  // Copy params, and append token ID type.
+  LLVMContext &C = Mod.getContext();
+  Type *RetTy = OldFTy->getReturnType();
+  SmallVector<Type *, 4> NewParams{OldFTy->params()};
+  std::string TokenAllocName = ClFuncPrefix;
+  if (Options.FastABI) {
+    TokenAllocName += utostr(TokenID) + "_";
+  } else {
+    NewParams.push_back(Type::getInt64Ty(C)); // token ID
+  }
+  FunctionType *NewFTy = FunctionType::get(RetTy, NewParams, false);
+  // Strip all leading underscores - the prefix supplies its own.
+  StringRef No_ = Callee->getName().drop_while([](char C) { return C == '_'; });
+  TokenAllocName += No_;
+  FunctionCallee TokenAlloc = Mod.getOrInsertFunction(TokenAllocName, NewFTy);
+  if (Function *F = dyn_cast<Function>(TokenAlloc.getCallee()))
+    F->copyAttributesFrom(Callee); // preserve attrs
+
+  if (Key.has_value())
+    TokenAllocFunctions[*Key] = TokenAlloc;
+  return TokenAlloc;
+}
+
+} // namespace
+
+AllocTokenPass::AllocTokenPass(AllocTokenOptions Opts)
+    : Options(std::move(Opts)) {}
+
+PreservedAnalyses AllocTokenPass::run(Module &M, ModuleAnalysisManager &MAM) {
+  AllocToken Pass(Options, M, MAM);
+  bool Modified = false;
+
+  for (Function &F : M) {
+    if (LLVM_LIKELY(F.empty()))
+      continue; // declaration
+    Modified |= Pass.instrumentFunction(F);
+  }
+
+  return Modified ? PreservedAnalyses::none() : PreservedAnalyses::all();
+}
diff --git a/llvm/lib/Transforms/Instrumentation/CMakeLists.txt b/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
index 15fd421a41b0f..80576c61fd80c 100644
--- a/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
+++ b/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
@@ -1,5 +1,6 @@
 add_llvm_component_library(LLVMInstrumentation
   AddressSanitizer.cpp
+  AllocToken.cpp
   BoundsChecking.cpp
   CGProfile.cpp
   ControlHeightReduction.cpp
diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index bbd1ed6a3ab2d..5ba6f95f5fae8 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -970,6 +970,7 @@ Function *CodeExtractor::constructFunctionDeclaration(
       case Attribute::SanitizeMemTag:
       case Attribute::SanitizeRealtime:
       case Attribute::SanitizeRealtimeBlocking:
+      case Attribute::SanitizeAllocToken:
       case Attribute::SpeculativeLoadHardening:
       case Attribute::StackProtect:
       case Attribute::StackProtectReq:
diff --git a/llvm/test/Bitcode/attributes.ll b/llvm/test/Bitcode/attributes.ll
index 8c1a76365e1b4..aef7810fe2c3b 100644
--- a/llvm/test/Bitcode/attributes.ll
+++ b/llvm/test/Bitcode/attributes.ll
@@ -516,6 +516,11 @@ define void @f93() sanitize_realtime_blocking {
         ret void;
 }
 
+; CHECK: define void @f_sanitize_alloc_token() #55
+define void @f_sanitize_alloc_token() sanitize_alloc_token {
+        ret void;
+}
+
 ; CHECK: define void @f87() [[FNRETTHUNKEXTERN:#[0-9]+]]
 define void @f87() fn_ret_thunk_extern { ret void }
 
@@ -627,6 +632,7 @@ define void @dead_on_return(ptr dead_on_return %p) {
 ; CHECK: attributes #52 = { nosanitize_bounds }
 ; CHECK: attributes #53 = { sanitize_realtime }
 ; CHECK: attributes #54 = { sanitize_realtime_blocking }
+; CHECK: attributes #55 = { sanitize_alloc_token }
 ; CHECK: attributes [[FNRETTHUNKEXTERN]] = { fn_ret_thunk_extern }
 ; CHECK: attributes [[SKIPPROFILE]] = { skipprofile }
 ; CHECK: attributes [[OPTDEBUG]] = { optdebug }
diff --git a/llvm/test/Bitcode/compatibility.ll b/llvm/test/Bitcode/compatibility.ll
index 0b5ce08c00a23..e21786e5ee330 100644
--- a/llvm/test/Bitcode/compatibility.ll
+++ b/llvm/test/Bitcode/compatibility.ll
@@ -1718,7 +1718,7 @@ exit:
   ; CHECK: select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 2, i8 3>, <2 x i8> <i8 3, i8 2>
 
   call void @f.nobuiltin() builtin
-  ; CHECK: call void @f.nobuiltin() #54
+  ; CHECK: call void @f.nobuiltin() #55
 
   call fastcc noalias ptr @f.noalias() noinline
   ; CHECK: call fastcc noalias ptr @f.noalias() #12
@@ -2151,6 +2151,9 @@ declare void @f.sanitize_realtime() sanitize_realtime
 declare void @f.sanitize_realtime_blocking() sanitize_realtime_blocking
 ; CHECK: declare void @f.sanitize_realtime_blocking() #53
 
+declare void @f.sanitize_alloc_token() sanitize_alloc_token
+; CHECK: declare void @f.sanitize_alloc_token() #54
+
 ; CHECK: declare nofpclass(snan) float @nofpclass_snan(float nofpclass(snan))
 declare nofpclass(snan) float @nofpclass_snan(float nofpclass(snan))
 
@@ -2284,7 +2287,8 @@ define float @nofpclass_callsites(float %arg, { float } %arg1) {
 ; CHECK: attributes #51 = { sanitize_numerical_stability }
 ; CHECK: attributes #52 = { sanitize_realtime }
 ; CHECK: attributes #53 = { sanitize_realtime_blocking }
-; CHECK: attributes #54 = { builtin }
+; CHECK: attributes #54 = { sanitize_alloc_token }
+; CHECK: attributes #55 = { builtin }
 
 ;; Metadata
 
diff --git a/llvm/test/Instrumentation/AllocToken/basic.ll b/llvm/test/Instrumentation/AllocToken/basic.ll
new file mode 100644
index 0000000000000..94f5ef7ac5511
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/basic.ll
@@ -0,0 +1,84 @@
+; RUN: opt < %s -passes=inferattrs,alloc-token -alloc-token-mode=0 -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @malloc(i64)
+declare ptr @calloc(i64, i64)
+declare ptr @realloc(ptr, i64)
+declare ptr @_Znwm(i64)
+declare ptr @_Znam(i64)
+declare void @free(ptr)
+declare void @_ZdlPv(ptr)
+declare i32 @foobar(i64)
+
+; Test basic allocation call rewriting
+; CHECK-LABEL: @test_basic_rewriting
+define ptr @test_basic_rewriting() sanitize_alloc_token {
+entry:
+  ; CHECK: [[PTR1:%[0-9]]] = call ptr @__alloc_token_malloc(i64 64, i64 0)
+  ; CHECK: call ptr @__alloc_token_calloc(i64 8, i64 8, i64 1)
+  ; CHECK: call ptr @__alloc_token_realloc(ptr [[PTR1]], i64 128, i64 2)
+  ; CHECK-NOT: call ptr @malloc(
+  ; CHECK-NOT: call ptr @calloc(
+  ; CHECK-NOT: call ptr @realloc(
+  %ptr1 = call ptr @malloc(i64 64)
+  %ptr2 = call ptr @calloc(i64 8, i64 8)
+  %ptr3 = call ptr @realloc(ptr %ptr1, i64 128)
+  ret ptr %ptr3
+}
+
+; Test C++ operator rewriting
+; CHECK-LABEL: @test_cpp_operators
+define ptr @test_cpp_operators() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_Znwm(i64 32, i64 3)
+  ; CHECK: call ptr @__alloc_token_Znam(i64 64, i64 4)
+  ; CHECK-NOT: call ptr @_Znwm(
+  ; CHECK-NOT: call ptr @_Znam(
+  %ptr1 = call ptr @_Znwm(i64 32)
+  %ptr2 = call ptr @_Znam(i64 64)
+  ret ptr %ptr1
+}
+
+; Functions without sanitize_alloc_token do not get instrumented
+; CHECK-LABEL: @without_attribute
+define ptr @without_attribute() {
+entry:
+  ; CHECK: call ptr @malloc(i64 16)
+  ; CHECK-NOT: call ptr @__alloc_token_malloc
+  %ptr = call ptr @malloc(i64 16)
+  ret ptr %ptr
+}
+
+; Test that free/delete are untouched
+; CHECK-LABEL: @test_free_untouched
+define void @test_free_untouched(ptr %ptr) sanitize_alloc_token {
+entry:
+  ; CHECK: call void @free(ptr %ptr)
+  ; CHECK: call void @_ZdlPv(ptr %ptr)
+  ; CHECK-NOT: call ptr @__alloc_token_
+  call void @free(ptr %ptr)
+  call void @_ZdlPv(ptr %ptr)
+  ret void
+}
+
+; Non-allocation functions are untouched
+; CHECK-LABEL: @no_allocations
+define i32 @no_allocations(i32 %x) sanitize_alloc_token {
+entry:
+  ; CHECK: call i32 @foobar
+  ; CHECK-NOT: call i32 @__alloc_token_
+  %result = call i32 @foobar(i64 42)
+  ret i32 %result
+}
+
+; Test that tail calls are preserved
+; CHECK-LABEL: @test_tail_call_preserved
+define ptr @test_tail_call_preserved() sanitize_alloc_token {
+entry:
+  ; CHECK: tail call ptr @__alloc_token_malloc(i64 42, i64 5)
+  ; CHECK-NOT: tail call ptr @malloc(
+  %result = tail call ptr @malloc(i64 42)
+  ret ptr %result
+}
diff --git a/llvm/test/Instrumentation/AllocToken/extralibfuncs.ll b/llvm/test/Instrumentation/AllocToken/extralibfuncs.ll
new file mode 100644
index 0000000000000..37c6d6463ffd2
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/extralibfuncs.ll
@@ -0,0 +1,32 @@
+; Test for special libfuncs not automatically considered allocation functions.
+;
+; RUN: opt < %s -passes=inferattrs,alloc-token -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare {ptr, i64} @__size_returning_new(i64)
+
+; CHECK-LABEL: @test_extra_libfuncs
+define ptr @test_extra_libfuncs() sanitize_alloc_token {
+entry:
+  ; CHECK: call {{.*}} @__alloc_token_size_returning_new(
+  %srn = call {ptr, i64} @__size_returning_new(i64 10), !alloc_token_hint !0
+  %ptr1  = extractvalue {ptr, i64} %srn, 0
+  ret ptr %ptr1
+}
+
+declare ptr @_Znwm(i64) nobuiltin allocsize(0)
+declare ptr @_Znam(i64) nobuiltin allocsize(0)
+
+; CHECK-LABEL: @test_replaceable_new
+define ptr @test_replaceable_new() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_Znwm(
+  %ptr1 = call ptr @_Znwm(i64 32), !alloc_token_hint !0
+  ; CHECK: call ptr @__alloc_token_Znam(
+  %ptr2 = call ptr @_Znam(i64 64), !alloc_token_hint !0
+  ret ptr %ptr1
+}
+
+!0 = !{!"int"}
diff --git a/llvm/test/Instrumentation/AllocToken/fast.ll b/llvm/test/Instrumentation/AllocToken/fast.ll
new file mode 100644
index 0000000000000..c691cdcdc37c6
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/fast.ll
@@ -0,0 +1,39 @@
+; RUN: opt < %s -passes=inferattrs,alloc-token -alloc-token-mode=0 -alloc-token-fast-abi -alloc-token-max=3 -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @malloc(i64)
+declare ptr @calloc(i64, i64)
+declare ptr @realloc(ptr, i64)
+declare ptr @_Znwm(i64)
+declare ptr @_Znam(i64)
+
+; Test basic allocation call rewriting
+; CHECK-LABEL: @test_basic_rewriting
+define ptr @test_basic_rewriting() sanitize_alloc_token {
+entry:
+  ; CHECK: [[PTR1:%[0-9]]] = call ptr @__alloc_token_0_malloc(i64 64)
+  ; CHECK: call ptr @__alloc_token_1_calloc(i64 8, i64 8)
+  ; CHECK: call ptr @__alloc_token_2_realloc(ptr [[PTR1]], i64 128)
+  ; CHECK-NOT: call ptr @malloc(
+  ; CHECK-NOT: call ptr @calloc(
+  ; CHECK-NOT: call ptr @realloc(
+  %ptr1 = call ptr @malloc(i64 64)
+  %ptr2 = call ptr @calloc(i64 8, i64 8)
+  %ptr3 = call ptr @realloc(ptr %ptr1, i64 128)
+  ret ptr %ptr3
+}
+
+; Test C++ operator rewriting
+; CHECK-LABEL: @test_cpp_operators
+define ptr @test_cpp_operators() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_0_Znwm(i64 32)
+  ; CHECK: call ptr @__alloc_token_1_Znam(i64 64)
+  ; CHECK-NOT: call ptr @_Znwm(
+  ; CHECK-NOT: call ptr @_Znam(
+  %ptr1 = call ptr @_Znwm(i64 32)
+  %ptr2 = call ptr @_Znam(i64 64)
+  ret ptr %ptr1
+}
diff --git a/llvm/test/Instrumentation/AllocToken/ignore.ll b/llvm/test/Instrumentation/AllocToken/ignore.ll
new file mode 100644
index 0000000000000..65921685d70a0
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/ignore.ll
@@ -0,0 +1,30 @@
+; Test for all allocation functions that should be ignored by default.
+;
+; RUN: opt < %s -passes=inferattrs,alloc-token -S | FileCheck %s --check-prefixes=CHECK,DEFAULT
+; RUN: opt < %s -passes=inferattrs,alloc-token -alloc-token-cover-strdup -S | FileCheck %s --check-prefixes=CHECK,COVER
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @strdup(ptr)
+declare ptr @__strdup(ptr)
+declare ptr @strndup(ptr, i64)
+declare ptr @__strndup(ptr, i64)
+
+; CHECK-LABEL: @test_ignorable_allocation_functions
+define ptr @test_ignorable_allocation_functions(ptr %ptr) sanitize_alloc_token {
+entry:
+  ; COVER:   call ptr @__alloc_token_strdup(
+  ; DEFAULT: call ptr @strdup(
+  %ptr1 = call ptr @strdup(ptr %ptr)
+  ; COVER:   call ptr @__alloc_token_strdup(
+  ; DEFAULT: call ptr @__strdup(
+  %ptr2 = call ptr @__strdup(ptr %ptr)
+  ; COVER:   call ptr @__alloc_token_strndup(
+  ; DEFAULT: call ptr @strndup(
+  %ptr3 = call ptr @strndup(ptr %ptr, i64 42)
+  ; COVER:   call ptr @__alloc_token_strndup(
+  ; DEFAULT: call ptr @__strndup(
+  %ptr4 = call ptr @__strndup(ptr %ptr, i64 42)
+  ret ptr %ptr1
+}
diff --git a/llvm/test/Instrumentation/AllocToken/invoke.ll b/llvm/test/Instrumentation/AllocToken/invoke.ll
new file mode 100644
index 0000000000000..243462a54968f
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/invoke.ll
@@ -0,0 +1,86 @@
+; RUN: opt < %s -passes=inferattrs,alloc-token -alloc-token-mode=0 -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; CHECK-LABEL: @test_invoke_malloc
+define ptr @test_invoke_malloc() sanitize_alloc_token personality ptr @__gxx_personality_v0 {
+entry:
+  ; CHECK: invoke ptr @__alloc_token_malloc(i64 64, i64 0)
+  ; CHECK-NEXT: to label %normal unwind label %cleanup
+  ; CHECK-NOT: call ptr @__alloc_token_malloc
+  ; CHECK-NOT: call ptr @malloc
+  %ptr = invoke ptr @malloc(i64 64) to label %normal unwind label %cleanup
+
+normal:
+  ret ptr %ptr
+
+cleanup:
+  %lp = landingpad { ptr, i32 } cleanup
+  ret ptr null
+}
+
+; CHECK-LABEL: @test_invoke_operator_new
+define ptr @test_invoke_operator_new() sanitize_alloc_token personality ptr @__gxx_personality_v0 {
+entry:
+  ; CHECK: invoke ptr @__alloc_token_Znwm(i64 32, i64 1)
+  ; CHECK-NEXT: to label %normal unwind label %cleanup
+  ; CHECK-NOT: call ptr @__alloc_token_Znwm
+  ; CHECK-NOT: call ptr @_Znwm
+  %ptr = invoke ptr @_Znwm(i64 32) to label %normal unwind label %cleanup
+
+normal:
+  ret ptr %ptr
+
+cleanup:
+  %lp = landingpad { ptr, i32 } cleanup
+  ret ptr null
+}
+
+; Test complex exception flow with multiple invoke allocations
+; CHECK-LABEL: @test_complex_invoke_flow
+define ptr @test_complex_invoke_flow() sanitize_alloc_token personality ptr @__gxx_personality_v0 {
+entry:
+  ; CHECK: invoke ptr @__alloc_token_malloc(i64 16, i64 2)
+  ; CHECK-NEXT: to label %first_ok unwind label %cleanup1
+  %ptr1 = invoke ptr @malloc(i64 16) to label %first_ok unwind label %cleanup1
+
+first_ok:
+  ; CHECK: invoke ptr @__alloc_token_Znwm(i64 32, i64 3)
+  ; CHECK-NEXT: to label %second_ok unwind label %cleanup2
+  %ptr2 = invoke ptr @_Znwm(i64 32) to label %second_ok unwind label %cleanup2
+
+second_ok:
+  ret ptr %ptr1
+
+cleanup1:
+  %lp1 = landingpad { ptr, i32 } cleanup
+  ret ptr null
+
+cleanup2:
+  %lp2 = landingpad { ptr, i32 } cleanup
+  ret ptr null
+}
+
+; Test mixed call/invoke
+; CHECK-LABEL: @test_mixed_call_invoke
+define ptr @test_mixed_call_invoke() sanitize_alloc_token personality ptr @__gxx_personality_v0 {
+entry:
+  ; CHECK: call ptr @__alloc_token_malloc(i64 8, i64 4)
+  %ptr1 = call ptr @malloc(i64 8)
+
+  ; CHECK: invoke ptr @__alloc_token_malloc(i64 16, i64 5)
+  ; CHECK-NEXT: to label %normal unwind label %cleanup
+  %ptr2 = invoke ptr @malloc(i64 16) to label %normal unwind label %cleanup
+
+normal:
+  ret ptr %ptr1
+
+cleanup:
+  %lp = landingpad { ptr, i32 } cleanup
+  ret ptr null
+}
+
+declare ptr @malloc(i64)
+declare ptr @_Znwm(i64)
+declare i32 @__gxx_personality_v0(...)
diff --git a/llvm/test/Instrumentation/AllocToken/nonlibcalls.ll b/llvm/test/Instrumentation/AllocToken/nonlibcalls.ll
new file mode 100644
index 0000000000000..be99e5ef14b16
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/nonlibcalls.ll
@@ -0,0 +1,63 @@
+; RUN: opt < %s -passes=inferattrs,alloc-token -alloc-token-mode=0 -alloc-token-extended -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @malloc(i64)
+declare ptr @custom_malloc(i64)
+declare ptr @kmalloc(i64, i64)
+
+; CHECK-LABEL: @test_libcall
+define ptr @test_libcall() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_malloc(i64 64, i64 0)
+  %ptr1 = call ptr @malloc(i64 64)
+  ret ptr %ptr1
+}
+
+; CHECK-LABEL: @test_libcall_hint
+define ptr @test_libcall_hint() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_malloc(i64 64, i64 1)
+  %ptr1 = call ptr @malloc(i64 64), !alloc_token_hint !0
+  ret ptr %ptr1
+}
+
+; CHECK-LABEL: @test_nonlibcall_nohint
+define ptr @test_nonlibcall_nohint() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @custom_malloc(i64 8)
+  ; CHECK: call ptr @kmalloc(i64 32, i64 0)
+  %ptr1 = call ptr @custom_malloc(i64 8)
+  %ptr2 = call ptr @kmalloc(i64 32, i64 0)
+  ret ptr %ptr1
+}
+
+; CHECK-LABEL: @test_nonlibcall_hint
+define ptr @test_nonlibcall_hint() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_custom_malloc(i64 8, i64 2)
+  ; CHECK: call ptr @__alloc_token_kmalloc(i64 32, i64 0, i64 3)
+  ; CHECK: call ptr @__alloc_token_custom_malloc(i64 64, i64 4)
+  ; CHECK: call ptr @__alloc_token_kmalloc(i64 128, i64 2, i64 5)
+  %ptr1 = call ptr @custom_malloc(i64 8), !alloc_token_hint !0
+  %ptr2 = call ptr @kmalloc(i64 32, i64 0), !alloc_token_hint !0
+  %ptr3 = call ptr @custom_malloc(i64 64), !alloc_token_hint !0
+  %ptr4 = call ptr @kmalloc(i64 128, i64 2), !alloc_token_hint !0
+  ret ptr %ptr1
+}
+
+; Functions without sanitize_alloc_token do not get instrumented
+; CHECK-LABEL: @without_attribute
+define ptr @without_attribute() {
+entry:
+  ; CHECK: call ptr @malloc(i64 64)
+  ; CHECK: call ptr @custom_malloc(i64 8)
+  ; CHECK: call ptr @kmalloc(i64 32, i64 0)
+  %ptr1 = call ptr @malloc(i64 64), !alloc_token_hint !0
+  %ptr2 = call ptr @custom_malloc(i64 8), !alloc_token_hint !0
+  %ptr3 = call ptr @kmalloc(i64 32, i64 0), !alloc_token_hint !0
+  ret ptr %ptr1
+}
+
+!0 = !{!"int"}
diff --git a/llvm/test/Instrumentation/AllocToken/remark.ll b/llvm/test/Instrumentation/AllocToken/remark.ll
new file mode 100644
index 0000000000000..d5ecfc41bace5
--- /dev/null
+++ b/llvm/test/Instrumentation/AllocToken/remark.ll
@@ -0,0 +1,27 @@
+; RUN: opt < %s -passes=inferattrs,alloc-token -pass-remarks=alloc-token -S 2>&1 | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @malloc(i64)
+
+; CHECK-NOT: remark: <unknown>:0:0: Call to 'malloc' in 'test_has_metadata' without source-level type token
+; CHECK: remark: <unknown>:0:0: Call to 'malloc' in 'test_no_metadata' without source-level type token
+
+; CHECK-LABEL: @test_has_metadata
+define ptr @test_has_metadata() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_malloc(
+  %ptr1 = call ptr @malloc(i64 64), !alloc_token_hint !0
+  ret ptr %ptr1
+}
+
+; CHECK-LABEL: @test_no_metadata
+define ptr @test_no_metadata() sanitize_alloc_token {
+entry:
+  ; CHECK: call ptr @__alloc_token_malloc(
+  %ptr1 = call ptr @malloc(i64 32)
+  ret ptr %ptr1
+}
+
+!0 = !{!"int"}
diff --git a/llvm/test/Transforms/Inline/attributes.ll b/llvm/test/Transforms/Inline/attributes.ll
index 42b1a3a29aec4..55ab430f201d6 100644
--- a/llvm/test/Transforms/Inline/attributes.ll
+++ b/llvm/test/Transforms/Inline/attributes.ll
@@ -26,6 +26,10 @@ define i32 @sanitize_memtag_callee(i32 %i) sanitize_memtag {
   ret i32 %i
 }
 
+define i32 @sanitize_alloc_token_callee(i32 %i) sanitize_alloc_token {
+  ret i32 %i
+}
+
 define i32 @safestack_callee(i32 %i) safestack {
   ret i32 %i
 }
@@ -58,6 +62,10 @@ define i32 @alwaysinline_sanitize_memtag_callee(i32 %i) alwaysinline sanitize_me
   ret i32 %i
 }
 
+define i32 @alwaysinline_sanitize_alloc_token_callee(i32 %i) alwaysinline sanitize_alloc_token {
+  ret i32 %i
+}
+
 define i32 @alwaysinline_safestack_callee(i32 %i) alwaysinline safestack {
   ret i32 %i
 }
@@ -184,6 +192,39 @@ define i32 @test_sanitize_memtag(i32 %arg) sanitize_memtag {
 ; CHECK-NEXT: ret i32
 }
 
+; ---------------------------------------------------------------------------- ;
+
+; Can inline sanitize_alloc_token functions into a noattr function. The
+; attribute is *not* viral, otherwise it may break code.
+define i32 @test_no_sanitize_alloc_token(i32 %arg) {
+; CHECK-LABEL: @test_no_sanitize_alloc_token(
+; CHECK-SAME: ) {
+; CHECK-NOT: call
+; CHECK: ret i32
+entry:
+  %x1 = call i32 @noattr_callee(i32 %arg)
+  %x2 = call i32 @sanitize_alloc_token_callee(i32 %x1)
+  %x3 = call i32 @alwaysinline_callee(i32 %x2)
+  %x4 = call i32 @alwaysinline_sanitize_alloc_token_callee(i32 %x3)
+  ret i32 %x4
+}
+
+; Can inline noattr functions into a sanitize_alloc_token function. If
+; inlinable noattr functions cannot be instrumented, they should be marked with
+; explicit noinline.
+define i32 @test_sanitize_alloc_token(i32 %arg) sanitize_alloc_token {
+; CHECK-LABEL: @test_sanitize_alloc_token(
+; CHECK-SAME: ) [[SANITIZE_ALLOC_TOKEN:.*]] {
+; CHECK-NOT: call
+; CHECK: ret i32
+entry:
+  %x1 = call i32 @noattr_callee(i32 %arg)
+  %x2 = call i32 @sanitize_alloc_token_callee(i32 %x1)
+  %x3 = call i32 @alwaysinline_callee(i32 %x2)
+  %x4 = call i32 @alwaysinline_sanitize_alloc_token_callee(i32 %x3)
+  ret i32 %x4
+}
+
 define i32 @test_safestack(i32 %arg) safestack {
   %x1 = call i32 @noattr_callee(i32 %arg)
   %x2 = call i32 @safestack_callee(i32 %x1)
@@ -639,6 +680,7 @@ define i32 @loader_replaceable_caller() {
   ret i32 %1
 }
 
+; CHECK: attributes [[SANITIZE_ALLOC_TOKEN]] = { sanitize_alloc_token }
 ; CHECK: attributes [[SLH]] = { speculative_load_hardening }
 ; CHECK: attributes [[FPMAD_FALSE]] = { "less-precise-fpmad"="false" }
 ; CHECK: attributes [[FPMAD_TRUE]] = { "less-precise-fpmad"="true" }
diff --git a/llvm/utils/emacs/llvm-mode.el b/llvm/utils/emacs/llvm-mode.el
index 660d0718f098c..240c13319f634 100644
--- a/llvm/utils/emacs/llvm-mode.el
+++ b/llvm/utils/emacs/llvm-mode.el
@@ -34,7 +34,7 @@
          "inaccessiblemem_or_argmemonly" "inalloca" "inlinehint" "jumptable" "minsize" "mustprogress" "naked" "nobuiltin" "nonnull" "nocapture"
          "nocallback" "nocf_check" "noduplicate" "noext" "nofree" "noimplicitfloat" "noinline" "nomerge" "nonlazybind" "noprofile" "noredzone" "noreturn"
          "norecurse" "nosync" "noundef" "nounwind" "nosanitize_bounds" "nosanitize_coverage" "null_pointer_is_valid" "optdebug" "optforfuzzing" "optnone" "optsize" "preallocated" "readnone" "readonly" "returned" "returns_twice"
-         "shadowcallstack" "signext" "speculatable" "speculative_load_hardening" "ssp" "sspreq" "sspstrong" "safestack" "sanitize_address" "sanitize_hwaddress" "sanitize_memtag"
+         "shadowcallstack" "signext" "speculatable" "speculative_load_hardening" "ssp" "sspreq" "sspstrong" "safestack" "sanitize_address" "sanitize_alloc_token" "sanitize_hwaddress" "sanitize_memtag"
          "sanitize_thread" "sanitize_memory" "strictfp" "swifterror" "uwtable" "vscale_range" "willreturn" "writeonly" "zeroext") 'symbols) . font-lock-constant-face)
    ;; Variables
    '("%[-a-zA-Z$._][-a-zA-Z$._0-9]*" . font-lock-variable-name-face)
diff --git a/llvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/BUILD.gn b/llvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/BUILD.gn
index a8eb834c1da23..2c6204e758559 100644
--- a/llvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/BUILD.gn
@@ -11,6 +11,7 @@ static_library("Instrumentation") {
   ]
   sources = [
     "AddressSanitizer.cpp",
+    "AllocToken.cpp",
     "BlockCoverageInference.cpp",
     "BoundsChecking.cpp",
     "CGProfile.cpp",
diff --git a/llvm/utils/llvm.grm b/llvm/utils/llvm.grm
index 411323178bde1..dddfe3c301b65 100644
--- a/llvm/utils/llvm.grm
+++ b/llvm/utils/llvm.grm
@@ -173,6 +173,7 @@ FuncAttr      ::= noreturn
  | returns_twice
  | nonlazybind
  | sanitize_address
+ | sanitize_alloc_token
  | sanitize_thread
  | sanitize_memory
  | mustprogress
diff --git a/llvm/utils/vim/syntax/llvm.vim b/llvm/utils/vim/syntax/llvm.vim
index e3b8ff8629559..e048caa20406a 100644
--- a/llvm/utils/vim/syntax/llvm.vim
+++ b/llvm/utils/vim/syntax/llvm.vim
@@ -163,6 +163,7 @@ syn keyword llvmKeyword
       \ returns_twice
       \ safestack
       \ sanitize_address
+      \ sanitize_alloc_token
       \ sanitize_hwaddress
       \ sanitize_memory
       \ sanitize_memtag
diff --git a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
index b64482336f404..1faaf6b26f301 100644
--- a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
+++ b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
@@ -258,6 +258,7 @@ patterns:
             \\breturns_twice\\b|\
             \\bsafestack\\b|\
             \\bsanitize_address\\b|\
+            \\bsanitize_alloc_token\\b|\
             \\bsanitize_hwaddress\\b|\
             \\bsanitize_memory\\b|\
             \\bsanitize_memtag\\b|\
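
Reviewer note (not part of the patch): the CHECK lines in the AllocToken tests above suggest a simple naming and numbering scheme for the tests run with `-alloc-token-mode=0`. The sketch below only models what those expectations show; it is not taken from `AllocToken.cpp`. The helper names (`instrumented_name`, `increment_tokens`) are hypothetical, and both the stripping of leading underscores and the wrap-around at `-alloc-token-max` are inferences from the test output rather than documented behaviour.

```python
from typing import List, Optional, Tuple


def instrumented_name(callee: str, token_id: int, fast_abi: bool = False) -> str:
    """Model the rewritten callee name seen in the CHECK lines.

    Assumption: leading underscores are dropped before the prefix is added,
    since the tests map `_Znwm` to `__alloc_token_Znwm` and
    `__size_returning_new` to `__alloc_token_size_returning_new`.
    """
    base = callee.lstrip("_")
    if fast_abi:
        # Fast ABI: the token ID is encoded in the function name.
        return f"__alloc_token_{token_id}_{base}"
    # Default ABI: the token ID is passed as a trailing argument instead.
    return f"__alloc_token_{base}"


def increment_tokens(
    callees: List[str],
    max_tokens: Optional[int] = None,
    fast_abi: bool = False,
) -> List[Tuple[str, Optional[int]]]:
    """Assign sequential token IDs to instrumented call sites, in order.

    Assumption: with a maximum set, the counter wraps modulo that maximum,
    which is consistent with fast.ll (max=3 gives 0, 1, 2, 0, 1).
    Returns (rewritten name, extra argument or None for the fast ABI).
    """
    result = []
    for counter, callee in enumerate(callees):
        token = counter if max_tokens is None else counter % max_tokens
        name = instrumented_name(callee, token, fast_abi)
        result.append((name, None if fast_abi else token))
    return result
```

For example, feeding the five instrumented call sites of fast.ll (`malloc`, `calloc`, `realloc`, `_Znwm`, `_Znam`) with `max_tokens=3` and `fast_abi=True` reproduces the names in that test's CHECK lines, and the same list without a maximum yields the trailing IDs 0 through 4 expected in basic.ll.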


