[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_alloc_token_infer() and llvm.alloc.token.id (PR #156842)

Tue Sep 9 06:03:26 PDT 2025

llvmbot wrote:




@llvm/pr-subscribers-llvm-ir

Author: Marco Elver (melver)

<details>
<summary>Changes</summary>

Implement `__builtin_alloc_token_infer(<malloc-args>, ...)` to allow
compile-time querying of the token ID, where the builtin arguments
mirror those normally passed to any allocation function. The argument
expressions are unevaluated operands. For type-based token modes, the
same type inference logic is used as for untyped allocation calls.
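
"Unevaluated operand" here means the same thing it does for `sizeof`: only the form and type of the expression are inspected, and no code is emitted for it. The sketch below uses plain `sizeof` as an analogy rather than the builtin itself, so it compiles with any C compiler; the names `calls`, `side_effect`, and `query_without_evaluating` are hypothetical.

```c
#include <stddef.h>

static int calls; /* counts how often side_effect() actually runs */

static int side_effect(void) {
    ++calls;
    return 1;
}

/* sizeof does not evaluate its (non-VLA) operand, so side_effect() is never
 * called here; only the type of the call expression (int) is used. */
static size_t query_without_evaluating(void) {
    return sizeof(int) * sizeof(side_effect());
}
```

After calling `query_without_evaluating()`, `calls` is still zero. The builtin's arguments are described as behaving the same way, which is what the `test_builtin_unevaluated` case in the patch checks.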

For example, the token ID that would be passed (with `-fsanitize=alloc-token`) to:

	some_malloc(sizeof(Type), ...)

is equivalent to the token ID returned by

	__builtin_alloc_token_infer(sizeof(Type), ...)

The builtin provides a mechanism to pass or compare token IDs in code
that needs to be explicitly allocation-token-aware (such as inside an
allocator, or through wrapper macros). The builtin is always available,
independent of `-fsanitize=alloc-token`.
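
A minimal sketch of the wrapper-macro use case follows. The names `MY_TOKEN_INFER`, `my_alloc`, `my_alloc_impl`, `NUM_PARTITIONS`, and `last_partition` are hypothetical, and the fallback branch (token 0) is an assumption added only so the example still compiles with a compiler that lacks the builtin.

```c
#include <stdint.h>
#include <stdlib.h>

/* Use the builtin only when the compiler provides it. */
#ifdef __has_builtin
#  if __has_builtin(__builtin_alloc_token_infer)
#    define MY_TOKEN_INFER(...) __builtin_alloc_token_infer(__VA_ARGS__)
#  endif
#endif
#ifndef MY_TOKEN_INFER
/* Fallback (assumption): drop the arguments, so they stay unevaluated,
 * and degrade to token 0. */
#  define MY_TOKEN_INFER(...) ((uint64_t)0)
#endif

enum { NUM_PARTITIONS = 4 };

/* Records the partition chosen by the most recent allocation. */
static uint64_t last_partition;

/* Hypothetical token-aware entry point: a real allocator would select a
 * heap partition from the token; this sketch only records the choice. */
static void *my_alloc_impl(size_t size, uint64_t token) {
    last_partition = token % NUM_PARTITIONS;
    return malloc(size);
}

/* The wrapper forwards its arguments twice: once evaluated (the call)
 * and once unevaluated (the token query). */
#define my_alloc(...) \
    my_alloc_impl(__VA_ARGS__, (uint64_t)MY_TOKEN_INFER(__VA_ARGS__))

struct point { int x, y; };
```

A call such as `struct point *p = my_alloc(sizeof(*p));` then passes the inferred token alongside the size, without any compiler rewriting of the allocation call itself.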

The implementation bridges the frontend and middle-end via a new
intrinsic, `llvm.alloc.token.id`.

- In Clang, reuse the existing `EmitAllocTokenHint` logic to construct
  an `!alloc_token_hint` metadata node. This node is then passed as a
  `metadata` argument to the `llvm.alloc.token.id` intrinsic.

- The `AllocToken` pass is taught to recognize and lower this intrinsic.
  It extracts the metadata from the intrinsic's argument and feeds it
  into the same token-generation logic used for instrumenting allocation
  calls. The intrinsic is then replaced with the resulting constant
  `i64` token ID.
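
To make the lowering step concrete, here is a sketch of how a `!{<type-name>, <contains-pointer>}` hint could be turned into a bounded token ID. The actual token-generation logic is in the truncated part of `AllocToken.cpp` and is not shown in this message, so everything below is an assumption: `token_hint`, `token_for_hint`, the FNV-1a hash, and the pointer-split scheme are hypothetical stand-ins, chosen only because they reproduce the `CHECK-LOWER` values in `alloc-token-builtin.cpp` under `-falloc-token-max=2`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the !{<type-name>, <contains-pointer>} metadata node.
 * A NULL type_name models the empty node emitted when inference fails. */
struct token_hint {
    const char *type_name;
    bool contains_ptr;
};

/* FNV-1a, used here only as a placeholder hash (assumption). */
static uint64_t fnv1a(const char *s) {
    uint64_t h = 14695981039346656037ULL;
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical pointer-split scheme: pointer-free types hash into the
 * lower half of [0, max_tokens), pointer-containing types into the upper
 * half. Requires max_tokens >= 2. */
static uint64_t token_for_hint(const struct token_hint *hint,
                               uint64_t max_tokens) {
    if (hint == NULL || hint->type_name == NULL)
        return 0; /* fallback when no type could be inferred */
    uint64_t half = max_tokens / 2;
    uint64_t id = fnv1a(hint->type_name) % half;
    return hint->contains_ptr ? half + id : id;
}
```

With `max_tokens == 2` each half holds a single ID, so every pointer-free type maps to 0 and every pointer-containing type maps to 1, independent of the hash. That matches the test expectations for `NoPtr`, `WithPtr`, and the unknown case.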

A more concrete demonstration of the use of `__builtin_alloc_token_infer` is
enabling type-aware slab allocations in the Linux kernel:
https://lore.kernel.org/all/20250825154505.1558444-1-elver@google.com/
Notably, any kind of allocation-call rewriting is a poor fit for the
Linux kernel's kmalloc-family functions, which are macros that wrap
(multiple) layers of inline and non-inline wrapper functions. Given the
Linux kernel defines its own allocation APIs, the more explicit builtin
gives the right level of control over where the type inference happens
and where the resulting token is passed.

---

This change is part of the following series:
  1. https://github.com/llvm/llvm-project/pull/156838
  2. https://github.com/llvm/llvm-project/pull/156839
  3. https://github.com/llvm/llvm-project/pull/156840
  4. https://github.com/llvm/llvm-project/pull/156841
  5. https://github.com/llvm/llvm-project/pull/156842

---

Patch is 23.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156842.diff


14 Files Affected:

- (modified) clang/docs/AllocToken.rst (+33-10) 
- (modified) clang/docs/ReleaseNotes.rst (+4-1) 
- (modified) clang/include/clang/Basic/Builtins.td (+6) 
- (modified) clang/include/clang/Sema/Sema.h (+3) 
- (modified) clang/lib/CodeGen/BackendUtil.cpp (+17-10) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+8) 
- (modified) clang/lib/CodeGen/CGExpr.cpp (+22-12) 
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+6-1) 
- (modified) clang/lib/Sema/SemaChecking.cpp (+22) 
- (modified) clang/test/CodeGen/lto-newpm-pipeline.c (+6-2) 
- (added) clang/test/CodeGenCXX/alloc-token-builtin.cpp (+76) 
- (modified) llvm/include/llvm/IR/Intrinsics.td (+8) 
- (modified) llvm/lib/Transforms/Instrumentation/AllocToken.cpp (+47-8) 
- (added) llvm/test/Instrumentation/AllocToken/intrinsic.ll (+29) 


``````````diff

diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
index 7ad5e5f03d8a0..062181ae8e27d 100644
--- a/clang/docs/AllocToken.rst
+++ b/clang/docs/AllocToken.rst
@@ -49,6 +49,39 @@ change or removal. These may (experimentally) be selected with ``-mllvm
 * *Increment* (mode=0): This mode assigns a simple, incrementally increasing
   token ID to each allocation site.
 
+The following command-line options affect generated token IDs:
+
+* ``-falloc-token-max=<N>``
+    Configures the maximum number of tokens. No max by default (tokens bounded
+    by ``UINT64_MAX``).
+
+Querying Token IDs with ``__builtin_alloc_token_infer``
+=======================================================
+
+For use cases where the token ID must be known at compile time, Clang provides
+a builtin function:
+
+.. code-block:: c
+
+    uint64_t __builtin_alloc_token_infer(<args>, ...);
+
+This builtin returns the token ID inferred from its argument expressions, which
+mirror arguments normally passed to any allocation function. The argument
+expressions are **unevaluated**, so it can be used with expressions that would
+have side effects without any runtime impact.
+
+For example, it can be used as follows:
+
+.. code-block:: c
+
+    struct MyType { ... };
+    void *__partition_alloc(size_t size, uint64_t partition);
+    #define partition_alloc(...) __partition_alloc(__VA_ARGS__, __builtin_alloc_token_infer(__VA_ARGS__))
+
+    void foo(void) {
+        MyType *x = partition_alloc(sizeof(*x));
+    }
+
 Allocation Token Instrumentation
 ================================
 
@@ -70,16 +103,6 @@ example:
     // Instrumented:
     ptr = __alloc_token_malloc(size, token_id);
 
-In addition, it is typically recommended to configure the following:
-
-* ``-falloc-token-max=<N>``
-    Configures the maximum number of tokens. No max by default (tokens bounded
-    by ``UINT64_MAX``).
-
-    .. code-block:: console
-
-        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc
-
 Runtime Interface
 -----------------
 
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 193b356631995..0275dfe7f9764 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -205,7 +205,10 @@ Non-comprehensive list of changes in this release
 
 - Introduce support for allocation tokens to enable allocator-level heap
   organization strategies. A feature to instrument all allocation functions
-  with a token ID can be enabled via the ``-fsanitize=alloc-token`` flag.
+  with a token ID can be enabled via the ``-fsanitize=alloc-token`` flag. A
+  builtin ``__builtin_alloc_token_infer(<args>, ...)`` is provided to allow
+  compile-time querying of allocation token IDs, where the builtin arguments
+  mirror those normally passed to an allocation function.
 
 New Compiler Flags
 ------------------
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index af0e8242f1e0d..163e68b9916dd 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1274,6 +1274,12 @@ def AllocaWithAlignUninitialized : Builtin {
   let Prototype = "void*(size_t, _Constant size_t)";
 }
 
+def AllocTokenInfer : Builtin {
+  let Spellings = ["__builtin_alloc_token_infer"];
+  let Attributes = [NoThrow, Const, Pure, CustomTypeChecking, UnevaluatedArguments];
+  let Prototype = "unsigned long long int(...)";
+}
+
 def CallWithStaticChain : Builtin {
   let Spellings = ["__builtin_call_with_static_chain"];
   let Attributes = [NoThrow, CustomTypeChecking];
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index aa035a1555950..d7ec26a5c57cb 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2946,6 +2946,9 @@ class Sema final : public SemaBase {
   /// than 8.
   bool BuiltinAllocaWithAlign(CallExpr *TheCall);
 
+  /// Handle __builtin_alloc_token_infer.
+  bool BuiltinAllocTokenInfer(CallExpr *TheCall);
+
   /// BuiltinArithmeticFence - Handle __arithmetic_fence.
   bool BuiltinArithmeticFence(CallExpr *TheCall);
 
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 8b297134de4e7..b020dd289fb42 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -794,16 +794,6 @@ static void addSanitizers(const Triple &TargetTriple,
     if (LangOpts.Sanitize.has(SanitizerKind::DataFlow)) {
       MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles));
     }
-
-    if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
-      if (Level == OptimizationLevel::O0) {
-        // The default pass builder only infers libcall function attrs when
-        // optimizing, so we insert it here because we need it for accurate
-        // memory allocation function detection.
-        MPM.addPass(InferFunctionAttrsPass());
-      }
-      MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
-    }
   };
   if (ClSanitizeOnOptimizerEarlyEP) {
     PB.registerOptimizerEarlyEPCallback(
@@ -846,6 +836,22 @@ static void addSanitizers(const Triple &TargetTriple,
   }
 }
 
+static void addAllocTokenPass(const Triple &TargetTriple,
+                              const CodeGenOptions &CodeGenOpts,
+                              const LangOptions &LangOpts, PassBuilder &PB) {
+  PB.registerOptimizerLastEPCallback(
+      [&](ModulePassManager &MPM, OptimizationLevel Level, ThinOrFullLTOPhase) {
+        if (Level == OptimizationLevel::O0 &&
+            LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
+          // The default pass builder only infers libcall function attrs when
+          // optimizing, so we insert it here because we need it for accurate
+          // memory allocation function detection with -fsanitize=alloc-token.
+          MPM.addPass(InferFunctionAttrsPass());
+        }
+        MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
+      });
+}
+
 void EmitAssemblyHelper::RunOptimizationPipeline(
     BackendAction Action, std::unique_ptr<raw_pwrite_stream> &OS,
     std::unique_ptr<llvm::ToolOutputFile> &ThinLinkOS, BackendConsumer *BC) {
@@ -1101,6 +1107,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     if (!IsThinLTOPostLink) {
       addSanitizers(TargetTriple, CodeGenOpts, LangOpts, PB);
       addKCFIPass(TargetTriple, LangOpts, PB);
+      addAllocTokenPass(TargetTriple, CodeGenOpts, LangOpts, PB);
     }
 
     if (std::optional<GCOVOptions> Options =
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 172a521e63c17..523233a811875 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -4475,6 +4475,14 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     return RValue::get(AI);
   }
 
+  case Builtin::BI__builtin_alloc_token_infer: {
+    llvm::MDNode *MDN = EmitAllocTokenHint(E);
+    llvm::Value *MDV = MetadataAsValue::get(getLLVMContext(), MDN);
+    llvm::Function *F = CGM.getIntrinsic(llvm::Intrinsic::alloc_token_id);
+    llvm::CallBase *TokenID = Builder.CreateCall(F, MDV);
+    return RValue::get(TokenID);
+  }
+
   case Builtin::BIbzero:
   case Builtin::BI__builtin_bzero: {
     Address Dest = EmitPointerWithAlignment(E->getArg(0));
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index dc428f04e873a..460b44d57df8e 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1273,11 +1273,7 @@ void CodeGenFunction::EmitBoundsCheckImpl(const Expr *E, llvm::Value *Bound,
   EmitCheck(std::make_pair(Check, CheckKind), CheckHandler, StaticData, Index);
 }
 
-void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
-                                         QualType AllocType) {
-  assert(SanOpts.has(SanitizerKind::AllocToken) &&
-         "Only needed with -fsanitize=alloc-token");
-
+llvm::MDNode *CodeGenFunction::EmitAllocTokenHint(QualType AllocType) {
   llvm::MDBuilder MDB(getLLVMContext());
 
   // Get unique type name.
@@ -1340,14 +1336,20 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
   };
   const bool ContainsPtr = TypeContainsPtr(TypeContainsPtr, AllocType);
   if (!ContainsPtr && IncompleteType)
-    return;
+    return nullptr;
   auto *ContainsPtrC = Builder.getInt1(ContainsPtr);
   auto *ContainsPtrMD = MDB.createConstant(ContainsPtrC);
 
   // Format: !{<type-name>, <contains-pointer>}
-  auto *MDN =
-      llvm::MDNode::get(CGM.getLLVMContext(), {TypeNameMD, ContainsPtrMD});
-  CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
+  return llvm::MDNode::get(CGM.getLLVMContext(), {TypeNameMD, ContainsPtrMD});
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+                                         QualType AllocType) {
+  assert(SanOpts.has(SanitizerKind::AllocToken) &&
+         "Only needed with -fsanitize=alloc-token");
+  CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint,
+                  EmitAllocTokenHint(AllocType));
 }
 
 /// Infer type from a simple sizeof expression.
@@ -1423,8 +1425,7 @@ static QualType inferTypeFromCastExpr(const CallExpr *CallE,
   return QualType();
 }
 
-void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
-                                         const CallExpr *E) {
+llvm::MDNode *CodeGenFunction::EmitAllocTokenHint(const CallExpr *E) {
   QualType AllocType;
   // First check arguments.
   for (const Expr *Arg : E->arguments()) {
@@ -1439,7 +1440,16 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
     AllocType = inferTypeFromCastExpr(E, CurCast);
   // Emit if we were able to infer the type.
   if (!AllocType.isNull())
-    EmitAllocTokenHint(CB, AllocType);
+    return EmitAllocTokenHint(AllocType);
+  return nullptr;
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+                                         const CallExpr *E) {
+  assert(SanOpts.has(SanitizerKind::AllocToken) &&
+         "Only needed with -fsanitize=alloc-token");
+  if (llvm::MDNode *MDN = EmitAllocTokenHint(E))
+    CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
 }
 
 CodeGenFunction::ComplexPairTy CodeGenFunction::
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 8e89838531d35..d0616e11754b3 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -3352,10 +3352,15 @@ class CodeGenFunction : public CodeGenTypeCache {
   SanitizerAnnotateDebugInfo(ArrayRef<SanitizerKind::SanitizerOrdinal> Ordinals,
                              SanitizerHandler Handler);
 
-  /// Emit additional metadata used by the AllocToken instrumentation.
+  /// Emit metadata used by the AllocToken instrumentation.
+  llvm::MDNode *EmitAllocTokenHint(QualType AllocType);
+  /// Emit and set additional metadata used by the AllocToken instrumentation.
   void EmitAllocTokenHint(llvm::CallBase *CB, QualType AllocType);
   /// Emit additional metadata used by the AllocToken instrumentation,
   /// inferring the type from an allocation call expression.
+  llvm::MDNode *EmitAllocTokenHint(const CallExpr *E);
+  /// Emit and set additional metadata used by the AllocToken instrumentation,
+  /// inferring the type from an allocation call expression.
   void EmitAllocTokenHint(llvm::CallBase *CB, const CallExpr *E);
 
   llvm::Value *GetCountedByFieldExprGEP(const Expr *Base, const FieldDecl *FD,
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 077f4311ed729..70fb41238a60b 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2638,6 +2638,10 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
       builtinAllocaAddrSpace(*this, TheCall);
     }
     break;
+  case Builtin::BI__builtin_alloc_token_infer:
+    if (BuiltinAllocTokenInfer(TheCall))
+      return ExprError();
+    break;
   case Builtin::BI__arithmetic_fence:
     if (BuiltinArithmeticFence(TheCall))
       return ExprError();
@@ -5760,6 +5764,24 @@ bool Sema::BuiltinAllocaWithAlign(CallExpr *TheCall) {
   return false;
 }
 
+bool Sema::BuiltinAllocTokenInfer(CallExpr *TheCall) {
+  if (checkArgCountAtLeast(TheCall, 1))
+    return true;
+
+  for (Expr *Arg : TheCall->arguments()) {
+    // If argument is dependent on a template parameter, we can't resolve now.
+    if (Arg->isTypeDependent() || Arg->isValueDependent())
+      continue;
+    // Reject void types.
+    QualType ArgTy = Arg->IgnoreParenImpCasts()->getType();
+    if (ArgTy->isVoidType())
+      return Diag(Arg->getBeginLoc(), diag::err_param_with_void_type);
+  }
+
+  TheCall->setType(Context.UnsignedLongLongTy);
+  return false;
+}
+
 bool Sema::BuiltinAssumeAligned(CallExpr *TheCall) {
   if (checkArgCountRange(TheCall, 2, 3))
     return true;
diff --git a/clang/test/CodeGen/lto-newpm-pipeline.c b/clang/test/CodeGen/lto-newpm-pipeline.c
index ea9784a76f923..dceaaf136ebfc 100644
--- a/clang/test/CodeGen/lto-newpm-pipeline.c
+++ b/clang/test/CodeGen/lto-newpm-pipeline.c
@@ -32,10 +32,12 @@
 // CHECK-FULL-O0-NEXT: Running pass: AlwaysInlinerPass
 // CHECK-FULL-O0-NEXT: Running analysis: ProfileSummaryAnalysis
 // CHECK-FULL-O0-NEXT: Running pass: CoroConditionalWrapper
+// CHECK-FULL-O0-NEXT: Running pass: AllocTokenPass
+// CHECK-FULL-O0-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
+// CHECK-FULL-O0-NEXT: Running analysis: TargetLibraryAnalysis
 // CHECK-FULL-O0-NEXT: Running pass: CanonicalizeAliasesPass
 // CHECK-FULL-O0-NEXT: Running pass: NameAnonGlobalPass
 // CHECK-FULL-O0-NEXT: Running pass: AnnotationRemarksPass
-// CHECK-FULL-O0-NEXT: Running analysis: TargetLibraryAnalysis
 // CHECK-FULL-O0-NEXT: Running pass: VerifierPass
 // CHECK-FULL-O0-NEXT: Running pass: BitcodeWriterPass
 
@@ -46,10 +48,12 @@
 // CHECK-THIN-O0-NEXT: Running pass: AlwaysInlinerPass
 // CHECK-THIN-O0-NEXT: Running analysis: ProfileSummaryAnalysis
 // CHECK-THIN-O0-NEXT: Running pass: CoroConditionalWrapper
+// CHECK-THIN-O0-NEXT: Running pass: AllocTokenPass
+// CHECK-THIN-O0-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
+// CHECK-THIN-O0-NEXT: Running analysis: TargetLibraryAnalysis
 // CHECK-THIN-O0-NEXT: Running pass: CanonicalizeAliasesPass
 // CHECK-THIN-O0-NEXT: Running pass: NameAnonGlobalPass
 // CHECK-THIN-O0-NEXT: Running pass: AnnotationRemarksPass
-// CHECK-THIN-O0-NEXT: Running analysis: TargetLibraryAnalysis
 // CHECK-THIN-O0-NEXT: Running pass: VerifierPass
 // CHECK-THIN-O0-NEXT: Running pass: ThinLTOBitcodeWriterPass
 
diff --git a/clang/test/CodeGenCXX/alloc-token-builtin.cpp b/clang/test/CodeGenCXX/alloc-token-builtin.cpp
new file mode 100644
index 0000000000000..7a868d8d20b93
--- /dev/null
+++ b/clang/test/CodeGenCXX/alloc-token-builtin.cpp
@@ -0,0 +1,76 @@
+// Test IR generation of the builtin without evaluating the LLVM intrinsic.
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -Werror -std=c++20 -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-CODEGEN
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -Werror -std=c++20 -emit-llvm -falloc-token-max=2  %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-LOWER
+
+extern "C" void *my_malloc(unsigned long, unsigned long);
+
+struct NoPtr {
+  int x;
+  long y;
+};
+
+struct WithPtr {
+  int a;
+  char *buf;
+};
+
+int unevaluated_fn();
+
+// CHECK-LABEL: @_Z16test_builtin_intv(
+unsigned long long test_builtin_int() {
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_INT:[0-9]+]])
+  // CHECK-LOWER: ret i64 0
+  return __builtin_alloc_token_infer(sizeof(1));
+}
+
+// CHECK-LABEL: @_Z16test_builtin_ptrv(
+unsigned long long test_builtin_ptr() {
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_PTR:[0-9]+]])
+  // CHECK-LOWER: ret i64 1
+  return __builtin_alloc_token_infer(sizeof(int *));
+}
+
+// CHECK-LABEL: @_Z25test_builtin_struct_noptrv(
+unsigned long long test_builtin_struct_noptr() {
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_NOPTR:[0-9]+]])
+  // CHECK-LOWER: ret i64 0
+  return __builtin_alloc_token_infer(sizeof(NoPtr));
+}
+
+// CHECK-LABEL: @_Z25test_builtin_struct_w_ptrv(
+unsigned long long test_builtin_struct_w_ptr() {
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_WITHPTR:[0-9]+]])
+  // CHECK-LOWER: ret i64 1
+  return __builtin_alloc_token_infer(sizeof(WithPtr), 123);
+}
+
+// CHECK-LABEL: @_Z24test_builtin_unevaluatedv(
+unsigned long long test_builtin_unevaluated() {
+  // CHECK-NOT: call{{.*}}unevaluated_fn
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_INT:[0-9]+]])
+  // CHECK-LOWER: ret i64 0
+  return __builtin_alloc_token_infer(sizeof(int) * unevaluated_fn());
+}
+
+// CHECK-LABEL: @_Z36test_builtin_unsequenced_unevaluatedi(
+void test_builtin_unsequenced_unevaluated(int x) {
+  // CHECK:     add nsw
+  // CHECK-NOT: add nsw
+  // CHECK-CODEGEN: %[[REG:[0-9]+]] = call i64 @llvm.alloc.token.id(metadata ![[MD_UNKNOWN:[0-9]+]])
+  // CHECK-CODEGEN: call{{.*}}@my_malloc({{.*}}, i64 noundef %[[REG]])
+  // CHECK-LOWER: call{{.*}}@my_malloc({{.*}}, i64 noundef 0)
+  my_malloc(++x, __builtin_alloc_token_infer(++x));
+}
+
+// CHECK-LABEL: @_Z20test_builtin_unknownv(
+unsigned long long test_builtin_unknown() {
+  // CHECK-CODEGEN: call i64 @llvm.alloc.token.id(metadata ![[MD_UNKNOWN:[0-9]+]])
+  // CHECK-LOWER: ret i64 0
+  return __builtin_alloc_token_infer(4096);
+}
+
+// CHECK-CODEGEN: ![[MD_INT]] = !{!"int", i1 false}
+// CHECK-CODEGEN: ![[MD_PTR]] = !{!"int *", i1 true}
+// CHECK-CODEGEN: ![[MD_NOPTR]] = !{!"NoPtr", i1 false}
+// CHECK-CODEGEN: ![[MD_WITHPTR]] = !{!"WithPtr", i1 true}
+// CHECK-CODEGEN: ![[MD_UNKNOWN]] = !{}
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 8e2e0604cb3af..5b1d5eac70895 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -2852,7 +2852,15 @@ def int_ptrauth_blend :
 def int_ptrauth_sign_generic :
   DefaultAttrsIntrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty], [IntrNoMem]>;
 
+//===----------------- AllocToken Intrinsics ------------------------------===//
+
+// Return the token ID for the given !alloc_token_hint metadata.
+def int_alloc_token_id :
+  DefaultAttrsIntrinsic<[llvm_i64_ty], [llvm_metadata_ty],
+                        [IntrNoMem, NoUndef<RetIndex>]>;
+
 //===----------------------------------------------------------------------===//
+
 //===------- Convergence Intrinsics ---------------------------------------===//
 
 def int_experimental_convergence_entry
diff --git a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
index 74cda227d50a7..3a28705d87523 100644
--- a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -31,6 +31,7 @@
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/PassManager.h"
@@ -149,9 +150,19 @@ STATISTIC(NumAllocations, "Allocations found");
 ///
 /// Expected format is: !{<type-name>, <contains-pointer>}
 MDNode *getAllocTokenHintMetadata(const CallBase &CB) {
-  MDNode *Ret = CB.getMetadata(LLVMContext::MD_alloc_token_hint);
-  if (!Ret)
-    return nullptr;
+  MDNode *Ret = nullptr;
+  if (auto *II = dyn_cast<IntrinsicInst>(&CB);
+      II && II->getIntrinsicID() == Intrinsic::alloc_token_id) {
+    auto *MDV = cast<MetadataAsValue>(II->getArgOperand(0));
+    Ret = cast<MDNode>(MDV->getMetadata());
+    // If the intrinsic has an empty MDNode, type inference failed.
+    if (Ret->getNumOperands() == 0)
+      return nullptr;
+  } else {
+    Ret = CB.getMetadata(LLVMContext::MD_alloc_token_hint);
+    if (!Ret)
+      return nullptr;
+  }
   assert(Ret->getNumOperands() == 2 && "bad !alloc_token_hint");
   assert(isa<MDString>(Ret->getOperand(0)));
   assert(isa<ConstantAsMetadata>(Ret->getOperand(1)));
@@ -313,6 +324,9 @@ class AllocToken {
   FunctionCallee getTokenAllocFunction(const CallBase &CB, uint64_t TokenID,
                                        LibFunc OriginalFunc);
 
+  /// Lower alloc_token_* intrinsics.
+  void replaceIntrinsicInst(IntrinsicInst *II, OptimizationRemarkEmitter &ORE);
+
   /// Return the token ID from metadata in the call.
   uint64_t getToken(const Ca...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/156842