[llvm-branch-commits] [clang] [AllocToken, Clang] Implement TypeHashPointerSplit mode (PR #156840)

Tue Sep 30 14:03:58 PDT 2025

================
@@ -205,6 +231,26 @@ class TypeHashMode : public ModeBase {
   }
 };
 
+/// Implementation for TokenMode::TypeHashPointerSplit.
+class TypeHashPointerSplitMode : public TypeHashMode {
+public:
+  using TypeHashMode::TypeHashMode;
+
+  uint64_t operator()(const CallBase &CB, OptimizationRemarkEmitter &ORE) {
+    if (MaxTokens == 1)
+      return 0;
+    const uint64_t HalfTokens =
+        (MaxTokens ? MaxTokens : std::numeric_limits<uint64_t>::max()) / 2;
+    const auto [N, H] = getHash(CB, ORE);
+    if (!N)
+      return H;                     // fallback token
----------------
melver wrote:

The fallback token goes into the pointer-less bucket by default:
```

--- a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -242,8 +242,12 @@ public:
     const uint64_t HalfTokens =
         (MaxTokens ? MaxTokens : std::numeric_limits<uint64_t>::max()) / 2;
     const auto [N, H] = getHash(CB, ORE);
-    if (!N)
-      return H;                     // fallback token
+    if (!N) {
+      // Pick the fallback token (ClFallbackToken), which by default is 0,
+      // meaning it'll fall into the pointer-less bucket. Override by setting
+      // -alloc-token-fallback if that is the wrong choice.
+      return H;
+    }
```

Advanced users could, for example, set -alloc-token-fallback to a bucket outside the range of normal buckets, but I have no intuition whether that is a good or bad choice when this is used for heap hardening strategies. So I wouldn't want to expose it as a standard "frontend option" either.

For example, we are discussing whether the pointer-containing bucket should be the default for fallbacks. The intuition is that this would "protect pointer-containing allocations" from (more likely) buggy plain data/buffer allocations. At the same time, a plain buffer allocation whose type inference failed would then land in that same fallback bucket, which makes the whole point moot. Ideally we end up with few or no fallback cases.
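
To make the bucket layout discussed above concrete, here is a minimal standalone sketch of a pointer-split token choice. It is not the code from the PR: `InferredType`, `pickToken`, and the `FallbackToken` parameter are hypothetical stand-ins (for the result of `getHash` and for the value set via -alloc-token-fallback), and it assumes that pointer-less types map to the lower half of the token range and pointer-containing types to the upper half.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical stand-in for what type inference yields for an allocation call.
struct InferredType {
  uint64_t Hash;        // assumed: a stable hash of the inferred type
  bool ContainsPointer; // assumed: whether the type contains a pointer
};

// Sketch only: choose a token so that pointer-less types land in
// [0, HalfTokens) and pointer-containing types in [HalfTokens, 2 * HalfTokens).
// FallbackToken models the fallback value (0 by default, per the comment above).
uint64_t pickToken(std::optional<InferredType> T, uint64_t MaxTokens,
                   uint64_t FallbackToken = 0) {
  if (MaxTokens == 1)
    return 0; // only one token available, nothing to split
  // MaxTokens == 0 is treated as "no limit", mirroring the quoted hunk.
  const uint64_t HalfTokens =
      (MaxTokens ? MaxTokens : std::numeric_limits<uint64_t>::max()) / 2;
  if (!T)
    return FallbackToken; // inference failed: use the fallback as-is
  uint64_t Token = T->Hash % HalfTokens; // position within one half
  if (T->ContainsPointer)
    Token += HalfTokens; // shift into the upper (pointer-containing) half
  return Token;
}
```

Under these assumptions, a failed inference with the default fallback of 0 shares token 0 with pointer-less types, while a fallback such as 100 with `MaxTokens == 8` would sit outside both halves.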

https://github.com/llvm/llvm-project/pull/156840