[compiler-rt] [ctxprof] Auto root detection: trie for stack samples (PR #133106)

Mircea Trofin via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 27 15:07:15 PDT 2025


================
@@ -0,0 +1,90 @@
+//===- RootAutodetector.cpp - detect contextual profiling roots -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "RootAutoDetector.h"
+#include "CtxInstrProfiling.h"
+
+#include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_mutex.h"
+#include "sanitizer_common/sanitizer_placement_new.h"
+#include <assert.h>
+#include <dlfcn.h>
+#include <pthread.h>
+
+using namespace __ctx_profile;
+
+uptr PerThreadCallsiteTrie::getFctStartAddr(uptr CallsiteAddress) const {
+  // this requires --linkopt=-Wl,--export-dynamic
+  Dl_info Info;
+  if (dladdr(reinterpret_cast<const void *>(CallsiteAddress), &Info) != 0)
+    return reinterpret_cast<uptr>(Info.dli_saddr);
+  return 0;
+}
+
+void PerThreadCallsiteTrie::insertStack(const StackTrace &ST) {
+  auto *Current = &TheTrie;
+  // the stack is backwards - the first callsite is at the top.
+  for (int I = ST.size - 1; I >= 0; --I) {
+    auto ChildAddr = ST.trace[I];
+    auto [Iter, _] = Current->Children.insert({ChildAddr, Trie(ChildAddr)});
+    ++Current->Count;
+    Current = &Iter->second;
+  }
+}
+
+DenseMap<uptr, uint64_t> PerThreadCallsiteTrie::determineRoots() const {
+  // Assuming a message pump design, roots are those functions called by the
+  // message pump. The message pump is an infinite loop (for all practical
+  // considerations) fetching data from a queue. The root functions return -
+  // otherwise the message pump doesn't work. This function detects roots as the
+  // first place in the trie (starting from the root) where a function calls 2
+  // or more functions.
+  //
+  // We start with a callsite trie - the nodes are callsites. Different child
+  // nodes may actually correspond to the same function.
+  //
+  // For example: using function(callsite)
+  // f1(csf1_1) -> f2(csf2_1) -> f3
+  //            -> f2(csf2_2) -> f4
+  //
+  // would be represented in our trie as:
+  // csf1_1 -> csf2_1 -> f3
+  //        -> csf2_2 -> f4
+  //
+  // While we can assert the control flow returns to f2, we don't know if it
+  // ever returns to f1. f2 could be the message pump.
+  //
+  // We need to convert our callsite tree into a function tree. We can also,
+  // more economically, just see how many distinct functions there are at a
+  // certain depth. When that count is greater than 1, we got to potential roots
+  // and everything above should be considered as non-roots.
+  DenseMap<uptr, uint64_t> Result;
+  Set<const Trie *> Worklist;
----------------
mtrofin wrote:

It should only matter for the unit test, and DenseMap (which is what this Set is) should be deterministic from run to run.

https://github.com/llvm/llvm-project/pull/133106
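
For readers following the quoted hunk, the approach described in the determineRoots comment can be sketched as below. This is a minimal illustration under stated assumptions, not the code in the PR: it uses std::map and std::vector instead of the sanitizer containers, the names TrieNode and FunctionStart are hypothetical, FunctionStart stands in for the dladdr-based getFctStartAddr, and the counting convention (each node counts the samples that enter it) is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Addr = std::uintptr_t;

// Hypothetical node type: one node per callsite address.
// Count is the number of samples that entered this node (sketch convention).
struct TrieNode {
  Addr Callsite = 0;
  std::uint64_t Count = 0;
  std::map<Addr, TrieNode> Children;
};

// Insert one stack sample. As in the quoted patch, the sample is assumed to
// list the innermost frame first, so it is walked from the back.
void insertStack(TrieNode &Root, const std::vector<Addr> &Stack) {
  assert(!Stack.empty() && "a sample needs at least one frame");
  TrieNode *Current = &Root;
  for (auto It = Stack.rbegin(); It != Stack.rend(); ++It) {
    auto Pos = Current->Children.try_emplace(*It).first;
    Current = &Pos->second;
    Current->Callsite = *It;
    ++Current->Count;
  }
}

// Walk the trie one depth at a time. At each depth, map every callsite to the
// start address of its enclosing function and count the distinct functions.
// The first depth with more than one distinct function yields the candidate
// roots; everything above it is treated as non-root.
std::map<Addr, std::uint64_t>
determineRoots(const TrieNode &Root,
               const std::function<Addr(Addr)> &FunctionStart) {
  std::vector<const TrieNode *> Level;
  for (const auto &KV : Root.Children)
    Level.push_back(&KV.second);

  while (!Level.empty()) {
    std::map<Addr, std::uint64_t> Candidates;
    std::vector<const TrieNode *> Next;
    for (const TrieNode *Node : Level) {
      Candidates[FunctionStart(Node->Callsite)] += Node->Count;
      for (const auto &KV : Node->Children)
        Next.push_back(&KV.second);
    }
    if (Candidates.size() > 1)
      return Candidates;
    Level = std::move(Next);
  }
  return {}; // No depth had more than one distinct function.
}
```

Applied to the f1/f2/f3/f4 example from the comment, the depths holding csf1_1 and then csf2_1/csf2_2 each resolve to a single function (f1, then f2), so the sketch keeps descending and reports f3 and f4 as the candidate roots. Using std::map here also makes the iteration order depend only on the addresses, which is a property of this sketch and not a claim about DenseMap.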

