[compiler-rt] [llvm] [ctx_profile] Pull `ContextNode` in a `.inc` file (PR #91669)

Mircea Trofin via llvm-commits llvm-commits at lists.llvm.org
Thu May 9 15:09:20 PDT 2024


https://github.com/mtrofin created https://github.com/llvm/llvm-project/pull/91669

This pulls out `ContextNode` as we need to use it pretty much as-is to implement a writer. The writer will be implemented on the LLVM side because it takes a dependency on BitStreamWriter.
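
As a rough illustration of the layout a writer would have to understand, here is a minimal sketch of a node with a fixed header followed inline by its counters and subcontext pointers. `MiniNode` and `NodeBuffer` are hypothetical names made up for this example; they only mirror the shape of `ContextNode` in the patch below and are not the runtime's actual allocation path, which uses an Arena.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical stand-in for ContextNode, trimmed to the layout-related parts:
// [[declared fields][counters vector][vector of ptrs to subcontexts]]
class MiniNode final {
  const uint64_t Guid;
  MiniNode *const Next;
  const uint32_t NrCounters;
  const uint32_t NrCallsites;

public:
  MiniNode(uint64_t Guid, uint32_t NrCounters, uint32_t NrCallsites,
           MiniNode *Next = nullptr)
      : Guid(Guid), Next(Next), NrCounters(NrCounters),
        NrCallsites(NrCallsites) {}

  // Size of the header plus the two inline vectors.
  static size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
    return sizeof(MiniNode) + sizeof(uint64_t) * NrCounters +
           sizeof(MiniNode *) * NrCallsites;
  }

  // The counters start right after the header.
  uint64_t *counters() { return reinterpret_cast<uint64_t *>(this + 1); }

  // The subcontext pointers start right after the last counter.
  MiniNode **subContexts() {
    return reinterpret_cast<MiniNode **>(counters() + NrCounters);
  }

  uint64_t guid() const { return Guid; }
  MiniNode *next() const { return Next; }
  uint32_t counters_size() const { return NrCounters; }
  uint32_t callsites_size() const { return NrCallsites; }
};

// Hypothetical helper (an assumption of this sketch, not part of the patch):
// owns zeroed, 8-byte aligned storage and placement-constructs a node in it.
struct NodeBuffer {
  std::vector<uint64_t> Storage;
  MiniNode *Node;

  NodeBuffer(uint64_t Guid, uint32_t NrCounters, uint32_t NrCallsites)
      : Storage((MiniNode::getAllocSize(NrCounters, NrCallsites) + 7) / 8, 0),
        Node(new (Storage.data()) MiniNode(Guid, NrCounters, NrCallsites)) {
    assert(reinterpret_cast<uintptr_t>(Node) % alignof(MiniNode) == 0);
  }
  NodeBuffer(const NodeBuffer &) = delete;
  NodeBuffer &operator=(const NodeBuffer &) = delete;
};
```

With `NodeBuffer B(42, 2, 3);`, `B.Node->counters()` points at the first byte past the header and `B.Node->subContexts()` points two `uint64_t` slots further on.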

Since we can't reuse a header between compiler-rt and llvm, we use a .inc which is copied on both sides, and test that the 2 copies are identical.
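
The patch performs the identity check with a lit test that runs `diff` on the two files. Purely to illustrate what "identical" means here, the sketch below does the same byte-for-byte comparison in C++. The function name `sameContents` is an assumption of this example and does not appear in the patch.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Returns true only if both files can be opened and hold exactly the same
// bytes. A missing file counts as a mismatch.
bool sameContents(const std::string &PathA, const std::string &PathB) {
  std::ifstream A(PathA, std::ios::binary);
  std::ifstream B(PathB, std::ios::binary);
  if (!A || !B)
    return false;
  std::istreambuf_iterator<char> ItA(A), ItB(B), End;
  // The four-iterator overload of std::equal (C++14) also reports a mismatch
  // when one file is a strict prefix of the other.
  return std::equal(ItA, End, ItB, End);
}
```

Called with the paths of the two `CtxInstrContextNode.inc` copies, a `false` result would correspond to the lit test failing.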

This change also adds the remaining pieces needed for compiler-rt/ctx_profile testing.

From 9b3717458079135078ffcc9cb601b59eb7566891 Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 14:56:59 -0700
Subject: [PATCH] [ctx_profile] Pull `ContextNode` in a `.inc` file

This pulls out `ContextNode` as we need to use it pretty much as-is to
implement a writer. The writer will be implemented on the LLVM side because
it takes a dependency on BitStreamWriter.

Since we can't reuse a header between compiler-rt and llvm, we use a .inc
which is copied on both sides, and test that the 2 copies are identical.

This change also adds the remaining pieces needed for compiler-rt/ctx_profile testing.
---
 compiler-rt/lib/ctx_profile/CMakeLists.txt    |   1 +
 .../lib/ctx_profile/CtxInstrContextNode.inc   | 102 ++++++++++++++++++
 .../lib/ctx_profile/CtxInstrProfiling.cpp     |  54 +++++-----
 .../lib/ctx_profile/CtxInstrProfiling.h       |  95 +---------------
 compiler-rt/test/ctx_profile/CMakeLists.txt   |  22 ++++
 .../TestCases/check-same-ctx-node.test        |   5 +
 compiler-rt/test/ctx_profile/lit.cfg.py       |  33 ++++++
 .../test/ctx_profile/lit.site.cfg.py.in       |  14 +++
 llvm/lib/ProfileData/CtxInstrContextNode.inc  | 102 ++++++++++++++++++
 9 files changed, 309 insertions(+), 119 deletions(-)
 create mode 100644 compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
 create mode 100644 compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
 create mode 100644 compiler-rt/test/ctx_profile/lit.cfg.py
 create mode 100644 compiler-rt/test/ctx_profile/lit.site.cfg.py.in
 create mode 100644 llvm/lib/ProfileData/CtxInstrContextNode.inc

diff --git a/compiler-rt/lib/ctx_profile/CMakeLists.txt b/compiler-rt/lib/ctx_profile/CMakeLists.txt
index 80e71acc38f8a..e2bf1776cd76c 100644
--- a/compiler-rt/lib/ctx_profile/CMakeLists.txt
+++ b/compiler-rt/lib/ctx_profile/CMakeLists.txt
@@ -6,6 +6,7 @@ set(CTX_PROFILE_SOURCES
 
 set(CTX_PROFILE_HEADERS
   CtxInstrProfiling.h
+  CtxInstrContextNode.inc
   )
 
 include_directories(..)
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
new file mode 100644
index 0000000000000..06be172fe1503
--- /dev/null
+++ b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
@@ -0,0 +1,102 @@
+/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
+|*
+|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+|* See https://llvm.org/LICENSE.txt for license information.
+|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+|*
+\*===----------------------------------------------------------------------===*/
+
+//==============================================================================
+//
+// NOTE!
+// llvm/lib/ProfileData/CtxInstrContextNode.inc and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// must be exact copies of eachother
+//
+//==============================================================================
+
+/// The contextual profile is a directed tree where each node has one parent. A
+/// node (ContextNode) corresponds to a function activation. The root of the
+/// tree is at a function that was marked as entrypoint to the compiler. A node
+/// stores counter values for edges and a vector of subcontexts. These are the
+/// contexts of callees. The index in the subcontext vector corresponds to the
+/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
+/// that index we find a linked list, potentially empty, of ContextNodes. Direct
+/// calls will have 0 or 1 values in the linked list, but indirect callsites may
+/// have more.
+///
+/// The ContextNode has a fixed sized header describing it - the GUID of the
+/// function, the size of the counter and callsite vectors. It is also an
+/// (intrusive) linked list for the purposes of the indirect call case above.
+///
+/// Allocation is expected to happen on an Arena. The allocation lays out inline
+/// the counter and subcontexts vectors. The class offers APIs to correctly
+/// reference the latter.
+///
+/// The layout is as follows:
+///
+/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
+///
+/// See also documentation on the counters and subContexts members below.
+///
+/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
+///   (1) increment counts, and
+///   (2) form a GEP for the position in the subcontext list of a callsite
+/// This means changes to LLVM contextual profile lowering and changes here
+/// must be coupled.
+/// Note: the header content isn't interesting to LLVM (other than its size)
+///
+/// Part of contextual collection is the notion of "scratch contexts". These are
+/// buffers that are "large enough" to allow for memory-safe acceses during
+/// counter increments - meaning the counter increment code in LLVM doesn't need
+/// to be concerned with memory safety. Their subcontexts never get populated,
+/// though. The runtime code here produces and recognizes them.
+
+using GUID = uint64_t;
+
+class ContextNode final {
+  const GUID Guid;
+  ContextNode *const Next;
+  const uint32_t NrCounters;
+  const uint32_t NrCallsites;
+
+public:
+  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
+              ContextNode *Next = nullptr)
+      : Guid(Guid), Next(Next), NrCounters(NrCounters),
+        NrCallsites(NrCallsites) {}
+
+  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
+    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
+           sizeof(ContextNode *) * NrCallsites;
+  }
+
+  // The counters vector starts right after the static header.
+  uint64_t *counters() {
+    ContextNode *addr_after = &(this[1]);
+    return reinterpret_cast<uint64_t *>(addr_after);
+  }
+
+  uint32_t counters_size() const { return NrCounters; }
+  uint32_t callsites_size() const { return NrCallsites; }
+
+  const uint64_t *counters() const {
+    return const_cast<ContextNode *>(this)->counters();
+  }
+
+  // The subcontexts vector starts right after the end of the counters vector.
+  ContextNode **subContexts() {
+    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
+  }
+
+  ContextNode *const *subContexts() const {
+    return const_cast<ContextNode *>(this)->subContexts();
+  }
+
+  GUID guid() const { return Guid; }
+  ContextNode *next() const { return Next; }
+
+  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
+
+  uint64_t entrycount() const { return counters()[0]; }
+};
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
index 68bfe5c1ae614..9a064a1f194e0 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
@@ -90,6 +90,26 @@ bool validate(const ContextRoot *Root) {
   }
   return true;
 }
+
+inline ContextNode *allocContextNode(char *Place, GUID Guid,
+                                     uint32_t NrCounters, uint32_t NrCallsites,
+                                     ContextNode *Next = nullptr) {
+  assert(reinterpret_cast<uint64_t>(Place) % ExpectedAlignment == 0);
+  return new (Place) ContextNode(Guid, NrCounters, NrCallsites, Next);
+}
+
+void resetContextNode(ContextNode &Node) {
+  // FIXME(mtrofin): this is std::memset, which we can probably use if we
+  // drop/reduce the dependency on sanitizer_common.
+  for (uint32_t I = 0; I < Node.counters_size(); ++I)
+    Node.counters()[I] = 0;
+  for (uint32_t I = 0; I < Node.callsites_size(); ++I)
+    for (auto *Next = Node.subContexts()[I]; Next; Next = Next->next())
+      resetContextNode(*Next);
+}
+
+void onContextEnter(ContextNode &Node) { ++Node.counters()[0]; }
+
 } // namespace
 
 // the scratch buffer - what we give when we can't produce a real context (the
@@ -134,27 +154,9 @@ void Arena::freeArenaList(Arena *&A) {
   A = nullptr;
 }
 
-inline ContextNode *ContextNode::alloc(char *Place, GUID Guid,
-                                       uint32_t NrCounters,
-                                       uint32_t NrCallsites,
-                                       ContextNode *Next) {
-  assert(reinterpret_cast<uint64_t>(Place) % ExpectedAlignment == 0);
-  return new (Place) ContextNode(Guid, NrCounters, NrCallsites, Next);
-}
-
-void ContextNode::reset() {
-  // FIXME(mtrofin): this is std::memset, which we can probably use if we
-  // drop/reduce the dependency on sanitizer_common.
-  for (uint32_t I = 0; I < NrCounters; ++I)
-    counters()[I] = 0;
-  for (uint32_t I = 0; I < NrCallsites; ++I)
-    for (auto *Next = subContexts()[I]; Next; Next = Next->Next)
-      Next->reset();
-}
-
 // If this is the first time we hit a callsite with this (Guid) particular
 // callee, we need to allocate.
-ContextNode *getCallsiteSlow(uint64_t Guid, ContextNode **InsertionPoint,
+ContextNode *getCallsiteSlow(GUID Guid, ContextNode **InsertionPoint,
                              uint32_t NrCounters, uint32_t NrCallsites) {
   auto AllocSize = ContextNode::getAllocSize(NrCounters, NrCallsites);
   auto *Mem = __llvm_ctx_profile_current_context_root->CurrentMem;
@@ -169,8 +171,8 @@ ContextNode *getCallsiteSlow(uint64_t Guid, ContextNode **InsertionPoint,
         Mem->allocateNewArena(getArenaAllocSize(AllocSize), Mem);
     AllocPlace = Mem->tryBumpAllocate(AllocSize);
   }
-  auto *Ret = ContextNode::alloc(AllocPlace, Guid, NrCounters, NrCallsites,
-                                 *InsertionPoint);
+  auto *Ret = allocContextNode(AllocPlace, Guid, NrCounters, NrCallsites,
+                               *InsertionPoint);
   *InsertionPoint = Ret;
   return Ret;
 }
@@ -224,7 +226,7 @@ ContextNode *__llvm_ctx_profile_get_context(void *Callee, GUID Guid,
                         "Context: %p, Asked: %lu %u %u, Got: %lu %u %u \n",
                         Ret, Guid, NrCallsites, NrCounters, Ret->guid(),
                         Ret->callsites_size(), Ret->counters_size());
-  Ret->onEntry();
+  onContextEnter(*Ret);
   return Ret;
 }
 
@@ -241,8 +243,8 @@ void setupContext(ContextRoot *Root, GUID Guid, uint32_t NrCounters,
   auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed));
   Root->FirstMemBlock = M;
   Root->CurrentMem = M;
-  Root->FirstNode = ContextNode::alloc(M->tryBumpAllocate(Needed), Guid,
-                                       NrCounters, NrCallsites);
+  Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid,
+                                     NrCounters, NrCallsites);
   AllContextRoots.PushBack(Root);
 }
 
@@ -254,7 +256,7 @@ ContextNode *__llvm_ctx_profile_start_context(
   }
   if (Root->Taken.TryLock()) {
     __llvm_ctx_profile_current_context_root = Root;
-    Root->FirstNode->onEntry();
+    onContextEnter(*Root->FirstNode);
     return Root->FirstNode;
   }
   // If this thread couldn't take the lock, return scratch context.
@@ -281,7 +283,7 @@ void __llvm_ctx_profile_start_collection() {
     for (auto *Mem = Root->FirstMemBlock; Mem; Mem = Mem->next())
       ++NrMemUnits;
 
-    Root->FirstNode->reset();
+    resetContextNode(*Root->FirstNode);
   }
   __sanitizer::Printf("[ctxprof] Initial NrMemUnits: %zu \n", NrMemUnits);
 }
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
index 8c4be5d8a23a7..9e41335950e10 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
@@ -13,7 +13,7 @@
 #include <sanitizer/common_interface_defs.h>
 
 namespace __ctx_profile {
-using GUID = uint64_t;
+
 static constexpr size_t ExpectedAlignment = 8;
 // We really depend on this, see further below. We currently support x86_64.
 // When we want to support other archs, we need to trace the places Alignment is
@@ -62,98 +62,7 @@ class Arena final {
 // it to be thus aligned.
 static_assert(alignof(Arena) == ExpectedAlignment);
 
-/// The contextual profile is a directed tree where each node has one parent. A
-/// node (ContextNode) corresponds to a function activation. The root of the
-/// tree is at a function that was marked as entrypoint to the compiler. A node
-/// stores counter values for edges and a vector of subcontexts. These are the
-/// contexts of callees. The index in the subcontext vector corresponds to the
-/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
-/// that index we find a linked list, potentially empty, of ContextNodes. Direct
-/// calls will have 0 or 1 values in the linked list, but indirect callsites may
-/// have more.
-///
-/// The ContextNode has a fixed sized header describing it - the GUID of the
-/// function, the size of the counter and callsite vectors. It is also an
-/// (intrusive) linked list for the purposes of the indirect call case above.
-///
-/// Allocation is expected to happen on an Arena. The allocation lays out inline
-/// the counter and subcontexts vectors. The class offers APIs to correctly
-/// reference the latter.
-///
-/// The layout is as follows:
-///
-/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
-///
-/// See also documentation on the counters and subContexts members below.
-///
-/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
-///   (1) increment counts, and
-///   (2) form a GEP for the position in the subcontext list of a callsite
-/// This means changes to LLVM contextual profile lowering and changes here
-/// must be coupled.
-/// Note: the header content isn't interesting to LLVM (other than its size)
-///
-/// Part of contextual collection is the notion of "scratch contexts". These are
-/// buffers that are "large enough" to allow for memory-safe acceses during
-/// counter increments - meaning the counter increment code in LLVM doesn't need
-/// to be concerned with memory safety. Their subcontexts never get populated,
-/// though. The runtime code here produces and recognizes them.
-class ContextNode final {
-  const GUID Guid;
-  ContextNode *const Next;
-  const uint32_t NrCounters;
-  const uint32_t NrCallsites;
-
-public:
-  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
-              ContextNode *Next = nullptr)
-      : Guid(Guid), Next(Next), NrCounters(NrCounters),
-        NrCallsites(NrCallsites) {}
-  static inline ContextNode *alloc(char *Place, GUID Guid, uint32_t NrCounters,
-                                   uint32_t NrCallsites,
-                                   ContextNode *Next = nullptr);
-
-  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
-    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
-           sizeof(ContextNode *) * NrCallsites;
-  }
-
-  // The counters vector starts right after the static header.
-  uint64_t *counters() {
-    ContextNode *addr_after = &(this[1]);
-    return reinterpret_cast<uint64_t *>(addr_after);
-  }
-
-  uint32_t counters_size() const { return NrCounters; }
-  uint32_t callsites_size() const { return NrCallsites; }
-
-  const uint64_t *counters() const {
-    return const_cast<ContextNode *>(this)->counters();
-  }
-
-  // The subcontexts vector starts right after the end of the counters vector.
-  ContextNode **subContexts() {
-    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
-  }
-
-  ContextNode *const *subContexts() const {
-    return const_cast<ContextNode *>(this)->subContexts();
-  }
-
-  GUID guid() const { return Guid; }
-  ContextNode *next() { return Next; }
-
-  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
-
-  void reset();
-
-  // since we go through the runtime to get a context back to LLVM, in the entry
-  // basic block, might as well handle incrementing the entry basic block
-  // counter.
-  void onEntry() { ++counters()[0]; }
-
-  uint64_t entrycount() const { return counters()[0]; }
-};
+#include "CtxInstrContextNode.inc"
 
 // Verify maintenance to ContextNode doesn't change this invariant, which makes
 // sure the inlined vectors are appropriately aligned.
diff --git a/compiler-rt/test/ctx_profile/CMakeLists.txt b/compiler-rt/test/ctx_profile/CMakeLists.txt
index 23c6fb16ed1f4..371f1a2dcbb05 100644
--- a/compiler-rt/test/ctx_profile/CMakeLists.txt
+++ b/compiler-rt/test/ctx_profile/CMakeLists.txt
@@ -2,6 +2,28 @@ set(CTX_PROFILE_LIT_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
 
 set(CTX_PROFILE_TESTSUITES)
 
+macro(get_bits_for_arch arch bits)
+  if (${arch} MATCHES "x86_64")
+    set(${bits} 64)
+  else()
+    message(FATAL_ERROR "Unexpected target architecture: ${arch}")
+  endif()
+endmacro()
+
+set(CTX_PROFILE_TEST_DEPS ${SANITIZER_COMMON_LIT_TEST_DEPS} ctx_profile)
+
+foreach(arch ${CTX_PROFILE_SUPPORTED_ARCH})
+  set(CTX_PROFILE_TEST_TARGET_ARCH ${arch})
+  string(TOLOWER "-${arch}-${OS_NAME}" CTX_PROFILE_TEST_CONFIG_SUFFIX)
+  string(TOUPPER ${arch} ARCH_UPPER_CASE)
+  set(CONFIG_NAME ${ARCH_UPPER_CASE}${OS_NAME}Config)
+  configure_lit_site_cfg(
+    ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
+    ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
+    )
+  list(APPEND CTX_PROFILE_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+endforeach()
+
 # Add unit tests.
 if(COMPILER_RT_INCLUDE_TESTS)
   foreach(arch ${CTX_PROFILE_SUPPORTED_ARCH})
diff --git a/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
new file mode 100644
index 0000000000000..6b4dba3227a45
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
@@ -0,0 +1,5 @@
+;
+; NOTE: if this test fails, please make sure the two files are identical copies
+; of each other.
+;
+; RUN: diff %crt_src/lib/ctx_profile/CtxInstrContextNode.inc %llvm_src/lib/ProfileData/CtxInstrContextNode.inc
diff --git a/compiler-rt/test/ctx_profile/lit.cfg.py b/compiler-rt/test/ctx_profile/lit.cfg.py
new file mode 100644
index 0000000000000..327f48a1f88d5
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/lit.cfg.py
@@ -0,0 +1,33 @@
+# -*- Python -*-
+
+import os
+import platform
+import re
+
+import lit.formats
+
+# Only run the tests on supported OSs.
+if config.host_os not in ["Linux"]:
+    config.unsupported = True
+
+def get_required_attr(config, attr_name):
+    attr_value = getattr(config, attr_name, None)
+    if attr_value is None:
+        lit_config.fatal(
+            "No attribute %r in test configuration! You may need to run "
+            "tests from your build directory or add this attribute "
+            "to lit.site.cfg.py " % attr_name
+        )
+    return attr_value
+
+
+# Setup config name.
+config.name = "CtxProfile" + config.name_suffix
+
+# Setup source root.
+config.test_source_root = os.path.dirname(__file__)
+# Default test suffixes.
+config.suffixes = [".c", ".cpp", ".test"]
+
+config.substitutions.append(("%crt_src", config.compiler_rt_src_root))
+config.substitutions.append(("%llvm_src", config.llvm_src_root))
diff --git a/compiler-rt/test/ctx_profile/lit.site.cfg.py.in b/compiler-rt/test/ctx_profile/lit.site.cfg.py.in
new file mode 100644
index 0000000000000..e8df42d097d89
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/lit.site.cfg.py.in
@@ -0,0 +1,14 @@
+@LIT_SITE_CFG_IN_HEADER@
+
+# Tool-specific config options.
+config.name_suffix = "@CTX_PROFILE_TEST_CONFIG_SUFFIX@"
+config.target_cflags = "@CTX_PROFILE_TEST_TARGET_CFLAGS@"
+config.clang = "@CTX_PROFILE_TEST_TARGET_CC@"
+config.bits = "@CTX_PROFILE_TEST_BITS@"
+config.target_arch = "@CTX_PROFILE_TEST_TARGET_ARCH@"
+
+# Load common config for all compiler-rt lit tests.
+lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")
+
+# Load tool-specific config that would do the real work.
+lit_config.load_config(config, "@CTX_PROFILE_LIT_SOURCE_DIR@/lit.cfg.py")
diff --git a/llvm/lib/ProfileData/CtxInstrContextNode.inc b/llvm/lib/ProfileData/CtxInstrContextNode.inc
new file mode 100644
index 0000000000000..06be172fe1503
--- /dev/null
+++ b/llvm/lib/ProfileData/CtxInstrContextNode.inc
@@ -0,0 +1,102 @@
+/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
+|*
+|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+|* See https://llvm.org/LICENSE.txt for license information.
+|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+|*
+\*===----------------------------------------------------------------------===*/
+
+//==============================================================================
+//
+// NOTE!
+// llvm/lib/ProfileData/CtxInstrContextNode.inc and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// must be exact copies of eachother
+//
+//==============================================================================
+
+/// The contextual profile is a directed tree where each node has one parent. A
+/// node (ContextNode) corresponds to a function activation. The root of the
+/// tree is at a function that was marked as entrypoint to the compiler. A node
+/// stores counter values for edges and a vector of subcontexts. These are the
+/// contexts of callees. The index in the subcontext vector corresponds to the
+/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
+/// that index we find a linked list, potentially empty, of ContextNodes. Direct
+/// calls will have 0 or 1 values in the linked list, but indirect callsites may
+/// have more.
+///
+/// The ContextNode has a fixed sized header describing it - the GUID of the
+/// function, the size of the counter and callsite vectors. It is also an
+/// (intrusive) linked list for the purposes of the indirect call case above.
+///
+/// Allocation is expected to happen on an Arena. The allocation lays out inline
+/// the counter and subcontexts vectors. The class offers APIs to correctly
+/// reference the latter.
+///
+/// The layout is as follows:
+///
+/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
+///
+/// See also documentation on the counters and subContexts members below.
+///
+/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
+///   (1) increment counts, and
+///   (2) form a GEP for the position in the subcontext list of a callsite
+/// This means changes to LLVM contextual profile lowering and changes here
+/// must be coupled.
+/// Note: the header content isn't interesting to LLVM (other than its size)
+///
+/// Part of contextual collection is the notion of "scratch contexts". These are
+/// buffers that are "large enough" to allow for memory-safe acceses during
+/// counter increments - meaning the counter increment code in LLVM doesn't need
+/// to be concerned with memory safety. Their subcontexts never get populated,
+/// though. The runtime code here produces and recognizes them.
+
+using GUID = uint64_t;
+
+class ContextNode final {
+  const GUID Guid;
+  ContextNode *const Next;
+  const uint32_t NrCounters;
+  const uint32_t NrCallsites;
+
+public:
+  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
+              ContextNode *Next = nullptr)
+      : Guid(Guid), Next(Next), NrCounters(NrCounters),
+        NrCallsites(NrCallsites) {}
+
+  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
+    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
+           sizeof(ContextNode *) * NrCallsites;
+  }
+
+  // The counters vector starts right after the static header.
+  uint64_t *counters() {
+    ContextNode *addr_after = &(this[1]);
+    return reinterpret_cast<uint64_t *>(addr_after);
+  }
+
+  uint32_t counters_size() const { return NrCounters; }
+  uint32_t callsites_size() const { return NrCallsites; }
+
+  const uint64_t *counters() const {
+    return const_cast<ContextNode *>(this)->counters();
+  }
+
+  // The subcontexts vector starts right after the end of the counters vector.
+  ContextNode **subContexts() {
+    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
+  }
+
+  ContextNode *const *subContexts() const {
+    return const_cast<ContextNode *>(this)->subContexts();
+  }
+
+  GUID guid() const { return Guid; }
+  ContextNode *next() const { return Next; }
+
+  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
+
+  uint64_t entrycount() const { return counters()[0]; }
+};


