[compiler-rt] [llvm] [ctx_profile] Pull `ContextNode` in a `.inc` file (PR #91669)

Mircea Trofin via llvm-commits llvm-commits at lists.llvm.org
Thu May 9 16:51:22 PDT 2024


https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/91669

>From 9b3717458079135078ffcc9cb601b59eb7566891 Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 14:56:59 -0700
Subject: [PATCH 1/5] [ctx_profile] Pull `ContextNode` in a `.inc` file

This pulls `ContextNode` out into its own file, because the writer needs to
use it pretty much as-is. The writer will be implemented on the LLVM side
because it depends on BitstreamWriter.

Since compiler-rt and llvm can't share a header, the definition lives in a
.inc file that is copied to both sides, and a test checks that the two copies
are identical.

This change also adds the lit configuration needed for
compiler-rt/ctx_profile testing.
---
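
Illustration (not part of the patch): the layout comment in
CtxInstrContextNode.inc describes a fixed header followed inline by the
counters vector and then the vector of subcontext pointers. Below is a minimal
standalone sketch of that address arithmetic. SketchNode, Offsets and
measureLayout are hypothetical names used only for this example; the field
order mirrors ContextNode, and plain calloc stands in for the Arena.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for ContextNode: same header fields, same accessors.
class SketchNode final {
  const uint64_t Guid;
  SketchNode *const Next;
  const uint32_t NrCounters;
  const uint32_t NrCallsites;

public:
  SketchNode(uint64_t Guid, uint32_t NrCounters, uint32_t NrCallsites,
             SketchNode *Next = nullptr)
      : Guid(Guid), Next(Next), NrCounters(NrCounters),
        NrCallsites(NrCallsites) {}

  // Header + inline counters + inline subcontext pointers.
  static size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
    return sizeof(SketchNode) + sizeof(uint64_t) * NrCounters +
           sizeof(SketchNode *) * NrCallsites;
  }

  // The counters vector starts right after the header.
  uint64_t *counters() { return reinterpret_cast<uint64_t *>(this + 1); }

  // The subcontexts vector starts right after the last counter.
  SketchNode **subContexts() {
    return reinterpret_cast<SketchNode **>(counters() + NrCounters);
  }

  uint64_t guid() const { return Guid; }
  SketchNode *next() const { return Next; }
};

// Byte offsets of the two inline vectors, measured from the node's address.
struct Offsets {
  size_t Counters;
  size_t SubContexts;
};

// Assumption: calloc replaces the Arena; it returns zeroed memory that is
// suitably aligned for SketchNode.
inline Offsets measureLayout(uint32_t NrCounters, uint32_t NrCallsites) {
  void *Mem = std::calloc(1, SketchNode::getAllocSize(NrCounters, NrCallsites));
  assert(Mem && "allocation failed");
  auto *N = new (Mem) SketchNode(/*Guid=*/1, NrCounters, NrCallsites);
  const char *Base = reinterpret_cast<const char *>(N);
  Offsets O{
      static_cast<size_t>(reinterpret_cast<const char *>(N->counters()) - Base),
      static_cast<size_t>(reinterpret_cast<const char *>(N->subContexts()) -
                          Base)};
  std::free(Mem); // SketchNode is trivially destructible.
  return O;
}
```

Calling measureLayout(2, 3) should report the counters at sizeof(SketchNode)
and the subcontexts two uint64_t slots later, with getAllocSize(2, 3) covering
three more pointer slots after that.
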
 compiler-rt/lib/ctx_profile/CMakeLists.txt    |   1 +
 .../lib/ctx_profile/CtxInstrContextNode.inc   | 102 ++++++++++++++++++
 .../lib/ctx_profile/CtxInstrProfiling.cpp     |  54 +++++-----
 .../lib/ctx_profile/CtxInstrProfiling.h       |  95 +---------------
 compiler-rt/test/ctx_profile/CMakeLists.txt   |  22 ++++
 .../TestCases/check-same-ctx-node.test        |   5 +
 compiler-rt/test/ctx_profile/lit.cfg.py       |  33 ++++++
 .../test/ctx_profile/lit.site.cfg.py.in       |  14 +++
 llvm/lib/ProfileData/CtxInstrContextNode.inc  | 102 ++++++++++++++++++
 9 files changed, 309 insertions(+), 119 deletions(-)
 create mode 100644 compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
 create mode 100644 compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
 create mode 100644 compiler-rt/test/ctx_profile/lit.cfg.py
 create mode 100644 compiler-rt/test/ctx_profile/lit.site.cfg.py.in
 create mode 100644 llvm/lib/ProfileData/CtxInstrContextNode.inc

diff --git a/compiler-rt/lib/ctx_profile/CMakeLists.txt b/compiler-rt/lib/ctx_profile/CMakeLists.txt
index 80e71acc38f8a..e2bf1776cd76c 100644
--- a/compiler-rt/lib/ctx_profile/CMakeLists.txt
+++ b/compiler-rt/lib/ctx_profile/CMakeLists.txt
@@ -6,6 +6,7 @@ set(CTX_PROFILE_SOURCES
 
 set(CTX_PROFILE_HEADERS
   CtxInstrProfiling.h
+  CtxInstrContextNode.inc
   )
 
 include_directories(..)
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
new file mode 100644
index 0000000000000..06be172fe1503
--- /dev/null
+++ b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
@@ -0,0 +1,102 @@
+/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
+|*
+|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+|* See https://llvm.org/LICENSE.txt for license information.
+|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+|*
+\*===----------------------------------------------------------------------===*/
+
+//==============================================================================
+//
+// NOTE!
+// llvm/lib/ProfileData/CtxInstrContextNode.inc and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// must be exact copies of eachother
+//
+//==============================================================================
+
+/// The contextual profile is a directed tree where each node has one parent. A
+/// node (ContextNode) corresponds to a function activation. The root of the
+/// tree is at a function that was marked as entrypoint to the compiler. A node
+/// stores counter values for edges and a vector of subcontexts. These are the
+/// contexts of callees. The index in the subcontext vector corresponds to the
+/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
+/// that index we find a linked list, potentially empty, of ContextNodes. Direct
+/// calls will have 0 or 1 values in the linked list, but indirect callsites may
+/// have more.
+///
+/// The ContextNode has a fixed sized header describing it - the GUID of the
+/// function, the size of the counter and callsite vectors. It is also an
+/// (intrusive) linked list for the purposes of the indirect call case above.
+///
+/// Allocation is expected to happen on an Arena. The allocation lays out inline
+/// the counter and subcontexts vectors. The class offers APIs to correctly
+/// reference the latter.
+///
+/// The layout is as follows:
+///
+/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
+///
+/// See also documentation on the counters and subContexts members below.
+///
+/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
+///   (1) increment counts, and
+///   (2) form a GEP for the position in the subcontext list of a callsite
+/// This means changes to LLVM contextual profile lowering and changes here
+/// must be coupled.
+/// Note: the header content isn't interesting to LLVM (other than its size)
+///
+/// Part of contextual collection is the notion of "scratch contexts". These are
+/// buffers that are "large enough" to allow for memory-safe acceses during
+/// counter increments - meaning the counter increment code in LLVM doesn't need
+/// to be concerned with memory safety. Their subcontexts never get populated,
+/// though. The runtime code here produces and recognizes them.
+
+using GUID = uint64_t;
+
+class ContextNode final {
+  const GUID Guid;
+  ContextNode *const Next;
+  const uint32_t NrCounters;
+  const uint32_t NrCallsites;
+
+public:
+  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
+              ContextNode *Next = nullptr)
+      : Guid(Guid), Next(Next), NrCounters(NrCounters),
+        NrCallsites(NrCallsites) {}
+
+  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
+    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
+           sizeof(ContextNode *) * NrCallsites;
+  }
+
+  // The counters vector starts right after the static header.
+  uint64_t *counters() {
+    ContextNode *addr_after = &(this[1]);
+    return reinterpret_cast<uint64_t *>(addr_after);
+  }
+
+  uint32_t counters_size() const { return NrCounters; }
+  uint32_t callsites_size() const { return NrCallsites; }
+
+  const uint64_t *counters() const {
+    return const_cast<ContextNode *>(this)->counters();
+  }
+
+  // The subcontexts vector starts right after the end of the counters vector.
+  ContextNode **subContexts() {
+    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
+  }
+
+  ContextNode *const *subContexts() const {
+    return const_cast<ContextNode *>(this)->subContexts();
+  }
+
+  GUID guid() const { return Guid; }
+  ContextNode *next() const { return Next; }
+
+  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
+
+  uint64_t entrycount() const { return counters()[0]; }
+};
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
index 68bfe5c1ae614..9a064a1f194e0 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
@@ -90,6 +90,26 @@ bool validate(const ContextRoot *Root) {
   }
   return true;
 }
+
+inline ContextNode *allocContextNode(char *Place, GUID Guid,
+                                     uint32_t NrCounters, uint32_t NrCallsites,
+                                     ContextNode *Next = nullptr) {
+  assert(reinterpret_cast<uint64_t>(Place) % ExpectedAlignment == 0);
+  return new (Place) ContextNode(Guid, NrCounters, NrCallsites, Next);
+}
+
+void resetContextNode(ContextNode &Node) {
+  // FIXME(mtrofin): this is std::memset, which we can probably use if we
+  // drop/reduce the dependency on sanitizer_common.
+  for (uint32_t I = 0; I < Node.counters_size(); ++I)
+    Node.counters()[I] = 0;
+  for (uint32_t I = 0; I < Node.callsites_size(); ++I)
+    for (auto *Next = Node.subContexts()[I]; Next; Next = Next->next())
+      resetContextNode(*Next);
+}
+
+void onContextEnter(ContextNode &Node) { ++Node.counters()[0]; }
+
 } // namespace
 
 // the scratch buffer - what we give when we can't produce a real context (the
@@ -134,27 +154,9 @@ void Arena::freeArenaList(Arena *&A) {
   A = nullptr;
 }
 
-inline ContextNode *ContextNode::alloc(char *Place, GUID Guid,
-                                       uint32_t NrCounters,
-                                       uint32_t NrCallsites,
-                                       ContextNode *Next) {
-  assert(reinterpret_cast<uint64_t>(Place) % ExpectedAlignment == 0);
-  return new (Place) ContextNode(Guid, NrCounters, NrCallsites, Next);
-}
-
-void ContextNode::reset() {
-  // FIXME(mtrofin): this is std::memset, which we can probably use if we
-  // drop/reduce the dependency on sanitizer_common.
-  for (uint32_t I = 0; I < NrCounters; ++I)
-    counters()[I] = 0;
-  for (uint32_t I = 0; I < NrCallsites; ++I)
-    for (auto *Next = subContexts()[I]; Next; Next = Next->Next)
-      Next->reset();
-}
-
 // If this is the first time we hit a callsite with this (Guid) particular
 // callee, we need to allocate.
-ContextNode *getCallsiteSlow(uint64_t Guid, ContextNode **InsertionPoint,
+ContextNode *getCallsiteSlow(GUID Guid, ContextNode **InsertionPoint,
                              uint32_t NrCounters, uint32_t NrCallsites) {
   auto AllocSize = ContextNode::getAllocSize(NrCounters, NrCallsites);
   auto *Mem = __llvm_ctx_profile_current_context_root->CurrentMem;
@@ -169,8 +171,8 @@ ContextNode *getCallsiteSlow(uint64_t Guid, ContextNode **InsertionPoint,
         Mem->allocateNewArena(getArenaAllocSize(AllocSize), Mem);
     AllocPlace = Mem->tryBumpAllocate(AllocSize);
   }
-  auto *Ret = ContextNode::alloc(AllocPlace, Guid, NrCounters, NrCallsites,
-                                 *InsertionPoint);
+  auto *Ret = allocContextNode(AllocPlace, Guid, NrCounters, NrCallsites,
+                               *InsertionPoint);
   *InsertionPoint = Ret;
   return Ret;
 }
@@ -224,7 +226,7 @@ ContextNode *__llvm_ctx_profile_get_context(void *Callee, GUID Guid,
                         "Context: %p, Asked: %lu %u %u, Got: %lu %u %u \n",
                         Ret, Guid, NrCallsites, NrCounters, Ret->guid(),
                         Ret->callsites_size(), Ret->counters_size());
-  Ret->onEntry();
+  onContextEnter(*Ret);
   return Ret;
 }
 
@@ -241,8 +243,8 @@ void setupContext(ContextRoot *Root, GUID Guid, uint32_t NrCounters,
   auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed));
   Root->FirstMemBlock = M;
   Root->CurrentMem = M;
-  Root->FirstNode = ContextNode::alloc(M->tryBumpAllocate(Needed), Guid,
-                                       NrCounters, NrCallsites);
+  Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid,
+                                     NrCounters, NrCallsites);
   AllContextRoots.PushBack(Root);
 }
 
@@ -254,7 +256,7 @@ ContextNode *__llvm_ctx_profile_start_context(
   }
   if (Root->Taken.TryLock()) {
     __llvm_ctx_profile_current_context_root = Root;
-    Root->FirstNode->onEntry();
+    onContextEnter(*Root->FirstNode);
     return Root->FirstNode;
   }
   // If this thread couldn't take the lock, return scratch context.
@@ -281,7 +283,7 @@ void __llvm_ctx_profile_start_collection() {
     for (auto *Mem = Root->FirstMemBlock; Mem; Mem = Mem->next())
       ++NrMemUnits;
 
-    Root->FirstNode->reset();
+    resetContextNode(*Root->FirstNode);
   }
   __sanitizer::Printf("[ctxprof] Initial NrMemUnits: %zu \n", NrMemUnits);
 }
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
index 8c4be5d8a23a7..9e41335950e10 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
@@ -13,7 +13,7 @@
 #include <sanitizer/common_interface_defs.h>
 
 namespace __ctx_profile {
-using GUID = uint64_t;
+
 static constexpr size_t ExpectedAlignment = 8;
 // We really depend on this, see further below. We currently support x86_64.
 // When we want to support other archs, we need to trace the places Alignment is
@@ -62,98 +62,7 @@ class Arena final {
 // it to be thus aligned.
 static_assert(alignof(Arena) == ExpectedAlignment);
 
-/// The contextual profile is a directed tree where each node has one parent. A
-/// node (ContextNode) corresponds to a function activation. The root of the
-/// tree is at a function that was marked as entrypoint to the compiler. A node
-/// stores counter values for edges and a vector of subcontexts. These are the
-/// contexts of callees. The index in the subcontext vector corresponds to the
-/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
-/// that index we find a linked list, potentially empty, of ContextNodes. Direct
-/// calls will have 0 or 1 values in the linked list, but indirect callsites may
-/// have more.
-///
-/// The ContextNode has a fixed sized header describing it - the GUID of the
-/// function, the size of the counter and callsite vectors. It is also an
-/// (intrusive) linked list for the purposes of the indirect call case above.
-///
-/// Allocation is expected to happen on an Arena. The allocation lays out inline
-/// the counter and subcontexts vectors. The class offers APIs to correctly
-/// reference the latter.
-///
-/// The layout is as follows:
-///
-/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
-///
-/// See also documentation on the counters and subContexts members below.
-///
-/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
-///   (1) increment counts, and
-///   (2) form a GEP for the position in the subcontext list of a callsite
-/// This means changes to LLVM contextual profile lowering and changes here
-/// must be coupled.
-/// Note: the header content isn't interesting to LLVM (other than its size)
-///
-/// Part of contextual collection is the notion of "scratch contexts". These are
-/// buffers that are "large enough" to allow for memory-safe acceses during
-/// counter increments - meaning the counter increment code in LLVM doesn't need
-/// to be concerned with memory safety. Their subcontexts never get populated,
-/// though. The runtime code here produces and recognizes them.
-class ContextNode final {
-  const GUID Guid;
-  ContextNode *const Next;
-  const uint32_t NrCounters;
-  const uint32_t NrCallsites;
-
-public:
-  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
-              ContextNode *Next = nullptr)
-      : Guid(Guid), Next(Next), NrCounters(NrCounters),
-        NrCallsites(NrCallsites) {}
-  static inline ContextNode *alloc(char *Place, GUID Guid, uint32_t NrCounters,
-                                   uint32_t NrCallsites,
-                                   ContextNode *Next = nullptr);
-
-  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
-    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
-           sizeof(ContextNode *) * NrCallsites;
-  }
-
-  // The counters vector starts right after the static header.
-  uint64_t *counters() {
-    ContextNode *addr_after = &(this[1]);
-    return reinterpret_cast<uint64_t *>(addr_after);
-  }
-
-  uint32_t counters_size() const { return NrCounters; }
-  uint32_t callsites_size() const { return NrCallsites; }
-
-  const uint64_t *counters() const {
-    return const_cast<ContextNode *>(this)->counters();
-  }
-
-  // The subcontexts vector starts right after the end of the counters vector.
-  ContextNode **subContexts() {
-    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
-  }
-
-  ContextNode *const *subContexts() const {
-    return const_cast<ContextNode *>(this)->subContexts();
-  }
-
-  GUID guid() const { return Guid; }
-  ContextNode *next() { return Next; }
-
-  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
-
-  void reset();
-
-  // since we go through the runtime to get a context back to LLVM, in the entry
-  // basic block, might as well handle incrementing the entry basic block
-  // counter.
-  void onEntry() { ++counters()[0]; }
-
-  uint64_t entrycount() const { return counters()[0]; }
-};
+#include "CtxInstrContextNode.inc"
 
 // Verify maintenance to ContextNode doesn't change this invariant, which makes
 // sure the inlined vectors are appropriately aligned.
diff --git a/compiler-rt/test/ctx_profile/CMakeLists.txt b/compiler-rt/test/ctx_profile/CMakeLists.txt
index 23c6fb16ed1f4..371f1a2dcbb05 100644
--- a/compiler-rt/test/ctx_profile/CMakeLists.txt
+++ b/compiler-rt/test/ctx_profile/CMakeLists.txt
@@ -2,6 +2,28 @@ set(CTX_PROFILE_LIT_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
 
 set(CTX_PROFILE_TESTSUITES)
 
+macro(get_bits_for_arch arch bits)
+  if (${arch} MATCHES "x86_64")
+    set(${bits} 64)
+  else()
+    message(FATAL_ERROR "Unexpected target architecture: ${arch}")
+  endif()
+endmacro()
+
+set(CTX_PROFILE_TEST_DEPS ${SANITIZER_COMMON_LIT_TEST_DEPS} ctx_profile)
+
+foreach(arch ${CTX_PROFILE_SUPPORTED_ARCH})
+  set(CTX_PROFILE_TEST_TARGET_ARCH ${arch})
+  string(TOLOWER "-${arch}-${OS_NAME}" CTX_PROFILE_TEST_CONFIG_SUFFIX)
+  string(TOUPPER ${arch} ARCH_UPPER_CASE)
+  set(CONFIG_NAME ${ARCH_UPPER_CASE}${OS_NAME}Config)
+  configure_lit_site_cfg(
+    ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
+    ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
+    )
+  list(APPEND CTX_PROFILE_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+endforeach()
+
 # Add unit tests.
 if(COMPILER_RT_INCLUDE_TESTS)
   foreach(arch ${CTX_PROFILE_SUPPORTED_ARCH})
diff --git a/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
new file mode 100644
index 0000000000000..6b4dba3227a45
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
@@ -0,0 +1,5 @@
+;
+; NOTE: if this test fails, please make sure the two files are identical copies
+; of eachother.
+;
+; RUN: diff %crt_src/lib/ctx_profile/CtxInstrContextNode.inc %llvm_src/lib/ProfileData/CtxInstrContextNode.inc
diff --git a/compiler-rt/test/ctx_profile/lit.cfg.py b/compiler-rt/test/ctx_profile/lit.cfg.py
new file mode 100644
index 0000000000000..327f48a1f88d5
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/lit.cfg.py
@@ -0,0 +1,33 @@
+# -*- Python -*-
+
+import os
+import platform
+import re
+
+import lit.formats
+
+# Only run the tests on supported OSs.
+if config.host_os not in ["Linux"]:
+    config.unsupported = True
+
+def get_required_attr(config, attr_name):
+    attr_value = getattr(config, attr_name, None)
+    if attr_value == None:
+        lit_config.fatal(
+            "No attribute %r in test configuration! You may need to run "
+            "tests from your build directory or add this attribute "
+            "to lit.site.cfg.py " % attr_name
+        )
+    return attr_value
+
+
+# Setup config name.
+config.name = "CtxProfile" + config.name_suffix
+
+# Setup source root.
+config.test_source_root = os.path.dirname(__file__)
+# Default test suffixes.
+config.suffixes = [".c", ".cpp", ".test"]
+
+config.substitutions.append(("%crt_src", config.compiler_rt_src_root))
+config.substitutions.append(("%llvm_src", config.llvm_src_root))
diff --git a/compiler-rt/test/ctx_profile/lit.site.cfg.py.in b/compiler-rt/test/ctx_profile/lit.site.cfg.py.in
new file mode 100644
index 0000000000000..e8df42d097d89
--- /dev/null
+++ b/compiler-rt/test/ctx_profile/lit.site.cfg.py.in
@@ -0,0 +1,14 @@
+ at LIT_SITE_CFG_IN_HEADER@
+
+# Tool-specific config options.
+config.name_suffix = "@CTX_PROFILE_TEST_CONFIG_SUFFIX@"
+config.target_cflags = "@CTX_PROFILE_TEST_TARGET_CFLAGS@"
+config.clang = "@CTX_PROFILE_TEST_TARGET_CC@"
+config.bits = "@CTX_PROFILE_TEST_BITS@"
+config.target_arch = "@CTX_PROFILE_TEST_TARGET_ARCH@"
+
+# Load common config for all compiler-rt lit tests.
+lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")
+
+# Load tool-specific config that would do the real work.
+lit_config.load_config(config, "@CTX_PROFILE_LIT_SOURCE_DIR@/lit.cfg.py")
diff --git a/llvm/lib/ProfileData/CtxInstrContextNode.inc b/llvm/lib/ProfileData/CtxInstrContextNode.inc
new file mode 100644
index 0000000000000..06be172fe1503
--- /dev/null
+++ b/llvm/lib/ProfileData/CtxInstrContextNode.inc
@@ -0,0 +1,102 @@
+/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
+|*
+|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+|* See https://llvm.org/LICENSE.txt for license information.
+|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+|*
+\*===----------------------------------------------------------------------===*/
+
+//==============================================================================
+//
+// NOTE!
+// llvm/lib/ProfileData/CtxInstrContextNode.inc and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// must be exact copies of eachother
+//
+//==============================================================================
+
+/// The contextual profile is a directed tree where each node has one parent. A
+/// node (ContextNode) corresponds to a function activation. The root of the
+/// tree is at a function that was marked as entrypoint to the compiler. A node
+/// stores counter values for edges and a vector of subcontexts. These are the
+/// contexts of callees. The index in the subcontext vector corresponds to the
+/// index of the callsite (as was instrumented via llvm.instrprof.callsite). At
+/// that index we find a linked list, potentially empty, of ContextNodes. Direct
+/// calls will have 0 or 1 values in the linked list, but indirect callsites may
+/// have more.
+///
+/// The ContextNode has a fixed sized header describing it - the GUID of the
+/// function, the size of the counter and callsite vectors. It is also an
+/// (intrusive) linked list for the purposes of the indirect call case above.
+///
+/// Allocation is expected to happen on an Arena. The allocation lays out inline
+/// the counter and subcontexts vectors. The class offers APIs to correctly
+/// reference the latter.
+///
+/// The layout is as follows:
+///
+/// [[declared fields][counters vector][vector of ptrs to subcontexts]]
+///
+/// See also documentation on the counters and subContexts members below.
+///
+/// The structure of the ContextNode is known to LLVM, because LLVM needs to:
+///   (1) increment counts, and
+///   (2) form a GEP for the position in the subcontext list of a callsite
+/// This means changes to LLVM contextual profile lowering and changes here
+/// must be coupled.
+/// Note: the header content isn't interesting to LLVM (other than its size)
+///
+/// Part of contextual collection is the notion of "scratch contexts". These are
+/// buffers that are "large enough" to allow for memory-safe acceses during
+/// counter increments - meaning the counter increment code in LLVM doesn't need
+/// to be concerned with memory safety. Their subcontexts never get populated,
+/// though. The runtime code here produces and recognizes them.
+
+using GUID = uint64_t;
+
+class ContextNode final {
+  const GUID Guid;
+  ContextNode *const Next;
+  const uint32_t NrCounters;
+  const uint32_t NrCallsites;
+
+public:
+  ContextNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
+              ContextNode *Next = nullptr)
+      : Guid(Guid), Next(Next), NrCounters(NrCounters),
+        NrCallsites(NrCallsites) {}
+
+  static inline size_t getAllocSize(uint32_t NrCounters, uint32_t NrCallsites) {
+    return sizeof(ContextNode) + sizeof(uint64_t) * NrCounters +
+           sizeof(ContextNode *) * NrCallsites;
+  }
+
+  // The counters vector starts right after the static header.
+  uint64_t *counters() {
+    ContextNode *addr_after = &(this[1]);
+    return reinterpret_cast<uint64_t *>(addr_after);
+  }
+
+  uint32_t counters_size() const { return NrCounters; }
+  uint32_t callsites_size() const { return NrCallsites; }
+
+  const uint64_t *counters() const {
+    return const_cast<ContextNode *>(this)->counters();
+  }
+
+  // The subcontexts vector starts right after the end of the counters vector.
+  ContextNode **subContexts() {
+    return reinterpret_cast<ContextNode **>(&(counters()[NrCounters]));
+  }
+
+  ContextNode *const *subContexts() const {
+    return const_cast<ContextNode *>(this)->subContexts();
+  }
+
+  GUID guid() const { return Guid; }
+  ContextNode *next() const { return Next; }
+
+  size_t size() const { return getAllocSize(NrCounters, NrCallsites); }
+
+  uint64_t entrycount() const { return counters()[0]; }
+};

>From f601885813d166f2a0c7bdbcb2fdf33163def56d Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 15:15:32 -0700
Subject: [PATCH 2/5] fix python style

---
 compiler-rt/test/ctx_profile/lit.cfg.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/compiler-rt/test/ctx_profile/lit.cfg.py b/compiler-rt/test/ctx_profile/lit.cfg.py
index 327f48a1f88d5..d2c7550538ad2 100644
--- a/compiler-rt/test/ctx_profile/lit.cfg.py
+++ b/compiler-rt/test/ctx_profile/lit.cfg.py
@@ -10,6 +10,7 @@
 if config.host_os not in ["Linux"]:
     config.unsupported = True
 
+
 def get_required_attr(config, attr_name):
     attr_value = getattr(config, attr_name, None)
     if attr_value == None:

>From 1871330ddf8e50c9435e2b488c0e1aeeb107deaa Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 16:35:58 -0700
Subject: [PATCH 3/5] proper header

---
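
Illustration (not part of the patch): the header comment added here says LLVM
only consumes these nodes to convert a contextual tree to a bitstream. The
walk such a consumer needs has the same shape as resetContextNode in PATCH 1/5:
for each callsite index, follow the linked list of callee contexts and recurse.
The sketch below shows only that walk. SketchCtx, visitAll and sumEntryCounts
are hypothetical names, and std::vector replaces the inline vectors of the real
ContextNode, so this is not the writer itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified context: vectors instead of the inline layout.
struct SketchCtx {
  uint64_t Guid = 0;
  SketchCtx *Next = nullptr;            // next callee seen at the same callsite
  std::vector<uint64_t> Counters;       // Counters[0] is the entry count
  std::vector<SketchCtx *> SubContexts; // one list head per callsite
};

// Calls Visit on Root and on every context reachable from it (pre-order).
// Returns the number of contexts visited.
template <typename Fn> size_t visitAll(const SketchCtx &Root, Fn &&Visit) {
  Visit(Root);
  size_t Visited = 1;
  for (const SketchCtx *Head : Root.SubContexts)
    for (const SketchCtx *C = Head; C; C = C->Next)
      Visited += visitAll(*C, Visit);
  return Visited;
}

// Example consumer: adds up the entry counts of all contexts in the tree.
inline uint64_t sumEntryCounts(const SketchCtx &Root) {
  uint64_t Sum = 0;
  visitAll(Root, [&Sum](const SketchCtx &C) {
    assert(!C.Counters.empty() && "every context has an entry counter");
    Sum += C.Counters[0];
  });
  return Sum;
}
```

An empty callsite slot is a null list head and is skipped; an indirect callsite
with several observed callees contributes one visit per list element.
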
 compiler-rt/lib/ctx_profile/CMakeLists.txt    |  2 +-
 ...rContextNode.inc => CtxInstrContextNode.h} | 34 +++++++++++++------
 .../lib/ctx_profile/CtxInstrProfiling.cpp     |  2 +-
 .../lib/ctx_profile/CtxInstrProfiling.h       | 25 +++++++-------
 .../TestCases/check-same-ctx-node.test        |  2 +-
 ...rContextNode.inc => CtxInstrContextNode.h} | 34 +++++++++++++------
 6 files changed, 63 insertions(+), 36 deletions(-)
 rename compiler-rt/lib/ctx_profile/{CtxInstrContextNode.inc => CtxInstrContextNode.h} (80%)
 rename llvm/lib/ProfileData/{CtxInstrContextNode.inc => CtxInstrContextNode.h} (80%)

diff --git a/compiler-rt/lib/ctx_profile/CMakeLists.txt b/compiler-rt/lib/ctx_profile/CMakeLists.txt
index e2bf1776cd76c..1fa70594b28a3 100644
--- a/compiler-rt/lib/ctx_profile/CMakeLists.txt
+++ b/compiler-rt/lib/ctx_profile/CMakeLists.txt
@@ -5,8 +5,8 @@ set(CTX_PROFILE_SOURCES
   )
 
 set(CTX_PROFILE_HEADERS
+  CtxInstrContextNode.h
   CtxInstrProfiling.h
-  CtxInstrContextNode.inc
   )
 
 include_directories(..)
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
similarity index 80%
rename from compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
rename to compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
index 06be172fe1503..1627bdfffd089 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+++ b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
@@ -1,18 +1,21 @@
-/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
-|*
-|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-|* See https://llvm.org/LICENSE.txt for license information.
-|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-|*
-\*===----------------------------------------------------------------------===*/
-
+//===--- CtxInstrContextNode.h - Contextual Profile Node --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
 //==============================================================================
 //
 // NOTE!
-// llvm/lib/ProfileData/CtxInstrContextNode.inc and
-//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// llvm/lib/ProfileData/CtxInstrContextNode.h and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
 // must be exact copies of eachother
 //
+// compiler-rt creates these objects as part of the instrumentation runtime for
+// contextual profiling. LLVM only consumes them to convert a contextual tree
+// to a bitstream.
+//
 //==============================================================================
 
 /// The contextual profile is a directed tree where each node has one parent. A
@@ -52,6 +55,14 @@
 /// to be concerned with memory safety. Their subcontexts never get populated,
 /// though. The runtime code here produces and recognizes them.
 
+#ifndef LLVM_LIB_PROFILEDATA_CTXINSTRCONTEXTNODE_H
+#define LLVM_LIB_PROFILEDATA_CTXINSTRCONTEXTNODE_H
+
+#include <stdint.h>
+#include <stdlib.h>
+
+namespace llvm {
+namespace ctx_profile {
 using GUID = uint64_t;
 
 class ContextNode final {
@@ -100,3 +111,6 @@ class ContextNode final {
 
   uint64_t entrycount() const { return counters()[0]; }
 };
+} // namespace ctx_profile
+} // namespace llvm
+#endif
\ No newline at end of file
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
index 9a064a1f194e0..9a1e643961e4b 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
@@ -289,7 +289,7 @@ void __llvm_ctx_profile_start_collection() {
 }
 
 bool __llvm_ctx_profile_fetch(
-    void *Data, bool (*Writer)(void *W, const __ctx_profile::ContextNode &)) {
+    void *Data, bool (*Writer)(void *W, const ContextNode &)) {
   assert(Writer);
   __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock(
       &AllContextsMutex);
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
index 9e41335950e10..69ce796b71e31 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.h
@@ -9,9 +9,12 @@
 #ifndef CTX_PROFILE_CTXINSTRPROFILING_H_
 #define CTX_PROFILE_CTXINSTRPROFILING_H_
 
+#include "CtxInstrContextNode.h"
 #include "sanitizer_common/sanitizer_mutex.h"
 #include <sanitizer/common_interface_defs.h>
 
+using namespace llvm::ctx_profile;
+
 namespace __ctx_profile {
 
 static constexpr size_t ExpectedAlignment = 8;
@@ -62,8 +65,6 @@ class Arena final {
 // it to be thus aligned.
 static_assert(alignof(Arena) == ExpectedAlignment);
 
-#include "CtxInstrContextNode.inc"
-
 // Verify maintenance to ContextNode doesn't change this invariant, which makes
 // sure the inlined vectors are appropriately aligned.
 static_assert(alignof(ContextNode) == ExpectedAlignment);
@@ -128,8 +129,7 @@ extern "C" {
 extern __thread void *volatile __llvm_ctx_profile_expected_callee[2];
 /// TLS where LLVM stores the pointer inside a caller's subcontexts vector that
 /// corresponds to the callsite being lowered.
-extern __thread __ctx_profile::ContextNode *
-    *volatile __llvm_ctx_profile_callsite[2];
+extern __thread ContextNode **volatile __llvm_ctx_profile_callsite[2];
 
 // __llvm_ctx_profile_current_context_root is exposed for unit testing,
 // othwerise it's only used internally by compiler-rt/ctx_profile.
@@ -138,10 +138,9 @@ extern __thread __ctx_profile::ContextRoot
 
 /// called by LLVM in the entry BB of a "entry point" function. The returned
 /// pointer may be "tainted" - its LSB set to 1 - to indicate it's scratch.
-__ctx_profile::ContextNode *
-__llvm_ctx_profile_start_context(__ctx_profile::ContextRoot *Root,
-                                 __ctx_profile::GUID Guid, uint32_t Counters,
-                                 uint32_t Callsites);
+ContextNode *__llvm_ctx_profile_start_context(__ctx_profile::ContextRoot *Root,
+                                              GUID Guid, uint32_t Counters,
+                                              uint32_t Callsites);
 
 /// paired with __llvm_ctx_profile_start_context, and called at the exit of the
 /// entry point function.
@@ -149,9 +148,9 @@ void __llvm_ctx_profile_release_context(__ctx_profile::ContextRoot *Root);
 
 /// called for any other function than entry points, in the entry BB of such
 /// function. Same consideration about LSB of returned value as .._start_context
-__ctx_profile::ContextNode *
-__llvm_ctx_profile_get_context(void *Callee, __ctx_profile::GUID Guid,
-                               uint32_t NrCounters, uint32_t NrCallsites);
+ContextNode *__llvm_ctx_profile_get_context(void *Callee, GUID Guid,
+                                            uint32_t NrCounters,
+                                            uint32_t NrCallsites);
 
 /// Prepares for collection. Currently this resets counter values but preserves
 /// internal context tree structure.
@@ -166,7 +165,7 @@ void __llvm_ctx_profile_free();
 /// The Writer's first parameter plays the role of closure for Writer, and is
 /// what the caller of __llvm_ctx_profile_fetch passes as the Data parameter.
 /// The second parameter is the root of a context tree.
-bool __llvm_ctx_profile_fetch(
-    void *Data, bool (*Writer)(void *, const __ctx_profile::ContextNode &));
+bool __llvm_ctx_profile_fetch(void *Data,
+                              bool (*Writer)(void *, const ContextNode &));
 }
 #endif // CTX_PROFILE_CTXINSTRPROFILING_H_
diff --git a/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
index 6b4dba3227a45..37d36dbb93791 100644
--- a/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
+++ b/compiler-rt/test/ctx_profile/TestCases/check-same-ctx-node.test
@@ -2,4 +2,4 @@
 ; NOTE: if this test fails, please make sure the two files are identical copies
 ; of eachother.
 ;
-; RUN: diff %crt_src/lib/ctx_profile/CtxInstrContextNode.inc %llvm_src/lib/ProfileData/CtxInstrContextNode.inc
+; RUN: diff %crt_src/lib/ctx_profile/CtxInstrContextNode.h %llvm_src/lib/ProfileData/CtxInstrContextNode.h
diff --git a/llvm/lib/ProfileData/CtxInstrContextNode.inc b/llvm/lib/ProfileData/CtxInstrContextNode.h
similarity index 80%
rename from llvm/lib/ProfileData/CtxInstrContextNode.inc
rename to llvm/lib/ProfileData/CtxInstrContextNode.h
index 06be172fe1503..1627bdfffd089 100644
--- a/llvm/lib/ProfileData/CtxInstrContextNode.inc
+++ b/llvm/lib/ProfileData/CtxInstrContextNode.h
@@ -1,18 +1,21 @@
-/*===- CtxInstrContextNode.inc- Contextual instrumentation-based PGO  -----===*\
-|*
-|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-|* See https://llvm.org/LICENSE.txt for license information.
-|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-|*
-\*===----------------------------------------------------------------------===*/
-
+//===--- CtxInstrContextNode.h - Contextual Profile Node --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
 //==============================================================================
 //
 // NOTE!
-// llvm/lib/ProfileData/CtxInstrContextNode.inc and
-//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.inc
+// llvm/lib/ProfileData/CtxInstrContextNode.h and
+//   compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
 // must be exact copies of eachother
 //
+// compiler-rt creates these objects as part of the instrumentation runtime for
+// contextual profiling. LLVM only consumes them to convert a contextual tree
+// to a bitstream.
+//
 //==============================================================================
 
 /// The contextual profile is a directed tree where each node has one parent. A
@@ -52,6 +55,14 @@
 /// to be concerned with memory safety. Their subcontexts never get populated,
 /// though. The runtime code here produces and recognizes them.
 
+#ifndef LLVM_LIB_PROFILEDATA_CTXINSTRCONTEXTNODE_H
+#define LLVM_LIB_PROFILEDATA_CTXINSTRCONTEXTNODE_H
+
+#include <stdint.h>
+#include <stdlib.h>
+
+namespace llvm {
+namespace ctx_profile {
 using GUID = uint64_t;
 
 class ContextNode final {
@@ -100,3 +111,6 @@ class ContextNode final {
 
   uint64_t entrycount() const { return counters()[0]; }
 };
+} // namespace ctx_profile
+} // namespace llvm
+#endif
\ No newline at end of file

>From 35cbfd40c9b7ccd6942c94d589a0d3259d5b52c4 Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 16:46:27 -0700
Subject: [PATCH 4/5] moved the test substitutions for paths more centrally

---
 compiler-rt/test/ctx_profile/lit.cfg.py | 3 ---
 compiler-rt/test/lit.common.cfg.py      | 6 ++++++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/compiler-rt/test/ctx_profile/lit.cfg.py b/compiler-rt/test/ctx_profile/lit.cfg.py
index d2c7550538ad2..a56dabb8ebeb3 100644
--- a/compiler-rt/test/ctx_profile/lit.cfg.py
+++ b/compiler-rt/test/ctx_profile/lit.cfg.py
@@ -29,6 +29,3 @@ def get_required_attr(config, attr_name):
 config.test_source_root = os.path.dirname(__file__)
 # Default test suffixes.
 config.suffixes = [".c", ".cpp", ".test"]
-
-config.substitutions.append(("%crt_src", config.compiler_rt_src_root))
-config.substitutions.append(("%llvm_src", config.llvm_src_root))
diff --git a/compiler-rt/test/lit.common.cfg.py b/compiler-rt/test/lit.common.cfg.py
index 28f126a11b16b..fae1d1686e569 100644
--- a/compiler-rt/test/lit.common.cfg.py
+++ b/compiler-rt/test/lit.common.cfg.py
@@ -987,3 +987,9 @@ def is_windows_lto_supported():
     gcc_dir = os.path.dirname(config.clang)
     libasan_dir = os.path.join(gcc_dir, "..", "lib" + config.bits)
     push_dynamic_library_lookup_path(config, libasan_dir)
+
+
+# Help tests that make sure certain files are in-sync between compiler-rt and
+# llvm.
+config.substitutions.append(("%crt_src", config.compiler_rt_src_root))
+config.substitutions.append(("%llvm_src", config.llvm_src_root))

>From 2aa9d9f1ec541a670093c41a5b075dd26a4d64fe Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Thu, 9 May 2024 16:51:01 -0700
Subject: [PATCH 5/5] formatting

---
 compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
index 9a1e643961e4b..c5d167bf996ab 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
@@ -288,8 +288,8 @@ void __llvm_ctx_profile_start_collection() {
   __sanitizer::Printf("[ctxprof] Initial NrMemUnits: %zu \n", NrMemUnits);
 }
 
-bool __llvm_ctx_profile_fetch(
-    void *Data, bool (*Writer)(void *W, const ContextNode &)) {
+bool __llvm_ctx_profile_fetch(void *Data,
+                              bool (*Writer)(void *W, const ContextNode &)) {
   assert(Writer);
   __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock(
       &AllContextsMutex);
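
For readers unfamiliar with how the `%crt_src` / `%llvm_src` substitutions from patch 4/5 feed the RUN line in check-same-ctx-node.test, here is a minimal sketch. It is not lit's actual implementation: it assumes substitutions are applied as plain string replacement in list order, and it uses a byte-for-byte file comparison as a stand-in for `diff` exiting with status 0. The helper names (`expand`, `copies_match`, `demo`) and the temporary directory layout are hypothetical, made up for illustration.

```python
import filecmp
import os
import tempfile


def expand(run_line, substitutions):
    """Apply (pattern, replacement) pairs in order (assumed: plain replacement)."""
    for pattern, replacement in substitutions:
        run_line = run_line.replace(pattern, replacement)
    return run_line


def copies_match(path_a, path_b):
    """Compare file contents; stands in for `diff` returning exit status 0."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def demo():
    """Build two fake source roots, expand the RUN line, and compare the copies."""
    with tempfile.TemporaryDirectory() as root:
        crt_src = os.path.join(root, "compiler-rt")
        llvm_src = os.path.join(root, "llvm")
        crt_copy = os.path.join(crt_src, "lib", "ctx_profile", "CtxInstrContextNode.h")
        llvm_copy = os.path.join(llvm_src, "lib", "ProfileData", "CtxInstrContextNode.h")

        # Write the same placeholder content to both copies.
        for path in (crt_copy, llvm_copy):
            os.makedirs(os.path.dirname(path))
            with open(path, "w") as f:
                f.write("using GUID = uint64_t;\n")

        # Same pairs that lit.common.cfg.py appends to config.substitutions.
        substitutions = [("%crt_src", crt_src), ("%llvm_src", llvm_src)]
        run_line = ("diff %crt_src/lib/ctx_profile/CtxInstrContextNode.h "
                    "%llvm_src/lib/ProfileData/CtxInstrContextNode.h")
        command = expand(run_line, substitutions)

        same_before = copies_match(crt_copy, llvm_copy)

        # Let one copy drift; the check should now report a mismatch.
        with open(llvm_copy, "a") as f:
            f.write("// drifted\n")
        same_after = copies_match(crt_copy, llvm_copy)

        return command, same_before, same_after


if __name__ == "__main__":
    command, same_before, same_after = demo()
    print(command)
    print("identical before edit:", same_before)
    print("identical after edit:", same_after)
```

Registering the two substitutions in lit.common.cfg.py rather than in the ctx_profile config means any compiler-rt test directory can write a similar in-sync check without repeating the setup.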


