[llvm] [llvm]Add a simple Telemetry framework (PR #102323)

Fri Sep 27 01:04:38 PDT 2024

================
@@ -0,0 +1,707 @@
+//===- llvm/unittest/Telemetry/TelemetryTest.cpp - Telemetry unittests ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Telemetry/Telemetry.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/DataLayout.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/JSON.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+#include <chrono>
+#include <ctime>
+#include <vector>
+
+// Testing parameters.
+// These are set by each test to force certain outcomes.
+// Since the tests may run in parallel, each test will have
+// its own TestContext populated.
+struct TestContext {
+  // Controlling whether there should be an Exit error (if so, what the
+  // expected exit message/description should be).
+  bool HasExitError = false;
+  std::string ExitMsg = "";
+
+  // Controlling whether there is a vendor-provided config for
+  // Telemetry.
+  bool HasVendorConfig = false;
+
+  // Controlling whether the data should be sanitized.
+  bool SanitizeData = false;
+
+  // These two fields data emitted by the framework for later
+  // verifications by the tests.
+  std::string Buffer = "";
+  std::vector<llvm::json::Object> EmittedJsons;
+
+  // The expected Uuid generated by the fake tool.
+  std::string ExpectedUuid = "";
+};
+
+// This is set by the test body.
+static thread_local TestContext *CurrentContext = nullptr;
+
+namespace llvm {
+namespace telemetry {
+namespace vendor_code {
+
+// Generate unique (but deterministic "uuid" for testing purposes).
+static std::string nextUuid() {
+  static std::atomic<int> seed = 1111;
+  return std::to_string(seed.fetch_add(1, std::memory_order_acquire));
+}
+
+struct VendorEntryKind {
+  static const KindType VendorCommon = 168; // 0b010101000
+  static const KindType Startup = 169;      // 0b010101001
+  static const KindType Exit = 170;         // 0b010101010
+};
+
+// Describes the exit signal of an event.
+// This is used by TelemetryInfo below.
+struct ExitDescription {
+  int ExitCode;
+  std::string Description;
+};
+
+// Defines a convenient type for timestamp of various events.
+// This is used by the EventStats below.
+using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;
+
+// Various time (and possibly memory) statistics of an event.
+struct EventStats {
+  // REQUIRED: Start time of an event
+  SteadyTimePoint Start;
+  // OPTIONAL: End time of an event - may be empty if not meaningful.
+  std::optional<SteadyTimePoint> End;
+  // TBD: could add some memory stats here too?
+
+  EventStats() = default;
+  EventStats(SteadyTimePoint Start) : Start(Start) {}
+  EventStats(SteadyTimePoint Start, SteadyTimePoint End)
+      : Start(Start), End(End) {}
+};
+
+// Demonstrates that the TelemetryInfo (data courier) struct can be extended
+// by downstream code to store additional data as needed.
+// It can also define additional data serialization method.
+struct VendorCommonTelemetryInfo : public TelemetryInfo {
+  static bool classof(const TelemetryInfo *T) {
+    if (T == nullptr)
+      return false;
+    // Subclasses of this is also acceptable.
+    return (T->getKind() & VendorEntryKind::VendorCommon) ==
+           VendorEntryKind::VendorCommon;
----------------
labath wrote:

In my mind, the main advantage of my approach is that it allows for distributed creation of the class hierarchies -- there's no need for a central entity assigning bits. If all of the classes can be defined in a single place (e.g., one file), then it doesn't matter, but it can make a difference if the definitions are spread out over multiple files.

So, I'm leaving this up to you, with a note that this design is already sort of distributed the base entry type is defined here, and the implementations are going to be in other projects. It's not that bad, since this is only one entry here, and the implementations should not interact in principle, but it is there.

https://github.com/llvm/llvm-project/pull/102323