[llvm] [llvm]Add a simple Telemetry framework (PR #102323)

Vy Nguyen via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 23 14:42:50 PDT 2024


================
@@ -0,0 +1,136 @@
+//===- llvm/Telemetry/Telemetry.h - Telemetry -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file provides the basic framework for Telemetry
+///
+/// It comprises of three important structs/classes:
+///
+/// - Telemeter: The class responsible for collecting and forwarding
+///              telemery data.
+/// - TelemetryInfo: data courier
+/// - TelemetryConfig: this stores configurations on Telemeter.
+///
+/// Refer to its documentation at llvm/docs/Telemetry.rst for more details.
+//===---------------------------------------------------------------------===//
+
+#ifndef LLVM_TELEMETRY_TELEMETRY_H
+#define LLVM_TELEMETRY_TELEMETRY_H
+
+#include <chrono>
+#include <ctime>
+#include <memory>
+#include <optional>
+#include <string>
+
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/JSON.h"
+
+namespace llvm {
+namespace telemetry {
+
+/// Configuration for the Telemeter class.
+/// This stores configurations from both users and vendors and is passed
+/// to the Telemeter upon construction. (Any changes to the config after
+/// the Telemeter's construction will not have effect on it).
+///
+/// This struct can be extended as needed to add additional configuration
+/// points specific to a vendor's implementation.
+struct Config {
+  // If true, telemetry will be enabled.
+  bool EnableTelemetry;
+
+  // Implementation-defined names of additional destinations to send
+  // telemetry data (Could be stdout, stderr, or some local paths, etc).
+  //
+  // These strings will be interpreted by the vendor's code.
+  // So the users must pick the from their vendor's pre-defined
+  // set of Destinations.
+  std::vector<std::string> AdditionalDestinations;
+};
+
+/// For isa, dyn_cast, etc operations on TelemetryInfo.
+typedef unsigned KindType;
+/// This struct is used by TelemetryInfo to support isa<>, dyn_cast<>
+/// operations.
+/// It is defined as a struct (rather than an enum) because it is
+/// expectend to be extended by subclasses which may have
+/// additional TelemetryInfo types defined to describe different events.
+struct EntryKind {
+  static const KindType Base = 0;
+};
+
+/// TelemetryInfo is the data courier, used to move instrumented data
+/// from the tool being monitored to the Telemery framework.
+///
+/// This base class contains only the basic set of telemetry data.
+/// Downstream implementations can add more fields as needed to describe
+/// additional events.
+///
+/// For example, The LLDB debugger can define a DebugCommandInfo subclass
+/// which has additional fields about the debug-command being instrumented,
+/// such as `CommandArguments` or `CommandName`.
+struct TelemetryInfo {
+  // This represents a unique-id, conventionally corresponding to
+  // a tool's session - i.e., every time the tool starts until it exits.
+  //
+  // Note: a tool could have multiple sessions running at once, in which
+  // case, these shall be multiple sets of TelemetryInfo with multiple unique
+  // ids.
+  //
+  // Different usages can assign different types of IDs to this field.
+  std::string SessionId;
+
+  TelemetryInfo() = default;
+  virtual ~TelemetryInfo() = default;
+
+  virtual json::Object serializeToJson() const;
----------------
oontvoo wrote:

I'd like to clarify the rationale of this a bit. Previously(see comments on Sept 3), you raised an issue with the design of `Destination` class where you said it tried to do two things  ("decide how and where to send the data"). To address that, I've taken the "how" (ie.,  "how to serialize" )part out and put it into this TelemetryInfo class. Now in retrospect, maybe the documentation should have been written simply as "The Destination class is responsible for receiving telemetry data" (where the data eventually go AFTER that is its own implementation). Would that have addressed your concern?

Second point, w.r.t your concern about the explosion of `(serialization-methods) x (final destinations)`, I think that should be an implementation detail on the `Destination` class(es). If the vendor wants to delegate the serialization-work to some common util to be shared amongst different Destinations(which, of course, they should), then they can do that - but it doesn't need to be explicitly specified here in the upstream interface.

Third point, your proposal for having the `Serializer` class does not work well for LLDB (and for our internal) use cases:
 - If we want to write structured-data, then we'd end up replicating most of the `llvm::json` interface
 - If we want to serialize to protobuf, which is what we would use internally at Google, then it's not quite straightforward because the code here would have to tease out the right key string.

This was also why I initially wanted to "hide" the serialization step inside `Destination`. The idea is that the `Destination` class would take an Info object directly, pick out what fields it wants,  and put them into whatever serialization form defined by the vendor. (Again, the idea of sharing the serialization code would still be possible, but it's an implementation detail and not part of the interface).

Does this sound reasonable?

https://github.com/llvm/llvm-project/pull/102323


More information about the llvm-commits mailing list