[llvm] [llvm]Add a simple Telemetry framework (PR #102323)

Pavel Labath via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 31 03:00:20 PDT 2024


================
@@ -0,0 +1,136 @@
+//===- llvm/Telemetry/Telemetry.h - Telemetry -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file provides the basic framework for Telemetry
+///
+/// It comprises three important structs/classes:
+///
+/// - Telemeter: The class responsible for collecting and forwarding
+///              telemetry data.
+/// - TelemetryInfo: data courier
+/// - TelemetryConfig: this stores configurations on Telemeter.
+///
+/// Refer to its documentation at llvm/docs/Telemetry.rst for more details.
+//===---------------------------------------------------------------------===//
+
+#ifndef LLVM_TELEMETRY_TELEMETRY_H
+#define LLVM_TELEMETRY_TELEMETRY_H
+
+#include <chrono>
+#include <ctime>
+#include <memory>
+#include <optional>
+#include <string>
+#include <vector>
+
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/JSON.h"
+
+namespace llvm {
+namespace telemetry {
+
+/// Configuration for the Telemeter class.
+/// This stores configurations from both users and vendors and is passed
+/// to the Telemeter upon construction. (Any changes to the config after
+/// the Telemeter's construction will have no effect on it.)
+///
+/// This struct can be extended as needed to add additional configuration
+/// points specific to a vendor's implementation.
+struct Config {
+  // If true, telemetry will be enabled.
+  bool EnableTelemetry;
+
+  // Implementation-defined names of additional destinations to send
+  // telemetry data (Could be stdout, stderr, or some local paths, etc).
+  //
+  // These strings will be interpreted by the vendor's code.
+  // So the users must pick them from their vendor's pre-defined
+  // set of Destinations.
+  std::vector<std::string> AdditionalDestinations;
+};
+
+/// For isa, dyn_cast, etc. operations on TelemetryInfo.
+typedef unsigned KindType;
+/// This struct is used by TelemetryInfo to support isa<>, dyn_cast<>
+/// operations.
+/// It is defined as a struct (rather than an enum) because it is
+/// expected to be extended by subclasses which may have
+/// additional TelemetryInfo types defined to describe different events.
+struct EntryKind {
+  static const KindType Base = 0;
+};
+
+/// TelemetryInfo is the data courier, used to move instrumented data
+/// from the tool being monitored to the Telemetry framework.
+///
+/// This base class contains only the basic set of telemetry data.
+/// Downstream implementations can add more fields as needed to describe
+/// additional events.
+///
+/// For example, the LLDB debugger can define a DebugCommandInfo subclass
+/// which has additional fields about the debug-command being instrumented,
+/// such as `CommandArguments` or `CommandName`.
+struct TelemetryInfo {
+  // This represents a unique-id, conventionally corresponding to
+  // a tool's session - i.e., every time the tool starts until it exits.
+  //
+  // Note: a tool could have multiple sessions running at once, in which
+  // case, there shall be multiple sets of TelemetryInfo with multiple unique
+  // ids.
+  //
+  // Different usages can assign different types of IDs to this field.
+  std::string SessionId;
+
+  TelemetryInfo() = default;
+  virtual ~TelemetryInfo() = default;
+
+  virtual json::Object serializeToJson() const;
----------------
labath wrote:

> Regarding the details, I'd expect the theoretical `JsonSerializer` class to make use of the `llvm::json` interface, so converting the input types passed by the TelemetryInfo subclass `serialize` method into the appropriate Json types (aside: this is probably generic enough that it could live in the core LLVM code). Similarly, a `ProtobufSerializer` would do something equivalent. The keys are intended to be string literals typically that the `TelemetryInfo` subclasses specify in their `serialize` method - I'm not quite sure what you mean by the "right key string" in this context, as I'm not familiar with protobuf; it would help if you could elaborate some more what the issue here is.

Let me try to elaborate on this (I was discussing this with Vy last week). Protobufs are more strongly typed than JSON (you can basically think of them as structs which know how to serialize themselves to a byte stream). The natural way to work with them is through named accessors (`my_proto.set_field1(value1); my_proto.set_field2(value2)`). While they have a reflection API which allows accessing the fields using strings (which means a protobuf serializer could implement a virtual `write(StringRef key, ...)` function), it's a rather long-winded way of using them (you convert the field to a string in the TelemetryInfo object, and then immediately undo that in the reflection API). Since the vendor "knows" about all of the telemetry types it wants to collect (*), it could do it more directly with `my_proto.set_field1(info.field1)`.
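
To make the round trip concrete, here is a minimal sketch of the two routes. Everything in it is hypothetical and uses only the standard library: `CommandProto` merely stands in for a class generated from a `.proto` file, `writeByKey` stands in for what a reflection-backed `write(key, value)` might do, and `DebugCommandInfo` is a plain struct rather than a real `TelemetryInfo` subclass. It does not use the actual protobuf API.

```cpp
#include <string>

// Hypothetical stand-in for a protobuf-generated message class.
class CommandProto {
public:
  void set_command_name(const std::string &V) { CommandName = V; }
  void set_exit_code(int V) { ExitCode = V; }
  const std::string &command_name() const { return CommandName; }
  int exit_code() const { return ExitCode; }

private:
  std::string CommandName;
  int ExitCode = 0;
};

// Hypothetical telemetry entry (not the struct from the patch).
struct DebugCommandInfo {
  std::string CommandName;
  int ExitCode = 0;
};

// String-keyed route: the key is matched back to a typed accessor, which is
// roughly the work a reflection-based serializer would have to do.
// Returns false if the key names no known field.
bool writeByKey(CommandProto &P, const std::string &Key,
                const std::string &Value) {
  if (Key == "command_name") {
    P.set_command_name(Value);
    return true;
  }
  if (Key == "exit_code") {
    P.set_exit_code(std::stoi(Value));
    return true;
  }
  return false;
}

// The entry turns typed fields into (key, string) pairs, and the serializer
// immediately undoes that conversion.
CommandProto serializeViaKeys(const DebugCommandInfo &Info) {
  CommandProto P;
  writeByKey(P, "command_name", Info.CommandName);
  writeByKey(P, "exit_code", std::to_string(Info.ExitCode));
  return P;
}

// Direct route: vendor code that knows the entry type calls the accessors.
CommandProto serializeDirect(const DebugCommandInfo &Info) {
  CommandProto P;
  P.set_command_name(Info.CommandName);
  P.set_exit_code(Info.ExitCode);
  return P;
}
```

Both functions produce the same message; the keyed one just takes a detour through strings and can fail at run time on a misspelled key, which the direct one would catch at compile time.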

The second source of confusion (now I'm moving on to the "right key string" part) is that this `serializer.write` API (at least in the way I/we understand it) is only suitable for implementing simple dictionary types (`map<string, BasicType>`). It's not very suitable for more complex value types like lists or (sub)dictionaries, as you'd have to add a new (virtual) function for each new data type. (This can be avoided by creating some sort of a polymorphic value type that can hold arbitrary structured data, but at that point, we're sort of reinventing json::Value.)
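
As a rough illustration of that limitation, here is a sketch of how I understand the proposed interface. All names are made up for the example (`Serializer`, `FlatMapSerializer`, `CommandEvent`), and `std::string` is used in place of `StringRef` to keep it self-contained. With only basic-type overloads available, a list-valued field has to be flattened by hand or the interface has to grow another virtual function.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical key/value serializer interface with one virtual per value type.
class Serializer {
public:
  virtual ~Serializer() = default;
  virtual void write(const std::string &Key, const std::string &Value) = 0;
  virtual void write(const std::string &Key, int Value) = 0;
  // Supporting a list or a nested dictionary would mean adding another
  // virtual overload here and implementing it in every backend.
};

// Simple backend: everything ends up in a flat map<string, string>.
class FlatMapSerializer : public Serializer {
public:
  void write(const std::string &Key, const std::string &Value) override {
    Fields[Key] = Value;
  }
  void write(const std::string &Key, int Value) override {
    Fields[Key] = std::to_string(Value);
  }
  std::map<std::string, std::string> Fields;
};

// Hypothetical telemetry entry with one structured (list) field.
struct CommandEvent {
  std::string CommandName;
  int ExitCode = 0;
  std::vector<std::string> CommandArguments;

  void serialize(Serializer &S) const {
    S.write("command_name", CommandName);
    S.write("exit_code", ExitCode);
    // No list overload exists, so the structure is lost: the arguments are
    // joined into a single space-separated string.
    std::string Joined;
    bool First = true;
    for (const std::string &Arg : CommandArguments) {
      if (!First)
        Joined += ' ';
      Joined += Arg;
      First = false;
    }
    S.write("command_arguments", Joined);
  }
};
```

A value type that can hold strings, numbers, lists and nested objects would remove the need for the manual flattening, which is exactly the role `json::Value` already plays.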

(*) We assume that to be the case since the vendor needs to decide (based on its use case, data collection policy, etc.) whether a particular field can be collected or not.

https://github.com/llvm/llvm-project/pull/102323

