[llvm] [llvm]Add a simple Telemetry framework (PR #102323)

Vy Nguyen via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 29 10:54:05 PDT 2024


================
@@ -0,0 +1,136 @@
+//===- llvm/Telemetry/Telemetry.h - Telemetry -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file provides the basic framework for Telemetry
+///
+/// It comprises of three important structs/classes:
+///
+/// - Telemeter: The class responsible for collecting and forwarding
+///              telemery data.
+/// - TelemetryInfo: data courier
+/// - TelemetryConfig: this stores configurations on Telemeter.
+///
+/// Refer to its documentation at llvm/docs/Telemetry.rst for more details.
+//===---------------------------------------------------------------------===//
+
+#ifndef LLVM_TELEMETRY_TELEMETRY_H
+#define LLVM_TELEMETRY_TELEMETRY_H
+
+#include <chrono>
+#include <ctime>
+#include <memory>
+#include <optional>
+#include <string>
+
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/JSON.h"
+
+namespace llvm {
+namespace telemetry {
+
+/// Configuration for the Telemeter class.
+/// This stores configurations from both users and vendors and is passed
+/// to the Telemeter upon construction. (Any changes to the config after
+/// the Telemeter's construction will not have effect on it).
+///
+/// This struct can be extended as needed to add additional configuration
+/// points specific to a vendor's implementation.
+struct Config {
+  // If true, telemetry will be enabled.
+  bool EnableTelemetry;
+
+  // Implementation-defined names of additional destinations to send
+  // telemetry data (Could be stdout, stderr, or some local paths, etc).
+  //
+  // These strings will be interpreted by the vendor's code.
+  // So the users must pick the from their vendor's pre-defined
+  // set of Destinations.
+  std::vector<std::string> AdditionalDestinations;
+};
+
+/// For isa, dyn_cast, etc operations on TelemetryInfo.
+typedef unsigned KindType;
+/// This struct is used by TelemetryInfo to support isa<>, dyn_cast<>
+/// operations.
+/// It is defined as a struct (rather than an enum) because it is
+/// expectend to be extended by subclasses which may have
+/// additional TelemetryInfo types defined to describe different events.
+struct EntryKind {
+  static const KindType Base = 0;
+};
+
+/// TelemetryInfo is the data courier, used to move instrumented data
+/// from the tool being monitored to the Telemery framework.
+///
+/// This base class contains only the basic set of telemetry data.
+/// Downstream implementations can add more fields as needed to describe
+/// additional events.
+///
+/// For example, The LLDB debugger can define a DebugCommandInfo subclass
+/// which has additional fields about the debug-command being instrumented,
+/// such as `CommandArguments` or `CommandName`.
+struct TelemetryInfo {
+  // This represents a unique-id, conventionally corresponding to
+  // a tool's session - i.e., every time the tool starts until it exits.
+  //
+  // Note: a tool could have multiple sessions running at once, in which
+  // case, these shall be multiple sets of TelemetryInfo with multiple unique
+  // ids.
+  //
+  // Different usages can assign different types of IDs to this field.
+  std::string SessionId;
+
+  TelemetryInfo() = default;
+  virtual ~TelemetryInfo() = default;
+
+  virtual json::Object serializeToJson() const;
----------------
oontvoo wrote:

> I'm wondering if we've got slightly different visions of things and it's driving the way we're looking at the design differently. 

I think we have very similar ideas of how this ought to work, but there's some miscommunication probably. (ie., , your second paragraph describes pretty much how it is implemented, at least for LLDB's telemetry)


> Reading between the lines of some of the comments and responses you've made, I get the impression you expect either all concrete `TelemetryInfo` subclasses to be defined by vendors, or at least the all downstream (i.e. all) `Destination` classes to know how to convert them to whatever the Destination wants. 

 I want to point out that we should not conflate vendor-specific  code  with tool-specific code.
While it is true that I do not propose any concrete implementation here in `llvm::telemetry`, there *is* a concrete implementation in `lldb::telemetry`. 

So - there are two levels of (potential) overrides here:
 - One: by a tool that wants to use telemetry (eg., LLDB, clang, etc). 
 - Two: by different vendors (for the same tool).

I would expect, the bulk of code implementing subclasses of `TelemetryInfo` to be *upstream* in the individual tool's directory. That is to say, all of LLDB's subclasses of `TelemetryInfo` would live in `lldb/telemetry/...` (but not here in `llvm/telemetry`, because, as pointed out before, different tools care about different events and concepts).

Then, different vendors (for the same tool) will provide the concrete implementation of `Destination`. In this sense, I believe the `Destination` class does exactly what you envision the `Serializer` class would (ie., it takes a `TelemetryInfo` object and picks some or all of the data from that object and forward them elsewhere).

In other words, the only vendor-defined pieces are the concrete implementation of `Destination`
If you would like to have some share utils (eg., how to take a `TelemetryInfo` and turns it into a json object, you could still do that by putting the code in the tool's upstream directory, then your vendor-specific code can call into that)



https://github.com/llvm/llvm-project/pull/102323


More information about the llvm-commits mailing list