[llvm] [llvm]Add a simple Telemetry framework (PR #102323)

Fri Nov 1 08:48:50 PDT 2024

================
@@ -0,0 +1,136 @@
+//===- llvm/Telemetry/Telemetry.h - Telemetry -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file provides the basic framework for Telemetry
+///
+/// It comprises of three important structs/classes:
+///
+/// - Telemeter: The class responsible for collecting and forwarding
+///              telemery data.
+/// - TelemetryInfo: data courier
+/// - TelemetryConfig: this stores configurations on Telemeter.
+///
+/// Refer to its documentation at llvm/docs/Telemetry.rst for more details.
+//===---------------------------------------------------------------------===//
+
+#ifndef LLVM_TELEMETRY_TELEMETRY_H
+#define LLVM_TELEMETRY_TELEMETRY_H
+
+#include <chrono>
+#include <ctime>
+#include <memory>
+#include <optional>
+#include <string>
+
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/JSON.h"
+
+namespace llvm {
+namespace telemetry {
+
+/// Configuration for the Telemeter class.
+/// This stores configurations from both users and vendors and is passed
+/// to the Telemeter upon construction. (Any changes to the config after
+/// the Telemeter's construction will not have effect on it).
+///
+/// This struct can be extended as needed to add additional configuration
+/// points specific to a vendor's implementation.
+struct Config {
+  // If true, telemetry will be enabled.
+  bool EnableTelemetry;
+
+  // Implementation-defined names of additional destinations to send
+  // telemetry data (Could be stdout, stderr, or some local paths, etc).
+  //
+  // These strings will be interpreted by the vendor's code.
+  // So the users must pick the from their vendor's pre-defined
+  // set of Destinations.
+  std::vector<std::string> AdditionalDestinations;
+};
+
+/// For isa, dyn_cast, etc operations on TelemetryInfo.
+typedef unsigned KindType;
+/// This struct is used by TelemetryInfo to support isa<>, dyn_cast<>
+/// operations.
+/// It is defined as a struct (rather than an enum) because it is
+/// expectend to be extended by subclasses which may have
+/// additional TelemetryInfo types defined to describe different events.
+struct EntryKind {
+  static const KindType Base = 0;
+};
+
+/// TelemetryInfo is the data courier, used to move instrumented data
+/// from the tool being monitored to the Telemery framework.
+///
+/// This base class contains only the basic set of telemetry data.
+/// Downstream implementations can add more fields as needed to describe
+/// additional events.
+///
+/// For example, The LLDB debugger can define a DebugCommandInfo subclass
+/// which has additional fields about the debug-command being instrumented,
+/// such as `CommandArguments` or `CommandName`.
+struct TelemetryInfo {
+  // This represents a unique-id, conventionally corresponding to
+  // a tool's session - i.e., every time the tool starts until it exits.
+  //
+  // Note: a tool could have multiple sessions running at once, in which
+  // case, these shall be multiple sets of TelemetryInfo with multiple unique
+  // ids.
+  //
+  // Different usages can assign different types of IDs to this field.
+  std::string SessionId;
+
+  TelemetryInfo() = default;
+  virtual ~TelemetryInfo() = default;
+
+  virtual json::Object serializeToJson() const;
----------------
jh7370 wrote:

Thanks for the explanations - I now see what you're thinking.

> I want to point out that we should not conflate vendor-specific code with tool-specific code.

This is a fair point. The key thing this means is that code in the top-level framework that is designed to be common needs to be able to work with subclasses defined in tool-level and vendor code. In particular, I think it's reasonable to expect there to be TelemetryInfo subclasses defined in common LLVM code (so for example a `DurationEvent` might be something desired by LLD, clang, llvm-objdump etc), in tools (for events specific to particular tools like LLDB) and in vendor code (for events that downstream vendors want which might not be applicable upstream).

I think it's also reasonable to assume that more event types will be written over time (and existing events will be extended). What I think we want to avoid though is for the need for every `Destination` (or some other class) to need to be extended every time a new event type is implemented. Indeed, it's going to be near impossible for downstream vendors to keep tabs on additions to upstream events in the current structure, I think, so they won't know whether there's a need to review new fields etc, with the current design. I guess the flipside of this is that if some vendors need to review fields for legal reasons, we need to make it clear to them that something has changed that needs review. Mind you, I think there'd still be cases where vendors need to review things without any changes in TelemetryInfo classes at all: imagine the case where someone changes the thing used to populate a field upstream, e.g. to use full paths instead of filenames - for some vendors, having the full path may not be acceptable.

Sorry, that last paragraph is a bit rambly, but I'm leaving it as-is, so that you can see some of the thoughts going through my head. 

Ultimately, I think my point is that the advantage of having something like `TelemetryInfo::serialize` is that the `Destination` classes don't necessarily all need modifying every time a new event field or type is added. I'm open to other ways of how this might be achieved.

I'll acknowledge that something strongly typed like protobuf doesn't work well with the `serialize` method approach, because of the extra indirection. However, you could have `ProtobufDestination` not use the `serialize` approach at all and dispatch by switching based on type, as described. If we allowed both options, it would admittedly mean the `serialize` method wouldn't always be used, but it would allow some `Destination` types to make use of it.

> The second source of confusion (now I'm moving on to the "right key string" part) is that this `serializer.write` API (at least in the way I/we understand it) is only suitable for implementing simple dictionary types (`map<string, BasicType>`). It's not very suitable for more complex value types like lists or (sub)dictionaries, as you'd have to add a new (virtual) function for each new data type. (This can be avoided by creating some sort of a polymorphic value type that can hold arbitrary structured data, but at that point, we're sort of reinventing json::Value.)

FWIW, in our downstream implementation, our `Serializer` base class declares `startArray`/`endArray`/`startObject`/`endObject` methods that our concrete serializers then implement for their different types. There's also a concrete method defined in the base class for serializing iterable types like arrays and vectors, and another for maps (the latter currently only supports `std::unordered_map`, but there's no reason it couldn't be extended to all map-like objects). Clearly for other complex object types, you'd need to define how to serialize that object type, though I've only rarely needed to resort to this in our downstream code.

https://github.com/llvm/llvm-project/pull/102323