[llvm] [docs][mlgo] Document `MLModelRunner` (PR #139205)
Mircea Trofin via llvm-commits
llvm-commits at lists.llvm.org
Fri May 9 07:39:46 PDT 2025
================
@@ -1,28 +1,174 @@
-====
-MLGO
-====
+=============================================
+Machine Learning - Guided Optimization (MLGO)
+=============================================
Introduction
============
-MLGO is a framework for integrating ML techniques systematically in LLVM. It is
-designed primarily to replace heuristics within LLVM with machine learned
-models. Currently there is upstream infrastructure for the following
-heuristics:
+MLGO refers to integrating ML techniques into LLVM, primarily to replace
+heuristics within LLVM with machine learned models.
+
+Currently the following heuristics feature such integration:
* Inlining for size
* Register allocation (LLVM greedy eviction heuristic) for performance
-This document is an outline of the tooling that composes MLGO.
+This document is an outline of the tooling and APIs facilitating MLGO.
+
+Note that tools for orchestrating ML training are not part of LLVM, as they are
+dependency-heavy - both on the ML infrastructure choice and on the choice of
+distributed computing. For the training scenario, LLVM only contains facilities
+enabling it, such as corpus extraction, training data extraction, and evaluation
+of models during training.
+
+
+.. contents::
Corpus Tooling
==============
..
TODO(boomanaiden154): Write this section.
-Model Runner Interfaces
-=======================
+Interacting with ML models
+==========================
+
+We interact with ML models in two primary scenarios. One is training such a
+model. The other, inference, is using a trained model during compilation to make
+optimization decisions.
+
+For a specific optimization problem - e.g. inlining, or regalloc eviction - we
+first separate correctness-preserving decisions from optimization decisions. For
+example, not inlining functions marked "no inline" is an example of the former;
+so is not evicting an unevictable live range. An example of the latter is
+deciding to inline a function that will bloat the caller size, because we have
+reason to believe that subsequent constant propagation will actually reduce the
+size (or dynamic instruction count).
+
+ML models can be understood as functions. Their inputs are tensors - buffers of
+scalars. The output (in our case, singular) is a scalar. For example, for
+inlining, the inputs are properties of the caller, callee, and the callsite
+being analyzed for inlining. The output is a boolean.
+
+Inputs and outputs are named, and have a scalar type (e.g. ``int32_t``) and a
+shape (e.g. 3x4). These are the elements that we use to bind to an ML model.
+
+In both training and inference, we want to expose to the ML side (training
+algorithms or trained model, respectively) the features on which we want to base
+optimization decisions. In that regard, the interface from the compiler side to
+the ML side is the same: pass features, and get a decision. It's essentially a
+function call, where the parameters and result are bound by name and are
+described by (name, scalar type, shape) tuples.
+
+The main types in LLVM are:
+
+- ``MLModelRunner`` - an abstraction for the decision making mechanism
+- ``TensorSpec`` - describes a tensor.
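+
+To make the "function call" analogy concrete, here is a rough sketch of what the
+compiler ("user") side could look like. The ``shouldInline`` wrapper, the feature
+names, and the particular indices below are purely illustrative; only the
+``getTensor`` / ``evaluate`` accessors come from ``MLModelRunner``:
+
+.. code-block:: c++
+
+  #include "llvm/Analysis/MLModelRunner.h"
+  #include <cstdint>
+
+  // Illustrative only: ask an already-constructed runner for an inlining
+  // decision. Feature positions correspond to the order of the TensorSpecs
+  // the runner was constructed with (hypothetical names in the comments).
+  bool shouldInline(llvm::MLModelRunner &Runner, int64_t CalleeSize,
+                    int64_t CallSiteCount) {
+    *Runner.getTensor<int64_t>(0) = CalleeSize;    // e.g. "callee_size"
+    *Runner.getTensor<int64_t>(1) = CallSiteCount; // e.g. "callsite_count"
+    // The "call": evaluate the model and read back the scalar output.
+    return Runner.evaluate<int64_t>() != 0;
+  }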
+
+TensorSpec
+----------
+
+See ``llvm/Analysis/TensorSpec.h``. This is a simple data bag, identifying a
+tensor by name (a string), scalar type, and shape (a vector of ints). The scalar
+type can only be int (8, 16, 32, or 64), signed or unsigned; float; or double.
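+
+For illustration, here is a hedged sketch of creating such specs with the
+``TensorSpec::createSpec<T>`` factory; the feature names and shapes below are
+made up:
+
+.. code-block:: c++
+
+  #include "llvm/Analysis/TensorSpec.h"
+
+  #include <cstdint>
+  #include <vector>
+
+  // Two hypothetical input features - a scalar int64 count and a 3x4 float
+  // matrix - plus a scalar int64 decision output. createSpec<T> captures the
+  // name, the scalar type (via T), and the shape.
+  std::vector<llvm::TensorSpec> InputSpecs{
+      llvm::TensorSpec::createSpec<int64_t>("callee_basic_block_count", {1}),
+      llvm::TensorSpec::createSpec<float>("some_matrix_feature", {3, 4})};
+  llvm::TensorSpec DecisionSpec =
+      llvm::TensorSpec::createSpec<int64_t>("inlining_decision", {1});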
+
+MLModelRunner
+-------------
+
+See ``llvm/Analysis/MLModelRunner.h``. The abstraction has a pure virtual method,
+``evaluateUntyped``, but the contract with implementers is a bit more involved:
+
+Implementers
+^^^^^^^^^^^^
+
+At construction, the implementer is expected to receive a list of ``TensorSpec``
+for input features and the ``TensorSpec`` of the output (e.g.
+``std::vector<TensorSpec>``). The list type is not contractual, but it must be an
+array-like container with 0-based indexing. In the order of appearance in the input
+list, for a ``TensorSpec`` with a name "N", shape D1xD2x...Dn, and scalar type
----------------
mtrofin wrote:
reworded, ptal
https://github.com/llvm/llvm-project/pull/139205