[llvm] [docs][mlgo] Document custom builds (PR #141243)
Mircea Trofin via llvm-commits
llvm-commits at lists.llvm.org
Fri May 23 08:34:31 PDT 2025
https://github.com/mtrofin created https://github.com/llvm/llvm-project/pull/141243
None
>From f85f96715ba92ed17b6e8aa4067114418144e110 Mon Sep 17 00:00:00 2001
From: Mircea Trofin <mtrofin at google.com>
Date: Fri, 23 May 2025 08:32:15 -0700
Subject: [PATCH] [docs][mlgo] Document custom builds
---
llvm/docs/MLGO.rst | 66 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/llvm/docs/MLGO.rst b/llvm/docs/MLGO.rst
index 15d71c9d77506..fa4b02cb11be7 100644
--- a/llvm/docs/MLGO.rst
+++ b/llvm/docs/MLGO.rst
@@ -522,3 +522,69 @@ advanced usage, please refer to the original paper:
`IR2Vec: LLVM IR Based Scalable Program Embeddings <https://doi.org/10.1145/3418463>`_.
The LLVM source code for ``IR2Vec`` can also be explored to understand the
implementation details.
+
+Building with ML support
+========================
+
+**NOTE** For up-to-date information on custom builds, see the ``ml-*``
+`build bots <http://lab.llvm.org>`_. They are set up using
+`this script <https://github.com/google/ml-compiler-opt/blob/main/buildbot/buildbot_init.sh>`_.
+
+Embed pre-trained models (aka "release" mode)
+---------------------------------------------
+
+This supports the ``ReleaseModeModelRunner`` model runners.
+
+You need the tensorflow pip package, which provides the AOT (ahead-of-time)
+Saved Model compiler and a thin wrapper for the native functions it generates.
+We currently support TF 2.15. We recommend using a python virtual env (in which
+case, remember to pass ``-DPython3_ROOT_DIR`` to ``cmake``).
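+
+For example, a minimal sketch of the recommended setup (the virtual env
+location and the exact version pin are illustrative):
+
+.. code-block:: console
+
+ python3 -m venv /tmp/tf-venv
+ /tmp/tf-venv/bin/pip install tensorflow==2.15.*
+ # later, pass -DPython3_ROOT_DIR=/tmp/tf-venv to the LLVM cmake invocation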
+
+Once you install the pip package, find where it was installed:
+
+.. code-block:: console
+
+ TF_PIP=$(python3 -c "import tensorflow as tf; import os; print(os.path.dirname(tf.__file__))")
+
+Then build LLVM:
+
+.. code-block:: console
+
+ cmake -DTENSORFLOW_AOT_PATH=$TF_PIP \
+ -DLLVM_INLINER_MODEL_PATH=<path to inliner saved model dir> \
+ -DLLVM_RAEVICT_MODEL_PATH=<path to regalloc eviction saved model dir> \
+ <...other options...>
+
+The example shows the flags for both inlining and regalloc, but either may be
+omitted.
+
+The model path may also be a URL. It is also possible to pre-compile the header
+and object and point the build at the precompiled artifacts. See, for example,
+``LLVM_OVERRIDE_MODEL_HEADER_INLINERSIZEMODEL``.
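+
+As a sketch, pointing the build at precompiled artifacts for the inliner model
+could look like the following (the paths are illustrative, and
+``LLVM_OVERRIDE_MODEL_OBJECT_INLINERSIZEMODEL`` is assumed to be the companion
+variable naming the object file):
+
+.. code-block:: console
+
+ cmake -DLLVM_OVERRIDE_MODEL_HEADER_INLINERSIZEMODEL=/path/to/InlinerSizeModel.h \
+   -DLLVM_OVERRIDE_MODEL_OBJECT_INLINERSIZEMODEL=/path/to/InlinerSizeModel.o \
+   <...other options...>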
+
+**Note** that we are transitioning away from the AOT compiler shipping with the
+tensorflow package, and to an EmitC-based, in-tree solution, so these details
+will change soon.
+
+Using TFLite (aka "development" mode)
+-------------------------------------
+
+This supports the ``ModelUnderTrainingRunner`` model runners.
+
+Build the TFLite package using `this script <https://raw.githubusercontent.com/google/ml-compiler-opt/refs/heads/main/buildbot/build_tflite.sh>`_.
+Then, assuming you ran that script in ``/tmp/tflitebuild``, pass
+``-C /tmp/tflitebuild/tflite.cmake`` to the ``cmake`` invocation for LLVM, as
+shown below.
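+
+A minimal sketch (the build directory and the other options are illustrative):
+
+.. code-block:: console
+
+ cmake -C /tmp/tflitebuild/tflite.cmake <...other options...>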
+
+Interactive Mode (for training / research)
+------------------------------------------
+
+The ``InteractiveModelRunner`` is available with no extra dependencies. For the
+optimizations that are currently MLGO-enabled, it may be used as follows:
+
+- for inlining: ``-mllvm -enable-ml-inliner=release -mllvm -inliner-interactive-channel-base=<name>``
+- for regalloc eviction: ``-mllvm -regalloc-evict-advisor=release -mllvm -regalloc-evict-interactive-channel-base=<name>``
+
+where ``<name>`` is a path fragment. Two files are expected: ``<name>.in``
+(readable, data incoming from the managing process) and ``<name>.out``
+(writable, the model runner sends data to the managing process).
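+
+As a minimal sketch, assuming the managing process has created the two channels
+as named pipes with the base path ``/tmp/interactive`` (the compilation will
+block until the managing process services them):
+
+.. code-block:: console
+
+ mkfifo /tmp/interactive.in /tmp/interactive.out
+ clang -O2 -c test.c \
+   -mllvm -enable-ml-inliner=release \
+   -mllvm -inliner-interactive-channel-base=/tmp/interactive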
+