[llvm] [DTLTO][LLVM] Integrated Distributed ThinLTO (DTLTO) (PR #127749)

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue May 20 21:08:41 PDT 2025


================
@@ -0,0 +1,178 @@
+===================
+DTLTO
+===================
+.. contents::
+   :local:
+   :depth: 2
+
+.. toctree::
+   :maxdepth: 1
+
+Distributed ThinLTO (DTLTO)
+===========================
+
+Distributed ThinLTO (DTLTO) enables the distribution of backend ThinLTO
+compilations via external distribution systems, such as Incredibuild, during the
+link step.
+
+DTLTO extends the existing ThinLTO distribution support, which uses separate
+*thin-link*, *backend compilation*, and *link* steps. This method is documented
+here:
+
+    https://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html
+
+Using the *separate thin-link* approach requires a build system, such as Bazel,
+that can handle the dynamic dependencies specified in the individual summary
+index files. DTLTO removes this requirement, so it can be used with any build
+process that supports in-process ThinLTO.
+
+The following commands show the steps used for the *separate thin-link*
+approach for a basic example:
+
+.. code-block:: console
+
+    1. clang -flto=thin -O2 t1.c t2.c -c
+    2. clang -flto=thin -O2 t1.o t2.o -fuse-ld=lld -Wl,--thinlto-index-only
+    3. clang -O2 -o t1.native.o t1.o -c -fthinlto-index=t1.o.thinlto.bc
+    4. clang -O2 -o t2.native.o t2.o -c -fthinlto-index=t2.o.thinlto.bc
+    5. clang t1.native.o t2.native.o -o a.out -fuse-ld=lld
+
+With DTLTO, steps 2-5 are performed internally as part of the link step. The
+equivalent DTLTO commands for the above are:
+
+.. code-block:: console
+
+    clang -flto=thin -O2 t1.c t2.c -c
+    clang -flto=thin -O2 t1.o t2.o -fuse-ld=lld -fthinlto-distributor=<distributor_process>
+
+For DTLTO, LLD prepares the following for each ThinLTO backend compilation job:
+
+- An individual index file and a list of input and output files (corresponds to
+  step 2 above).
+- A Clang command line to perform the ThinLTO backend compilations.
+
+This information is supplied, via a JSON file, to ``distributor_process``, which
+executes the backend compilations using a distribution system (corresponds to
+steps 3 and 4 above). Upon completion, LLD integrates the compiled native object
+files into the link process and completes the link (corresponds to step 5
+above).
+
+This design keeps the details of distribution systems out of the LLVM source
+code.
+
+An example distributor that performs all work on the local system is included in
+the LLVM source tree. To run an example with that distributor, a command line
+such as the following can be used:
+
+.. code-block:: console
+
+   clang -flto=thin -fuse-ld=lld -O2 t1.o t2.o -fthinlto-distributor=$(which python3) \
+     -Xthinlto-distributor=$LLVMSRC/llvm/utils/dtlto/local.py
+
+Distributors
+------------
+
+Distributors are programs responsible for:
+
+1. Consuming the JSON file that describes the backend compilation jobs.
+2. Translating job descriptions into requests for the distribution system.
+3. Blocking execution until all backend compilations are complete.
+
+Distributors must return a non-zero exit code on failure. They can be
+implemented as platform native executables or in a scripting language, such as
+Python.
+
+Clang and LLD provide options to specify a distributor program for managing
+backend compilations. Options for the distributor and for the backend
+compilations can also be specified; these are forwarded transparently.
+
+The backend compilations are currently performed by invoking Clang. For further
+details, refer to:
+
+* Clang documentation: https://clang.llvm.org/docs/ThinLTO.html
+* LLD documentation: https://lld.llvm.org/DTLTO.html
+
+When invoked with a distributor, LLD generates a JSON file describing the
+backend compilation jobs and executes the distributor, passing it this file.
+
+JSON Schema
+-----------
+
+The JSON format is explained by reference to the following example, which
+describes the backend compilation of the modules ``t1.o`` and ``t2.o``:
+
+.. code-block:: json
+
+    {
+        "common": {
+            "linker_output": "dtlto.elf",
+            "args": ["/usr/bin/clang", "-O2", "-c", "-fprofile-sample-use=my.prof"],
+            "inputs": ["my.prof"]
+        },
+        "jobs": [
+            {
+                "args": ["t1.o", "-fthinlto-index=t1.o.thinlto.bc", "-o", "t1.native.o", "-fproc-stat-report=t1.stats.txt"],
+                "inputs": ["t1.o", "t1.o.thinlto.bc"],
+                "outputs": ["t1.native.o", "t1.stats.txt"]
+            },
+            {
+                "args": ["t2.o", "-fthinlto-index=t2.o.thinlto.bc", "-o", "t2.native.o", "-fproc-stat-report=t2.stats.txt"],
+                "inputs": ["t2.o", "t2.o.thinlto.bc"],
+                "outputs": ["t2.native.o", "t2.stats.txt"]
+            }
+        ]
+    }
+
+Each entry in the ``jobs`` array represents a single backend compilation job.
+Each job object records its own command-line arguments and input/output files.
+Shared arguments and inputs are defined once in the ``common`` object.
+
+Reserved Entries:
+
+- The first entry in the ``common.args`` array specifies the compiler
+  executable to invoke.
+- The first entry in each job's ``inputs`` array is the bitcode file for the
+  module being compiled.
+- The second entry in each job's ``inputs`` array is the corresponding
+  individual summary index file.
+- The first entry in each job's ``outputs`` array is the primary output object
----------------
teresajohnson wrote:

Maybe just note some of this in a comment (specifically about only the first entry being a reserved name, and the part from the last paragraph above about supporting LTO options implying additional output files).
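
As a rough illustration of how a distributor might consume the reserved entries described in the quoted text, here is a minimal Python sketch. It is not the in-tree `local.py`. The helper names (`build_commands`, `primary_output`, `run_jobs`) are hypothetical, and it assumes that a job's full command line is `common.args` followed by that job's own `args`, which the quoted documentation suggests but does not state outright.

```python
import subprocess


def build_commands(description):
    """Return one argv list per backend compilation job.

    Assumption: the full command line is the shared ``common.args``
    (whose first entry is the compiler executable) followed by the
    job's own ``args``.
    """
    common_args = description["common"]["args"]
    return [common_args + job["args"] for job in description["jobs"]]


def primary_output(job):
    # Reserved entry: the first item of "outputs" is the primary output object.
    return job["outputs"][0]


def run_jobs(description, runner=subprocess.run):
    """Run each job locally, blocking until all are done.

    Returns 0 if every job succeeded and 1 on the first failure, so a
    caller can use the result as the distributor's exit code. ``runner``
    is a hypothetical hook that makes the sketch testable without
    invoking a compiler.
    """
    for argv in build_commands(description):
        if runner(argv).returncode != 0:
            return 1
    return 0
```

A real distributor would load the description with `json.load` from the file that LLD passes to it, hand the jobs to the distribution system instead of running them one by one, and exit with the value that `run_jobs` returns.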

https://github.com/llvm/llvm-project/pull/127749

