[llvm] [MLGO][Docs] Add documentation on corpus tooling (PR #139362)

Mircea Trofin via llvm-commits llvm-commits at lists.llvm.org
Sat May 10 08:16:15 PDT 2025


================
@@ -18,8 +18,161 @@ This document is an outline of the tooling that composes MLGO.
 Corpus Tooling
 ==============
 
-..
-    TODO(boomanaiden154): Write this section.
+Within upstream LLVM, there is the ``mlgo-utils`` Python package that lives at
+``llvm/utils/mlgo-utils``. This package primarily contains tooling for working
+with corpora, or collections of LLVM bitcode. We use these corpora to 
+
+.. program:: extract_ir.py
+
+Synopsis
+--------
+
+Extracts a corpus from some form of a structured compilation database. This
----------------
mtrofin wrote:

Explain what the corpus structure is, and what it guarantees (bit-identical .o to your original compilation...)

https://github.com/llvm/llvm-project/pull/139362

