[compiler-rt] [llvm] [docs][IRPGO]Document two binary formats for IRPGO profiles (PR #76105)

via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 20 14:39:36 PST 2023


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-pgo

Author: Mingming Liu (minglotus-6)

<details>
<summary>Changes</summary>

A preview of the HTML is in this [link](https://htmlpreview.github.io/?https://github.com/minglotus-6/llvm-project/blob/preview/docs/PGOProfileFormat.html).
- The source html was pushed to a branch in my fork. And https://htmlpreview.github.io/ previews it.

---
Full diff: https://github.com/llvm/llvm-project/pull/76105.diff


2 Files Affected:

- (added) llvm/docs/PGOProfileFormat.rst (+387) 
- (modified) llvm/docs/UserGuides.rst (+4) 


``````````diff
diff --git a/llvm/docs/PGOProfileFormat.rst b/llvm/docs/PGOProfileFormat.rst
new file mode 100644
index 00000000000000..5602172e147f00
--- /dev/null
+++ b/llvm/docs/PGOProfileFormat.rst
@@ -0,0 +1,387 @@
+=====================
+IRPGO Profile Format
+=====================
+
+.. contents::
+   :local:
+
+
+Overview
+==========
+
+IR-based instrumentation (IRPGO) and its context-sensitive variant (CS-IRPGO)
+inserts `llvm.instrprof.*` `code generator intrinsics <https://llvm.org/docs/LangRef.html#code-generator-intrinsics>`_
+in LLVM IR to generate profiles. This document describes two binary profile
+formats (raw and indexed) used by IR-based instrumentation.
+
+.. note::
+
+  Both the compiler-rt profiling infrastructure and profile format are general
+  and could support other use cases (e.g., coverage and temporal profiling).
+  This document will focus on IRPGO while briefly introducing other use cases
+  with pointers.
+
+Raw PGO Profile Format
+========================
+
+The raw PGO profile is generated by running the instrumented binary. It is a
+memory dump of the profile data.
+
+Two kinds of frequently used profile information are function's basic block
+counters and its (various flavors of) value profiles. A function's profiled
+information span across several sections in the profile.
+
+General Storage Layout
+-----------------------
+
+A raw profile for an executable [1]_ consists of a profile header and several
+sections. The storage layout is illustrated below. Generally, when raw profile
+is read into an memory buffer, the actual byte offset of a section is inferred
+from the section's order in the layout and size information of all sections
+ahead of it.
+
+::
+
+  +----+-----------------------+
+  |    |        Magic          |
+  |    +-----------------------+
+  |    |        Version        |
+  |    +-----------------------+
+  H    |   Size Info for       |
+  E    |      Section 1        |
+  A    +-----------------------+
+  D    |   Size Info for       |
+  E    |      Section 2        |
+  R    +-----------------------+
+  |    |          ...          |
+  |    +-----------------------+
+  |    |   Size Info for       |
+  |    |      Section N        |
+  +----+-----------------------+
+  P    |       Section 1       |
+  A    +-----------------------+
+  Y    |       Section 2       |
+  L    +-----------------------+
+  O    |          ...          |
+  A    +-----------------------+
+  D    |       Section N       |
+  +----+-----------------------+
+
+
+.. note::
+   Sections might be padded to meet platform-specific alignment requirements.
+   For simplicity, header fields and data sections solely for padding purpose
+   are omitted in the data layout graph above and the rest of this document.
+
+Header
+-------
+
+``Magic``
+  With the magic number, data consumer could detect profile format and
+  endianness of the data, and quickly tells whether/how to continue reading.
+
+``Version``
+  The lower 32 bits specifies the actual version and the most significant 32
+  bits specify the variant types of the profile. IRPGO and CS-IRPGO are two
+  variant types.
+
+``BinaryIdsSize``
+  The byte size of binary id section.
+
+``NumData``
+  The number of per-function profile data control structures. The byte size of
+  profile data section could be computed with this field.
+
+``NumCounter``
+  The number of entries in the profile counter section. The byte size of counter
+  section could be computed with this field.
+
+``NumBitmapBytes``
+  The number of bytes in the profile bitmap section.
+
+``NamesSize``
+  The number of bytes in the name section.
+
+``CountersDelta``
+  Records the in-memory address difference between the data and counter section,
+  i.e., `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. It's used jointly
+  with the in-memory address difference of profile data record and its counter
+  to find the counter of a profile data record. Check out calculation-of-counter-offset_
+  for details.
+
+``BitmapDelta``
+  Records the in-memory address difference between the data and bitmap section,
+  i.e., `start(__llvm_prf_bits) - start(__llvm_prf_data)`. It's used jointly
+  with the in-memory address difference of a profile data record and its bitmap
+  to find the bitmap of a profile data record, in a similar to how counters are
+  referenced as explained by calculation-of-counter-offset_ .
+
+``NamesDelta``
+  Records the in-memory address of compressed name section. Not used except for
+  raw profile reader error checking.
+
+``ValueKindLast``
+  Records the number of value kinds. As of writing, two kinds of value profiles
+  are supported. `IndirectCallTarget` is to profile the frequent callees of
+  indirect call instructions and `MemOPSize` is for memory intrinsic function
+  size profiling.
+
+  The number of value kinds affects the byte size of per function profile data
+  control structure.
+
+Payload Sections
+------------------
+
+Binary Ids
+^^^^^^^^^^^
+Stores the binary ids of the instrumented binaries to associate binaries with
+profiles for source code coverage. See `Binary Id RFC`_ for introduction.
+
+.. _`Binary Id RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html
+
+Profile Data
+^^^^^^^^^^^^^
+
+This section stores per-function profile data control structure. The in-memory
+representation of the control structure is `__llvm_profile_data` and the fields
+are defined by `INSTRPROFDATA` macro. Some fields are used to reference data
+from other sections in the profile. The fields are documented as follows:
+
+``NameRef``
+  The MD5 of the function's IRPGO name. IRPGO name has the format
+  `[<filepath>;]<linkage-name>` where `<filepath>;` is provided for local-linkage
+  functions to tell possibly identical function names.
+
+``FuncHash``
+  A fingerprint of the function's control flow graph.
+
+``CounterPtr``
+  The in-memory address difference between profile data and its corresponding counters.
+
+``BitmapPtr``
+  The in-memory address difference between profile data and its bitmap.
+
+``FunctionPointer``
+  Records the function address when instrumented binary runs. This is used to
+  map the profiled callee address of indirect calls to the `NameRef` during
+  conversion from raw to indexed profiles.
+
+``Values``
+  Represents value profiles in a two dimensional array. The number of elements
+  in the first dimension is the number of instrumented value sites across all
+  kinds. Each element in the first dimension is the head of a linked list, and
+  the each element in the second dimension is linked list element, carrying
+  `<profiled-value, count>` as payload. This is used by compiler runtime when
+  writing out value profiles.
+
+``NumCounters``
+  The number of counters for the instrumented function.
+
+``NumValueSites``
+  This is an array of counters, and each counter represents the number of
+  instrumented sites for a kind of value in the function.
+
+``NumBitmapBytes``
+  The number of bitmap bytes for the function.
+
+Profile Counters
+^^^^^^^^^^^^^^^^^
+
+For IRPGO [2]_, the counters within an instrumented function are stored contiguously
+and in an order that is consistent with basic block selection in the instrumentation
+pass.
+
+.. _calculation-of-counter-offset:
+
+So how are function counters associated with a function?
+
+Basically, the profile reader iterates per-function control structure (from the
+profile data section) and makes use of the recorded relative distances, as
+illustrated below.
+
+::
+
+        + --> start(__llvm_prf_data) --> +---------------------+ ------------+
+        |                                |       Data 1        |             |
+        |                                +---------------------+  =====||    |
+        |                                |       Data 2        |       ||    |
+        |                                +---------------------+       ||    |
+        |                                |        ...          |       ||    |
+ Counter|                                +---------------------+       ||    |
+  Delta |                                |       Data N        |       ||    |
+        |                                +---------------------+       ||    |   CounterPtr1
+        |                                                              ||    |
+        |                                              CounterPtr2     ||    |
+        |                                                              ||    |
+        |                                                              ||    |
+        + --> start(__llvm_prf_cnts) --> +---------------------+       ||    |
+                                         |        ...          |       ||    |
+                                         +---------------------+  -----||----+
+                                         |      Counter 1      |       ||
+                                         +---------------------+       ||
+                                         |        ...          |       ||
+                                         +---------------------+  =====||
+                                         |      Counter 2      |
+                                         +---------------------+
+                                         |        ...          |
+                                         +---------------------+
+                                         |      Counter N      |
+                                         +---------------------+
+
+
+In the graph,
+
+* The profile header records `CounterDelta` with the value as `start(__llvm_prf_cnts) - start(__llvm_prf_data)`.
+  We will call it `CounterDeltaInitVal` below for convenience.
+* For each profile data record, `CounterPtrN` is recorded as `start(Counter) - start(ProfileData)`.
+
+Each time the reader advances to the next data record, it updates `CounterDelta` to minus the size of one `ProfileData`.
+
+For the counter corresponding to the first data record, the byte offset
+relative to the start of the counter section is calculated as `CounterPtr1 - CounterDeltaInitVal`.
+When profile reader advances to the second data record, note `CounterDelta` is now `CounterDeltaInitVal - sizeof(ProfileData)`.
+Thus the byte offset relative to the start of the counter section is calculated as `CounterPtr2 - (CounterDeltaInitVal - sizeof(ProfileData))`.
+
+Bitmap
+^^^^^^^
+This section is used for source-based MC/DC code coverage. Check out `Bitmap RFC`_
+if interested.
+
+.. _`Bitmap RFC`: https://discourse.llvm.org/t/rfc-source-based-mc-dc-code-coverage/59244
+
+Names
+^^^^^^
+
+This section contains the concatenated string of function IRPGO names. If
+compressed, zlib compression algorithm is used.
+
+Function names serve as keys in the PGO data hash table when raw profiles are
+converted into indexed profiles. They are also crucial for `llvm-profdata` to
+show the profiles in a human-readable way.
+
+Value Profile Data
+^^^^^^^^^^^^^^^^^^^^
+
+This section contains the profile data for value profiling.
+
+The value profiles corresponding to a profile data are serialized contiguously
+as one record, and value profile records are stored in the same order as the
+respective profile data, such that a raw profile reader advances the pointer to
+profile data and the pointer to value profile records simutaneously [3]_ to find
+value profiles for a per function, per cfg fingerprint profile data.
+
+Indexed PGO Profile Format
+===========================
+
+General Storage Layout
+-----------------------
+
+::
+
+                            +-----------------------+---+
+                            |        Magic          |   |
+                            +-----------------------+   |
+                            |        Version        |   |
+                            +-----------------------+   |
+                            |        HashType       |   H
+                            +-----------------------+   E
+                    +-------|       HashOffset      |   A
+                    |       +-----------------------+   D
+                +-----------|     MemProfOffset     |   E
+                |   |       +-----------------------+   R
+                |   |       |     BinaryIdOffset    |   |
+                |   |       +-----------------------+   |
+            +---------------|      TemporalProf-    |   |
+            |   |   |       |      TracesOffset     |   |
+            |   |   |       +-----------------------+---+
+            |   |   |       |   Profile Summary     |   |
+            |   |   |       +-----------------------+   P
+            |   |   +------>|  Function PGO data    |   A
+            |   |           +-----------------------+   Y
+            |   +---------- |  MemProf profile data |   L
+            |               +-----------------------+   O
+            |               |    Binary Ids         |   A
+            |               +-----------------------+   D
+            +-------------->|  Temporal profiles    |   |
+                            +-----------------------+---+
+
+Header
+--------
+
+``Magic``
+  The purpose of the magic number is to be able to quickly tell if the profile
+  is an indexed profile.
+
+``Version``
+  Similar to raw profile version, the lower 32 bits specifies the version of the
+  indexed profile and the most significant 32 bits are reserved to specify the
+  variant types of the profile.
+
+``HashType``
+  The hashing scheme for on-disk hash table keys. Only MD5 hashing is used as of
+  writing.
+
+``HashOffset``
+  An on-disk hash table stores the per-function profile records.
+  Precisely speaking, `HashOffset` records the offset of this hash table's
+  metadata (i.e., the number of buckets and entries), which follows right after
+  the payload of the entire hash table.
+
+``MemProfOffset``
+  Records the byte offset of MemProf profiling data.
+
+``BinaryIdOffset``
+  Records the byte offset of binary id sections.
+
+``TemporalProfTracesOffset``
+  Records the byte offset of temporal profiles.
+
+Payload Sections
+------------------
+
+(CS) Profile Summary
+^^^^^^^^^^^^^^^^^^^^^
+This section is right after profile header. It stores the serialized profile
+summary. For context-sensitive IRPGO, this section stores an additional profile
+summary corresponding to the context-sensitive profiles.
+
+Function PGO data
+^^^^^^^^^^^^^^^^^^
+This section stores functions and their PGO profiling data as an on-disk hash
+table. The key of a hash table entry is function's PGO name, and the in-memory
+representation of value is a map. The key of this map is CFG hash, and the value
+is C++ struct `llvm::InstrProfRecord`. The C++ struct collects the profiling
+information like counters and value profiles.
+
+MemProf Profile data
+^^^^^^^^^^^^^^^^^^^^^^
+This section stores function's memory profiling data. See
+`MemProf binary serialization format RFC`_ for the design.
+
+.. _`MemProf binary serialization format RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
+
+Binary Ids
+^^^^^^^^^^^^^^^^^^^^^^
+The section to carry on binary-id information from raw profiles.
+
+Temporal Profile Traces
+^^^^^^^^^^^^^^^^^^^^^^^^
+The section to carry on temporal profile information from raw profiles.
+See `Temporal profiling RFC`_ for an overview.
+
+.. _`Temporal profiling RFC`: https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
+
+Profile Data Usage
+=======================================
+
+`llvm-profdata` is the command line tool to display and process profile data.
+For supported usages, check out its `documentation <https://llvm.org/docs/CommandGuide/llvm-profdata.html>`_.
+
+
+.. [1] A raw profile file could contain multiple raw profiles. Raw profile
+   reader could parse all raw profiles from the file correctly.
+.. [2] The counter section is used by a few variant types (like coverage and
+   temporal profiling) and might have different semantics there.
+.. [3] The step size of data pointer is the `sizeof(ProfileData)`, and the step
+   size of value profile pointer is calcuated based on the number of collected
+   values.
diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst
index 006df613bc5e7d..14a2e161ea54cb 100644
--- a/llvm/docs/UserGuides.rst
+++ b/llvm/docs/UserGuides.rst
@@ -58,6 +58,7 @@ intermediate LLVM representation.
    NVPTXUsage
    Phabricator
    Passes
+   PGOProfileFormat
    ReportingGuide
    ResponseGuide
    Remarks
@@ -177,6 +178,9 @@ Optimizations
    referencing, to determine variable locations for debug info in the final
    stages of compilation.
 
+:doc:`PGOProfileFormat`
+   This document explains two binary formats of IRPGO profiles.
+
 Code Generation
 ---------------
 

``````````

</details>


https://github.com/llvm/llvm-project/pull/76105


More information about the llvm-commits mailing list