[LLVMdev] RFC: ThinLTO File Format

Mon Aug 3 09:19:07 PDT 2015

As discussed in the high-level ThinLTO RFC (
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-May/086211.html), we would
like to add support for native object wrapped bitcode and ThinLTO
information. Based on comments on the mailing list, I am adding support for
ThinLTO in both normal bitcode files and native-object wrapped bitcode.

The following RFC describes the planned file format of ThinLTO information
both in the bitcode-only and native object wrapped cases. It doesn't yet
define the exact record format, as I would like feedback on the overall
block design first.

I've also implemented the support for reading and writing the bitcode
blocks in the following patch:
http://reviews.llvm.org/D11722

The ThinLTO data structures and the file APIs are described in a separate
RFC I will be sending simultaneously, with pointers to the patches
implementing them.

Looking forward to your feedback. Thanks!
Teresa

ThinLTO File Format

Bitcode ThinLTO Support
    ThinLTO Bitcode Blocks
        THINLTO_SYMTAB_BLOCK
        THINLTO_MODULE_STRTAB_BLOCK
        THINLTO_FUNCTION_SUMMARY_BLOCK
    Bitcode Combined Function Summary
Native-Wrapped ThinLTO Support
    Native-wrapped bitcode
    Native-Wrapped Combined Function Summary

This document discusses the high-level file format used to represent
ThinLTO function index/summary information. It covers the index created at
the module level (produced by the phase-1 -c compiles) and the combined
function index/summary generated by the phase-2 linker step of a ThinLTO
compile. More information about ThinLTO compilation can be found in the
Updated RFC at:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-May/086211.html

As discussed in that document and subsequent mailing list discussions, we
will add support for ThinLTO with both normal bitcode-only intermediate
files and native-wrapped bitcode files. This document describes the
ThinLTO format for both of these file types. The formats were designed to
allow as much sharing of APIs and implementation as possible between the
two file types.

The ThinLTO information is written to the per-module (translation unit)
intermediate files during the phase-1 (-c) compile. It is read by the
phase-2 linker step, which aggregates it into a combined function
index/summary file and which by default does not need to parse the rest of
the module IR. The phase-3 parallel backend processes, each of which
compiles a single module into a final object file, read the combined
function index/summary file during importing but do not need to look at the
module’s own ThinLTO information.

Since, as noted above, the normal (non-ThinLTO) module IR and its ThinLTO
information are not typically needed in the same compile step, the
following design tries to minimize the required parsing of the normal
module bitcode IR when reading the ThinLTO information, and vice versa.

Bitcode ThinLTO Support

This section describes the representation of ThinLTO for bitcode-only
intermediate files.

ThinLTO Bitcode Blocks

There will be three ThinLTO bitcode blocks nested within an outer
THINLTO_BLOCK, which itself is nested within the outer MODULE_BLOCK:

<MODULE_BLOCK>
 ...
 <THINLTO_BLOCK BlockID=19 ...>
   <THINLTO_SYMTAB_BLOCK BlockID=20 ...>
   </THINLTO_SYMTAB_BLOCK>
   <THINLTO_MODULE_STRTAB_BLOCK BlockID=21 ...>
   </THINLTO_MODULE_STRTAB_BLOCK>
   <THINLTO_FUNCTION_SUMMARY_BLOCK BlockID=22 ...>
   </THINLTO_FUNCTION_SUMMARY_BLOCK>
 </THINLTO_BLOCK>
</MODULE_BLOCK>

These block IDs are defined along with other LLVM bitcode IDs in
include/llvm/Bitcode/LLVMBitCodes.h:

namespace llvm {
namespace bitc {
 // The only top-level block type defined is for a module.
 enum BlockIDs {
   // Blocks
   MODULE_BLOCK_ID          = FIRST_APPLICATION_BLOCKID,

   // Module sub-block id's
   ...
   THINLTO_BLOCK_ID,

   // ThinLTO sub-block id's.
   THINLTO_SYMTAB_BLOCK_ID,
   THINLTO_MODULE_STRTAB_BLOCK_ID,
   THINLTO_FUNCTION_SUMMARY_BLOCK_ID,
 };
} // end namespace bitc
} // end namespace llvm
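
To make the numbering concrete, the following sketch pins the four new IDs
to the values shown in the block diagram (19 through 22). The enum name, the
helper function and the explicit starting value are assumptions made for
illustration; in the real header the values follow implicitly from the
position of each enumerator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the new entries in bitc::BlockIDs. The values
// are taken from the BlockID annotations in the diagram of this RFC.
enum ThinLTOBlockIDs : uint32_t {
  THINLTO_BLOCK_ID = 19,
  THINLTO_SYMTAB_BLOCK_ID,           // 20
  THINLTO_MODULE_STRTAB_BLOCK_ID,    // 21
  THINLTO_FUNCTION_SUMMARY_BLOCK_ID, // 22
};

// True for the three blocks nested directly inside THINLTO_BLOCK.
constexpr bool isThinLTOSubBlock(uint32_t ID) {
  return ID >= THINLTO_SYMTAB_BLOCK_ID &&
         ID <= THINLTO_FUNCTION_SUMMARY_BLOCK_ID;
}
```

Because the sub-block IDs are contiguous, a reader can classify a block with
a simple range check once it has entered the THINLTO_BLOCK.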

The outer THINLTO_BLOCK will contain a record with the version ID of the
ThinLTO information, which will evolve as the importing algorithm is tuned.

The exact record formats within each of the THINLTO_*_BLOCKs are still TBD,
but the following is an overview of what they will contain:

THINLTO_SYMTAB_BLOCK

This block contains a record for each function in the summary. The record
will contain the ValueID of the corresponding function symbol in the
VALUE_SYMTAB_BLOCK (which contains the function’s name string), as well as
the bitcode offset of the corresponding function summary record. The latter
enables fast seeking when the function summary section is read lazily.
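
Since the record format is still TBD, the following is only a sketch of how
a reader might use such records. It assumes a hypothetical in-memory record
(SymtabRecord) and a ValueID-to-name map already recovered from the
VALUE_SYMTAB_BLOCK, and joins the two into a name-to-offset lookup table.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory form of one THINLTO_SYMTAB_BLOCK record.
struct SymtabRecord {
  uint32_t ValueID;       // ID of the function in the VALUE_SYMTAB_BLOCK
  uint64_t SummaryOffset; // bitcode offset of its function summary record
};

// Joins the symtab records with the (ValueID -> name) pairs recovered from
// the VALUE_SYMTAB_BLOCK. Records whose ValueID has no name are ignored.
std::unordered_map<std::string, uint64_t>
buildSummaryOffsetMap(const std::vector<SymtabRecord> &Records,
                      const std::unordered_map<uint32_t, std::string> &Names) {
  std::unordered_map<std::string, uint64_t> Result;
  for (const SymtabRecord &R : Records) {
    auto It = Names.find(R.ValueID);
    if (It != Names.end())
      Result[It->second] = R.SummaryOffset;
  }
  return Result;
}
```

With a table like this, seeking to a function's summary is a single lookup
followed by a jump to the stored offset.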

THINLTO_MODULE_STRTAB_BLOCK

This block contains a record for each module with functions in the combined
function index/summary file, holding the module ID and its path string (so
that the module can be located during phase-3 importing). This block is not
needed in the per-module function index/summary, as the module path is
known by the linker when the file is loaded. Additionally, the unique
module ID is assigned to each module by the phase-2 linker step when
creating the combined index (used to attain consistent renaming during
static promotion in the phase-3 backend).
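
As a rough model of this table, the sketch below assigns a unique ID to each
module path the first time it is seen. The class name and its methods are
hypothetical, and starting the IDs at 1 is an assumption made here so that 0
stays free for per-module summaries, which have no module string table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of the module path string table in the combined index.
class ModulePathTable {
  std::vector<std::string> Paths; // Paths[ID - 1] is the path for module ID

public:
  // Returns the existing ID for Path, or assigns the next unused one.
  uint64_t getOrAssignID(const std::string &Path) {
    for (std::size_t I = 0; I < Paths.size(); ++I)
      if (Paths[I] == Path)
        return I + 1;
    Paths.push_back(Path);
    return Paths.size();
  }

  // Looks up the path recorded for a previously assigned ID.
  const std::string &getPath(uint64_t ID) const {
    assert(ID >= 1 && ID <= Paths.size() && "unknown module ID");
    return Paths[ID - 1];
  }
};
```

The linear search keeps the sketch short; a real implementation would more
likely use a hash map keyed by path.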

THINLTO_FUNCTION_SUMMARY_BLOCK

This block contains a record for each function available for importing. At
a minimum, it holds the index into the THINLTO_MODULE_STRTAB_BLOCK of the
module containing the function, as well as the bitcode offset of the
function’s FUNCTION_BLOCK within that module. The
THINLTO_MODULE_STRTAB_BLOCK index will be 0 in the per-module function
summary, as that section does not exist yet, but will be non-zero in the
combined index/summary file (see Bitcode Combined Function Summary section
below). It also will be used to hold information about the function that is
useful in making importing decisions (e.g. its instruction count and
profile entry count).
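
The fields listed above could be modeled as in the following sketch. The
struct layout and field widths are assumptions, since the record format is
not yet defined, and the threshold-based predicate is purely illustrative:
this RFC does not specify how importing decisions are made.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical in-memory form of one function summary record.
struct FunctionSummary {
  uint64_t ModuleIndex;       // index into the module string table;
                              // 0 in a per-module summary
  uint64_t BitcodeOffset;     // offset of the FUNCTION_BLOCK in its module
  uint32_t InstCount;         // instruction count
  uint64_t ProfileEntryCount; // profile entry count
};

// A non-zero module index means the record came from the combined index.
inline bool isFromCombinedIndex(const FunctionSummary &S) {
  return S.ModuleIndex != 0;
}

// Illustrative heuristic only: prefer small functions that are entered
// often. Both thresholds are supplied by the caller.
inline bool looksWorthImporting(const FunctionSummary &S, uint32_t MaxInsts,
                                uint64_t MinEntryCount) {
  return S.InstCount <= MaxInsts && S.ProfileEntryCount >= MinEntryCount;
}
```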

There are several reasons for this block organization:

   1. Nesting the ThinLTO subblocks within a parent THINLTO_BLOCK allows the
      ThinLTO information to be quickly skipped during frontend parsing in
      the phase-3 parallel backend compile steps, when the ThinLTO
      information in the module is not needed.

   2. Nesting within the MODULE_BLOCK allows the THINLTO_SYMTAB_BLOCK
      records to share the function name strings with the MODULE_BLOCK’s
      function symbols. These strings are saved in the VALUE_SYMTAB_BLOCK
      nested within the MODULE_BLOCK. Note that it would be faster to parse
      ThinLTO blocks during the phase-2 linker step if they were not nested
      within the MODULE_BLOCK (which could be skipped in one step using the
      size in the MODULE_BLOCK entry), since the phase-2 parser is only
      interested in the ThinLTO blocks. But this block placement enables
      greater size efficiency.

   3. Separating the ThinLTO function symtab information from the rest of
      the function summary has a couple of benefits:

      1. Mirrors how the information is structured in the native-wrapped
         case, where the native object symbol table is leveraged for holding
         the symbol name plus index into the summary section. This in turn
         enables better sharing of the bitcode parsing code and interfaces
         (discussed in more detail below in the native-wrapped description).

      2. Enables lazy reading of the function’s summary information, delayed
         until we are considering importing that function, while allowing
         fast checking of whether the function is available for importing
         (via presence in the ThinLTO function symtab).
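
The lazy-reading benefit in the last item can be sketched as follows. The
class, the Summary struct and the ReadAt callback are all hypothetical; the
callback stands in for whatever code seeks to a bitcode offset and parses
one summary record.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

struct Summary {
  uint32_t InstCount;
};

class LazySummaryIndex {
  std::unordered_map<std::string, uint64_t> Offsets; // from the symtab
  std::unordered_map<std::string, Summary> Cache;    // parsed on demand
  std::function<Summary(uint64_t)> ReadAt;           // parses one record

public:
  LazySummaryIndex(std::unordered_map<std::string, uint64_t> SymtabOffsets,
                   std::function<Summary(uint64_t)> Reader)
      : Offsets(std::move(SymtabOffsets)), ReadAt(std::move(Reader)) {}

  // Fast availability check: consults only the symtab, never the summary.
  bool isAvailable(const std::string &Name) const {
    return Offsets.count(Name) != 0;
  }

  // Parses the summary on first request and caches it. Returns nullptr if
  // the function is not available for importing.
  const Summary *getSummary(const std::string &Name) {
    auto Off = Offsets.find(Name);
    if (Off == Offsets.end())
      return nullptr;
    auto It = Cache.find(Name);
    if (It == Cache.end())
      It = Cache.emplace(Name, ReadAt(Off->second)).first;
    return &It->second;
  }
};
```

Only functions that are actually considered for importing pay the cost of
having their summary record parsed.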

Because, as mentioned earlier, the THINLTO_BLOCK and the rest of the
MODULE_BLOCK are not typically both needed in a single compile step, we
will implement a ThinLTO-specific bitcode reader class
(ThinLTOBitcodeReader) to handle parsing of the ThinLTO blocks. This
bitcode reader will hold a pointer to the ThinLTO data structure to be
populated with the ThinLTO information (data structures described in a
separate “ThinLTO File API and Data Structures” RFC which should be sent
out at the same time). It will ignore all MODULE_BLOCK subblocks except the
THINLTO_BLOCK, the BLOCKINFO_BLOCK containing abbrev IDs, and the
VALUE_SYMTAB_BLOCK. The VALUE_SYMTAB_BLOCK parser is specialized and
simplified: since no Value objects are created during ThinLTO parsing, it
only needs to correlate each string with its ValueID in the
VALUE_SYMTAB_BLOCK record.
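
The filtering and the simplified symbol table handling might look roughly
like the sketch below. It is not the ThinLTOBitcodeReader from the patch:
the function names are hypothetical, the THINLTO_BLOCK ID is the one from
the diagram in this RFC, and the other two IDs and the [valueid, namechar x
N] entry layout are assumed to match LLVM's existing bitcode format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Block IDs used by this sketch (see the assumptions in the lead-in).
enum SketchBlockID : uint32_t {
  BLOCKINFO_BLOCK_ID = 0,
  VALUE_SYMTAB_BLOCK_ID = 14,
  THINLTO_BLOCK_ID = 19,
};

// Sub-blocks of MODULE_BLOCK that a ThinLTO-only reader enters. Every other
// sub-block would be skipped using the length stored in its block header.
inline bool shouldEnterSubBlock(uint32_t ID) {
  return ID == BLOCKINFO_BLOCK_ID || ID == VALUE_SYMTAB_BLOCK_ID ||
         ID == THINLTO_BLOCK_ID;
}

// Decodes a symbol table entry laid out as [valueid, namechar x N] into a
// (ValueID, name) pair without creating any Value object.
inline std::pair<uint64_t, std::string>
decodeSymtabEntry(const std::vector<uint64_t> &Record) {
  assert(!Record.empty() && "entry must start with a ValueID");
  std::string Name;
  for (std::size_t I = 1; I < Record.size(); ++I)
    Name.push_back(static_cast<char>(Record[I]));
  return std::make_pair(Record[0], Name);
}
```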

Bitcode Combined Function Summary

The combined function index/summary (thin archive) file created by the
phase-2 linker step will also be bitcode. It will consist of a MODULE_BLOCK
containing only a THINLTO_BLOCK, a BLOCKINFO_BLOCK, and a
VALUE_SYMTAB_BLOCK. The THINLTO_BLOCK will contain all three subblocks,
with the THINLTO_SYMTAB_BLOCK and the THINLTO_FUNCTION_SUMMARY_BLOCK
holding the aggregated per-module ThinLTO information. As noted earlier, it
will also contain a THINLTO_MODULE_STRTAB_BLOCK created from the linked
modules. The combined index will exclude symbols that are undefined,
duplicate (e.g. comdats) or unlikely to benefit from importing. The
THINLTO_FUNCTION_SUMMARY_BLOCK offsets in the THINLTO_SYMTAB_BLOCK records
are updated to reflect the new offset into the combined
THINLTO_FUNCTION_SUMMARY_BLOCK, and the THINLTO_FUNCTION_SUMMARY_BLOCK
records are updated to include the appropriate module index into the
THINLTO_MODULE_STRTAB_BLOCK.
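
The aggregation described above can be sketched as follows. All type and
function names are hypothetical, summary sizes are in arbitrary units, and
keeping the first definition of a duplicate symbol is an assumption; the
filtering of undefined and unprofitable symbols is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct PerModuleFunction {
  std::string Name;
  uint64_t FunctionBlockOffset; // offset of FUNCTION_BLOCK in its module
  uint64_t SummarySize;         // size of its summary record
};

struct PerModuleIndex {
  std::string ModulePath;
  std::vector<PerModuleFunction> Functions;
};

struct CombinedFunction {
  uint64_t ModuleIndex;         // 1-based index into ModulePaths
  uint64_t FunctionBlockOffset; // unchanged from the per-module index
  uint64_t SummaryOffset;       // new offset in the combined summary block
};

struct CombinedIndex {
  std::vector<std::string> ModulePaths;
  std::map<std::string, CombinedFunction> Functions;
};

CombinedIndex combineIndexes(const std::vector<PerModuleIndex> &Modules) {
  CombinedIndex Combined;
  uint64_t NextSummaryOffset = 0;
  for (const PerModuleIndex &M : Modules) {
    Combined.ModulePaths.push_back(M.ModulePath);
    const uint64_t ModuleIndex = Combined.ModulePaths.size();
    for (const PerModuleFunction &F : M.Functions) {
      if (Combined.Functions.count(F.Name))
        continue; // duplicate (e.g. comdat): keep the first definition
      Combined.Functions[F.Name] = CombinedFunction{
          ModuleIndex, F.FunctionBlockOffset, NextSummaryOffset};
      NextSummaryOffset += F.SummarySize;
    }
  }
  return Combined;
}
```

Each surviving function ends up with a non-zero module index and a summary
offset that is valid within the combined file rather than its original
module.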

Native-Wrapped ThinLTO Support

This section describes the representation of ThinLTO for native-wrapped
bitcode intermediate files. The discussion here uses ELF as an example, but
should also apply to other formats such as COFF and Mach-O [1].

Native-wrapped bitcode

There is already support in LLVM for reading native-wrapped bitcode, where
the bitcode is contained within an .llvmbc section. For ThinLTO, unlike in
the earlier bitcode-only case, the ThinLTO information is not nested within
the MODULE_BLOCK contained within the .llvmbc section. Instead, the native
object will contain a symbol table, and special sections holding the
additional ThinLTO information. These sections are the function summary
section (.llvm_thinlto_funcsum) containing the function’s bitcode offset
and summary information for importing decisions, as well as the module path
string table (.llvm_thinlto_modstrtab).

For simplicity and consistency with the bitcode-only format and interfaces,
the contents of the .llvm_thinlto_funcsum and .llvm_thinlto_modstrtab will
be encoded with bitcode. The .llvm_thinlto_modstrtab section will contain
bitcode for a single THINLTO_MODULE_STRTAB_BLOCK. The format and contents
of this block will be identical to the equivalent block in the bitcode-only
case. Similarly, the .llvm_thinlto_funcsum section will contain bitcode for
a single THINLTO_FUNCTION_SUMMARY_BLOCK. The format and contents will be
identical to the equivalent block in the bitcode-only case. Note, however,
that the bitcode offset for the FUNCTION_BLOCK is the offset within the
.llvmbc section bitcode (which contains the function IR).

As with the symbol table in a normal object file, the symbol table for the
native-object wrapped bitcode file will hold entries for both defined and
undefined but referenced symbols. The entries for functions defined in the
module specify the location of that function’s summary via the st_shndx
(index of .llvm_thinlto_funcsum section) and st_value (bitcode offset
within .llvm_thinlto_funcsum section). The st_size field will hold the size
of the function summary entry in the .llvm_thinlto_funcsum section. Note
that for functions that are deemed unlikely to benefit from importing (e.g.
large and cold), the summary data will be suppressed and the symtab entry
will simply have a zero offset and size.

The symbol’s visibility can be emitted in the st_other field which
typically holds the visibility info. If a tool such as objcopy or ld -r
modifies the symbol visibility, this change is recorded in the symbol
table. The change will be propagated to the bitcode when the backend
compiles the native-wrapped bitcode.

E.g.:

Section Headers:
 [Nr] Name              Type             Address           Offset
      Size              EntSize          Flags  Link  Info  Align
 [ 0]                   NULL             0000000000000000  00000000
      0000000000000000  0000000000000000           0     0     0
 [ 1] .shstrtab         STRTAB           0000000000000000  0000024b
      0000000000000059  0000000000000000           0     0     1
 [ 2] .text             PROGBITS         0000000000000000  00000040
      0000000000000000  0000000000000000  AX       0     0     16
…
 [ 5] .llvmbc           PROGBITS         0000000000000000  00000040
      000000000000044c  0000000000000000   E       0     0     4
 [ 6] .llvm_thinlto_funcsum PROGBITS     0000000000000000  00000040
      0000000000000400  0000000000000000   E       0     0     4
 [ 7] .llvm_thinlto_modstrtab PROGBITS   0000000000000000  00000440
      0000000000000013  0000000000000000   E       0     0     4

Symbol table '.symtab' contains 11 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS t1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
     8: 0000000000000040    40 FUNC    GLOBAL DEFAULT    6 bar
     9: 0000000000000000    40 FUNC    GLOBAL DEFAULT    6 foo
    10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND blah

The section index and value in the symtab entry for ‘foo’ are 6 and 0x0,
respectively, meaning that the function summary info for ‘foo’ can be found
in section 6 (.llvm_thinlto_funcsum) at offset 0x0. Similarly, the function
summary info for ‘bar’ can be found in .llvm_thinlto_funcsum at offset
0x40. The size refers to the size of the corresponding function summary
entry.
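
The lookup described above can be sketched as follows, using the three
global symbols from the example symbol table. SymbolEntry is a hypothetical
simplification of an ELF64 symbol table entry with the name already
resolved, and treating a zero size as "summary suppressed" is an assumption
based on the zero offset and size described earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified symbol table entry: only the fields used for summary lookup.
struct SymbolEntry {
  std::string Name;
  uint16_t st_shndx; // section index; 0 means undefined (UND)
  uint64_t st_value; // offset within the .llvm_thinlto_funcsum section
  uint64_t st_size;  // size of the function summary entry
};

struct SummaryLocation {
  uint64_t Offset;
  uint64_t Size;
};

// Finds where Name's summary lives in the function summary section. Returns
// false for unknown symbols, symbols defined elsewhere or undefined, and
// symbols whose summary was suppressed (zero size).
bool findSummary(const std::vector<SymbolEntry> &Symtab,
                 const std::string &Name, uint16_t FuncSumSectionIndex,
                 SummaryLocation &Loc) {
  for (const SymbolEntry &S : Symtab) {
    if (S.Name != Name)
      continue;
    if (S.st_shndx != FuncSumSectionIndex || S.st_size == 0)
      return false;
    Loc.Offset = S.st_value;
    Loc.Size = S.st_size;
    return true;
  }
  return false;
}
```

For the example above, looking up ‘bar’ with section index 6 yields offset
0x40 and size 40, while ‘blah’ is rejected because it is undefined.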

Native-Wrapped Combined Function Summary

The combined function summary (thin archive) file created by the phase-2
linker step can also be in native object format. It will contain the symbol
table, and just the .llvm_thinlto_funcsum and .llvm_thinlto_modstrtab
sections, combined across all of the linked modules. The combined symbol
table, .llvm_thinlto_funcsum and .llvm_thinlto_modstrtab sections will
exclude symbols that are undefined, duplicate (e.g. comdats) or unlikely to
benefit from importing. The offsets in the symbol table are updated to
reflect the new offset into the .llvm_thinlto_funcsum section, and the
.llvm_thinlto_funcsum updated to include the appropriate module path index
in the new .llvm_thinlto_modstrtab section.

________________
[1] COFF and Mach-O symbol tables have similar fields. The main differences
are that Mach-O symbol table entries don’t contain the symbol size, so the
size is deduced by looking for the next symbol start address, and COFF
holds the symbol sizes in auxiliary info that follows each symbol entry.
See also http://www.delorie.com/djgpp/doc/coff/symtab.html and
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/index.html#//apple_ref/c/tag/nlist_64

-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413