[llvm-dev] A Section-Metadata-Based Approach for Mapping Binary Addresses to Machine Basic Blocks
Rahman Lavaee via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 17 13:30:12 PDT 2020
Greetings,
TLDR: We propose emitting a new section in the binary which would contain
information required to associate executable PC addresses to basic blocks.
The new proposal decouples the BB information from the symbol table,
allowing for more flexibility (stripping the section, adding new BB
information fields). Furthermore, the new approach cuts the binary size
overhead by ~60% compared to the current implementation (using special BB
symbols to pass the information).
Background
Today Propeller (proposed in
https://lists.llvm.org/pipermail/llvm-dev/2019-September/135393.html) uses
the *-fbasicblock-sections=labels* option to create a symbol for every
basic block to be able to map instruction addresses to individual basic
blocks. To lower the .strtab bloat, these symbols are encoded using a unary
naming scheme which allows for string compression. For instance, a function
“foo” with four basic blocks will have three basic block labels (the entry
block doesn’t need a BB label since it can be found using the function
symbol).
foo:
…
je ra.BB.foo
a.BB.foo:
…
je rra.BB.foo
ra.BB.foo:
…
ret
rra.BB.foo:
…
ret
While this serves the functionality of mapping addresses to basic blocks,
it poses several challenges:
1.
Unary symbol names are great for compression, but require changes to the
demangler for readability.
2.
Stripping BB symbols from the binary requires special linker support as
they are placed in the symbol table along with other symbols.
3.
Profilers and debuggers need to be changed to accommodate these symbols.
For instance, the profiles from all BBs of a function must be aggregated
and the debugger must be able to map to the function symbol, rather than
the BB symbol, to show a function-level backtrace.
4.
Relying on the ELF symbol format provides little flexibility for
extension of BB information for passing other information about basic
blocks (e.g., whether this is a return block, or an exception handling
block). Today Propeller encodes this information using special characters
in the symbol name (‘r’ for return blocks as in the above example).
The BB Info Section Design
To solve these problems, we propose emitting the BB information in a
separate “.bb_info” section. Each function will have its own BB info table
emitted into this section. The structure of this table is as follows. The
table has a header which includes the address of the function and the
number of basic blocks in that function.
+-----------------------------------------------+
| Address of the function | -> 8 bytes (pointer
size)
+-----------------------------------------------+
| Number of basic blocks in this function (>0) | -> ULEB128
+-----------------------------------------------+
The header is followed by the BB information record for every basic block.
These records are ordered in the same order as MBBs are placed in the
function. Each BB Info is structured as follows:
+-------------------------------------------------------+
| Offset of the basic block relative to function begin | -> ULEB128
+-------------------------------------------------------+
| Binary size of the basic block | -> ULEB128
+-------------------------------------------------------+
| BB metadata |
| [MBB.isReturn() | MBB.hasTailCall() << 1 | MBB.isEHPad() << 2 ] | ->
ULEB128
+-------------------------------------------------------+
The BB record for the entry basic block excludes the offset field, as it is
always zero.
Along with this information, Propeller needs the corresponding MBB number
of every BB info record. Instead of emitting another field in the record,
we implicitly derive the index assuming that MBBs are renumbered from 0 to
N-1 prior to bb-info section emission.
Remapping PC addresses to BBs using the .bb_info section
Propeller can use the BB Info table, along with function addresses to map
sampled PCs to their corresponding basic blocks. Given an address X, the
workflow is as follows:
1.
Use the symbol table to find the function F containing X.
2.
Lookup the BB info table for function F.
3.
Use the offset fields of the BB info table to find the BB which contains
address X.
Complexity
All steps can use binary search.
1.
Step 1 uses a binary search into the symbol table and has time
complexity of O(lg N), with N being the number of symbols in the binary.
2.
Step 2 also uses binary search to find the BB info table corresponding
to a function address and it has time complexity of O(lg M), with M being
the number of functions.
3.
Finally, step 3 uses a binary search into the BB info table to find the
offset containing the address offset relative to the function entry and has
time complexity of O(lg N_F), with N_F being the number of basic blocks in
function F.
Advantage over the BB symbols approach
The section-based approach solves the problems mentioned above by
decoupling the BB information from the symbol table. Furthermore, the new
approach significantly reduces the binary and object size overheads. The
following table lists the total binary size of Clang for vanilla,
BB-symbols, and BB-info builds, all using PGO.
Flavor
Binary size
PGO
86MB
PGO with BB Info section
97MB
PGO with BB symbols
146MB
We see that using the .bb_info section increases the binary size by about
13% while BB symbols (with strtab compression) amounts to about 70% binary
size (60MBs) overhead. Most of this overhead comes from the .symtab bloat
(56MBs). The .strtab bloat is lower (4MB) due to string compression.
Implementation and challenges
Our implementation of the .bb_info section is inspired by the .stack_sizes
section feature in LLVM. Each function will have its BB info table emitted
into a section which is linked to the section containing that function (The
BB info section uses ELF:SHF_LINK_ORDER and is linked to the beginning
symbol of the section containing that function). With -function-sections this
emits a unique .bb_info section for each function (we don’t need a unique
id for sections since they are already distinguished by the linked-to
symbol). Eventually, these sections are concatenated by the linker in an
order consistent with how their function sections are laid out.
Several linker and compiler features may affect the .bb_info section:
1.
COMDATs: For COMDAT functions, we add their .bb_info sections to the
COMDAT group too, so that conflicts between .bb_info sections of the
COMDAT group are resolved the same way as it is resolved for the function
sections. Besides this, since we make the BB info table header of every
function refer not to the global function symbol, but to the local
.Lfunc_begin symbol, the treatment of the bb info section for every
function is independent of symbol resolution for COMDAT functions (This is
the same as how stack_sizes section is emitted:
https://reviews.llvm.org/D44669).
1.
Garbage Collection: If a function section is garbage-collected by the
linker its .bb_info section must be removed as well, since the section
only refers to that particular function.
1.
Code Folding: If two functions are identical, their section is folded by
identical code folding and their .bb_info sections must be folded as
well. This is a safe assumption as the .bb_info sections should be
identical as well, except for the function symbol that they refer to, and
ICF can correctly identify that those function symbols are folded together.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200717/137e6d32/attachment.html>
More information about the llvm-dev
mailing list