[LLVMdev] Proposal: MCLinker - an LLVM integrated linker
Jush Lu (盧育龍)
Jush.Lu at mediatek.com
Tue Nov 1 07:34:43 PDT 2011
We are developing a linker, MCLinker.
MCLinker is a linker for LLVM. It leverages the LLVM machine code (MC) layer to link object files and bitcodes, and generate shared objects and executable files.
The development of MCLinker was started out of the need for an LLVM integrated linker. LLVM lacks an integrated linker; hence, it relies on external linkers to generate executables and dynamic shared objects (DSO, .so). MCLinker complements LLVM toolchain for direct generation of all kinds of outputs without temporary files. Therefore, LLVM toolchain can remove all unnecessary read and emission of temporary files. As we know, MC layer can generate object files without external assemblers, we look forward MCLinker can help LLVM to get rid of external linkers as well. Since MCLinker derives from MC layer, it supports multiple targets and formats naturally. As a result, it can make the cross-compilation process easier.
Besides being an integrated linker, MCLinker has other ambition. Embedded systems developers are looking for a linker which performs well even on platforms with limited memory. MCLinker performs fast and format-independent linking with low memory usage. We will illustrate how we achieve this later. Furthermore, MCLinker is under UIUC BSD-Style license. With that, we hope that more platforms and commercial products can adopt it.
MCLinker provides the following features:
* Support Various Formats of Object Files
MCLinker, leveraging LLVM machine code (MC) layer, provides a unified representation for various formats of object files. Different formats of object files, including ELF, COFF, and Mach-O, are read by the corresponding readers and translated into the same representation. This clearly separates the linking algorithms from reading and writing files in different formats. That is, the linking algorithms of MCLinker, such as merging symbol tables, applying relocations and relaxing instructions, are format-independent.
* Fast Linking and Low Memory Usage
The linking algorithms in MCLinker are fast and efficient with limited memory budget. Here is how we achieve these.
+ Improve cache locality of both string tables and symbol tables
MCLinker keeps symbols and corresponding strings in the same cache line. Because symbols and strings are often used in pairs, putting them together can improve spatial locality. Furthermore, MCLinker uses a global symbol table and avoids copying symbols from input to output symbol tables. Since MCLinker always uses the same instance of the symbols with the same name, the locality is improved.
+ Reduce the number of walks over symbol tables
Walk over global symbol table is time-consuming, because programs may have more than 100,000 symbols. GNU ld reads symbols and relocation entries at the same time. Since symbols in that stage are not resolved, GNU ld needs to track relocation entries applied to each symbol. This bookkeeping causes additional traversals of the symbol table. Following Google gold's approach, MCLinker keeps the number of walks over symbol table as few as possible. MCLinker resolves symbols before reading relocation entries, and this avoids extra symbol table traversals.
+ Reduce the overhead of traversing all the symbols
The symbols are categorized into different groups according to their types and bindings. As a result, MCLinker visits a specific group to get the symbols it needs instead of traversing the whole symbol pool. For example, dynamic symbols are grouped together. When MCLinker generates the dynamic symbol section, it only visits the symbols in the dynamic group.
+ Efficient use of memory mapped I/O
Linkers are sensitive to the use of memory mapped I/O. Memory mapped I/O brings data to the physical memory only when accessing it, so that it can reduce the physical memory usage. However, memory mapped I/O will not release pages until pages are un-mapped. In embedded system with limited memory, this means the throughput of the system is degraded. MCLinker optimizes the use of memory mapped I/O. It requests mapped memory according to the average size of object files in the typical case. Thus, more pages are available during linking, and it improves the system throughput.
The Design of MCLinker
MCLinker is an integrated linker for LLVM. "Integrated" means MCLinker has an adapter to LLVM. MC Layer uses a function pass, called AsmPrinter, as an adapter to the standard compilation flow. Like MC Layer, MCLinker also provides a function pass, called SectLinker, to be integrated into the last stage of the standard compilation flow.
Traditional toolchain has a linker driver to prepare parameters for linker. In GCC, collect2 is one of such tools to prepare parameters such as sysroot, the path of glibc, and so on. In MCLinker, MCLDDriver plays the same role as collect2 to prepare all implicit parameters for MCLinker.
MCLinker, a class with the same name as the project, provides numerous APIs to perform major linking tasks, such as symbol resolution, applying relocations, sections merge, and so on. These APIs are defined at high level abstraction which is target and format independent. Those target and format dependent operations are encapsulated in another class, called TargetLDBackend. Therefore, MCLinker can perform linking at a high level abstraction without the concern about targets and formats.
In order to simplify linking and improve performance, MCLinker defines its own symbol data structures, called LDSymbol, instead of using MCSymbol directly. Symbols defined in LLVM are separated into two different classes – MCSymbol and MCSymbolData, and they are very different from the symbols' definition in most object files. If a linker adopts them as the data structure of symbols, it needs extra overhead to convert symbols of object files into MCSymbol and MCSymbolData. In order to overcome this problem, the definition of LDSymbol is close to the symbols' definition in the object files, and the overhead of conversion is low. Actually, not only reading object files, but also symbol resolution become faster and easier when adopting LDSymbol.
MCLinker also defines a unified representation for relocation entries, called LDRelocation, since LLVM does not provide it. The design of relocation in MCLinker is similar to the other portable linkers. Readers convert format-dependent relocation entries into general LDRelocation. All targets need to provide functions for applying relocations. MCLinker connects the functions with corresponding LDRelocations during the initialization.
GNU ld and Google gold are well-known linkers with unique features. These linkers have their own linking algorithms and data structures. We discuss GNU ld and Google gold below.
* GNU ld
GNU ld is designed for the portable manipulation to support various formats of object files. Object files in various formats are represented in a common abstraction, provided by the Binary File Descriptor (BFD) library. The main job of GNU ld is to read and parse link scripts, that is, it behaves as a frontend of the BFD library, and the real linking is done by BFD.
GNU ld reads symbols and relocations at the same time. This means it suffers from the extra overhead to do bookkeeping for relocations. That is what both Google gold and MCLinker want to eliminate. Additionally, BFD is mainly designed for COFF, so ELF is not handled efficiently in BFD. Both Google gold and MCLinker are designed mainly for ELF, and can take advantage of ELF features.
* Google gold
Google gold is designed to be a fast ELF linker. Unlike GUN ld, Google gold does not use the BFD library, so it can efficiently use ELF features to speed up performance.
Another feature of Google gold is multithreading. Google gold uses threads to run tasks in parallel, e.g., reading multiple input symbol tables in the same time. This improves performance in some cases.
In contrast to GNU ld, the linking flow of gold is simplified. This makes Google gold's linking stages significantly less than GNU ld, thus Google gold saves more linking time.
Moreover, on the file operations, Google gold is faster than GNU ld as well. GNU ld uses function pointers to deal with format-dependent issues, e.g., byte swapping for endianness. In contrast, Google gold uses C++ template specialization. In GNU ld, all sections and symbols of object files are read by calling function pointers, and this leads GNU ld to being slower.
So far, the framework of MCLinker is established. Symbol resolution is completed and tested. Sections merge, applying relocation, instruction relaxation, and writers are ongoing works. If you are interested in MCLinker, please find design documents and source code in our website. http://code.google.com/p/mclinker/
Tested platform: Currently, only Linux, Mac OS X 10.7, and FreeBSD 9.0 are tested. However, we think you won't run into to any problem on modern UN*X machines.
************* Email Confidentiality Notice ********************
The information contained in this e-mail message (including any
attachments) may be confidential, proprietary, privileged, or otherwise
exempt from disclosure under applicable laws. It is intended to be
conveyed only to the designated recipient(s). Any use, dissemination,
distribution, printing, retaining or copying of this e-mail (including its
attachments) by unintended recipient(s) is strictly prohibited and may
be unlawful. If you are not an intended recipient of this e-mail, or believe
that you have received this e-mail in error, please notify the sender
immediately (by replying to this e-mail), delete any and all copies of
this e-mail (including any attachments) from your system, and do not
disclose the content of this e-mail to any other person. Thank you!
More information about the llvm-dev