[LLVMdev] Proposal: MCLinker - an LLVM integrated linker

Michael Spencer bigcheesegs at gmail.com
Tue Nov 1 18:26:39 PDT 2011


2011/11/1 Jush Lu (盧育龍) <Jush.Lu at mediatek.com>:
> Hi all,
>
> We are developing a linker, MCLinker.
>
> MCLinker is a linker for LLVM. It leverages the LLVM machine code (MC) layer to link object files and bitcodes, and generate shared objects and executable files.
>
>
> Motivation
> ----------
>
> The development of MCLinker was started out of the need for an LLVM integrated linker. LLVM lacks an integrated linker; hence, it relies on external linkers to generate executables and dynamic shared objects (DSO, .so). MCLinker complements LLVM toolchain for direct generation of all kinds of outputs without temporary files. Therefore, LLVM toolchain can remove all unnecessary read and emission of temporary files. As we know, MC layer can generate object files without external assemblers, we look forward MCLinker can help LLVM to get rid of external linkers as well. Since MCLinker derives from MC layer, it supports multiple targets and formats naturally. As a result, it can make the cross-compilation process easier.
>
> Besides being an integrated linker, MCLinker has other ambition. Embedded systems developers are looking for a linker which performs well even on platforms with limited memory. MCLinker performs fast and format-independent linking with low memory usage. We will illustrate how we achieve this later. Furthermore, MCLinker is under UIUC BSD-Style license. With that, we hope that more platforms and commercial products can adopt it.
>
>
> Features
> --------
>
> MCLinker provides the following features:
>
> * Support Various Formats of Object Files
>
> MCLinker, leveraging LLVM machine code (MC) layer, provides a unified representation for various formats of object files. Different formats of object files, including ELF, COFF, and Mach-O, are read by the corresponding readers and translated into the same representation. This clearly separates the linking algorithms from reading and writing files in different formats. That is, the linking algorithms of MCLinker, such as merging symbol tables, applying relocations and relaxing instructions, are format-independent.
>
> * Fast Linking and Low Memory Usage
>
> The linking algorithms in MCLinker are fast and efficient with limited memory budget. Here is how we achieve these.
>
>  + Improve cache locality of both string tables and symbol tables
>
> MCLinker keeps symbols and corresponding strings in the same cache line. Because symbols and strings are often used in pairs, putting them together can improve spatial locality. Furthermore, MCLinker uses a global symbol table and avoids copying symbols from input to output symbol tables. Since MCLinker always uses the same instance of the symbols with the same name, the locality is improved.
>
>  + Reduce the number of walks over symbol tables
>
> Walk over global symbol table is time-consuming, because programs may have more than 100,000 symbols. GNU ld reads symbols and relocation entries at the same time. Since symbols in that stage are not resolved, GNU ld needs to track relocation entries applied to each symbol. This bookkeeping causes additional traversals of the symbol table. Following Google gold's approach, MCLinker keeps the number of walks over symbol table as few as possible. MCLinker resolves symbols before reading relocation entries, and this avoids extra symbol table traversals.
>
>  + Reduce the overhead of traversing all the symbols
>
> The symbols are categorized into different groups according to their types and bindings. As a result, MCLinker visits a specific group to get the symbols it needs instead of traversing the whole symbol pool. For example, dynamic symbols are grouped together. When MCLinker generates the dynamic symbol section, it only visits the symbols in the dynamic group.
>
>  + Efficient use of memory mapped I/O
>
> Linkers are sensitive to the use of memory mapped I/O. Memory mapped I/O brings data to the physical memory only when accessing it, so that it can reduce the physical memory usage. However, memory mapped I/O will not release pages until pages are un-mapped. In embedded system with limited memory, this means the throughput of the system is degraded. MCLinker optimizes the use of memory mapped I/O. It requests mapped memory according to the average size of object files in the typical case. Thus, more pages are available during linking, and it improves the system throughput.
>
>
> The Design of MCLinker
> ----------------------
>
> MCLinker is an integrated linker for LLVM. "Integrated" means MCLinker has an adapter to LLVM. MC Layer uses a function pass, called AsmPrinter, as an adapter to the standard compilation flow. Like MC Layer, MCLinker also provides a function pass, called SectLinker, to be integrated into the last stage of the standard compilation flow.
>
> Traditional toolchain has a linker driver to prepare parameters for linker. In GCC, collect2 is one of such tools to prepare parameters such as sysroot, the path of glibc, and so on. In MCLinker, MCLDDriver plays the same role as collect2 to prepare all implicit parameters for MCLinker.
>
> MCLinker, a class with the same name as the project, provides numerous APIs to perform major linking tasks, such as symbol resolution, applying relocations, sections merge, and so on. These APIs are defined at high level abstraction which is target and format independent. Those target and format dependent operations are encapsulated in another class, called TargetLDBackend. Therefore, MCLinker can perform linking at a high level abstraction without the concern about targets and formats.
>
> In order to simplify linking and improve performance, MCLinker defines its own symbol data structures, called LDSymbol, instead of using MCSymbol directly. Symbols defined in LLVM are separated into two different classes – MCSymbol and MCSymbolData, and they are very different from the symbols' definition in most object files. If a linker adopts them as the data structure of symbols, it needs extra overhead to convert symbols of object files into MCSymbol and MCSymbolData. In order to overcome this problem, the definition of LDSymbol is close to the symbols' definition in the object files, and the overhead of conversion is low. Actually, not only reading object files, but also symbol resolution become faster and easier when adopting LDSymbol.
>
> MCLinker also defines a unified representation for relocation entries, called LDRelocation, since LLVM does not provide it. The design of relocation in MCLinker is similar to the other portable linkers. Readers convert format-dependent relocation entries into general LDRelocation. All targets need to provide functions for applying relocations. MCLinker connects the functions with corresponding LDRelocations during the initialization.
>
>
> Related Work
> ------------
>
> GNU ld and Google gold are well-known linkers with unique features. These linkers have their own linking algorithms and data structures. We discuss GNU ld and Google gold below.
>
> * GNU ld
>
> GNU ld is designed for the portable manipulation to support various formats of object files. Object files in various formats are represented in a common abstraction, provided by the Binary File Descriptor (BFD) library. The main job of GNU ld is to read and parse link scripts, that is, it behaves as a frontend of the BFD library, and the real linking is done by BFD.
>
> GNU ld reads symbols and relocations at the same time. This means it suffers from the extra overhead to do bookkeeping for relocations. That is what both Google gold and MCLinker want to eliminate. Additionally, BFD is mainly designed for COFF, so ELF is not handled efficiently in BFD. Both Google gold and MCLinker are designed mainly for ELF, and can take advantage of ELF features.
>
> * Google gold
>
> Google gold is designed to be a fast ELF linker. Unlike GUN ld, Google gold does not use the BFD library, so it can efficiently use ELF features to speed up performance.
>
> Another feature of Google gold is multithreading. Google gold uses threads to run tasks in parallel, e.g., reading multiple input symbol tables in the same time. This improves performance in some cases.
>
> In contrast to GNU ld, the linking flow of gold is simplified. This makes Google gold's linking stages significantly less than GNU ld, thus Google gold saves more linking time.
>
> Moreover, on the file operations, Google gold is faster than GNU ld as well. GNU ld uses function pointers to deal with format-dependent issues, e.g., byte swapping for endianness. In contrast, Google gold uses C++ template specialization. In GNU ld, all sections and symbols of object files are read by calling function pointers, and this leads GNU ld to being slower.
>
>
> Current Status
> --------------
>
> So far, the framework of MCLinker is established. Symbol resolution is completed and tested. Sections merge, applying relocation, instruction relaxation, and writers are ongoing works. If you are interested in MCLinker, please find design documents and source code in our website. http://code.google.com/p/mclinker/
>
> Tested platform: Currently, only Linux, Mac OS X 10.7, and FreeBSD 9.0 are tested. However, we think you won't run into to any problem on modern UN*X machines.

Hello, I'm glad to see other people are interested in building an LLVM linker.

I have been working in the dirrection of an LLVM linker for some time
now. The first key component of this is the libObject library in tree.
See my talk from the November 2010 Dev Meeting called "Object Files in
LLVM" <http://www.llvm.org/devmtg/2010-11/>.

We are currently seeking to remove the inherent duplication of object
file handling in all of the llvm projects.

In your design a lot of the structure seems to be dedicated to the
idea of integrated linking. While we want to support this kind of use,
I believe it can be accomplished without as much structure. Most
projects are built as libraries, and the final executable is generally
one source file which links to a bunch
of rather large libraries. The small speedup you get from not writing
this one object file to disk is negligible, considering it's going to
be in cache when the link starts, and it will be the first object
processed.

This structure might make it hardter to use the facilities the linker
provides to do dynamic loading, and linking JITed code. Both of these
are cases where having the linker integrated makes a huge performance
and complexity difference.

A major feature of the linker design I have been working on is the
object file IR it is based on. It takes the concept of an atom (a
named range of data) and formalizes it into a graph based data
structure where edges represent the relations between atoms such as
relocations, groupings, and layout constraints. This provides a format
for the core linker algorithms to work on, and also for plugins and
file-format specific processing to happen. This can also enable more
efficent intermidate object and debug file formats.

We would all like to avoid duplication and have the best linker
possible for LLVM. So I think it's important to discuss these issues
and decide how to best combine our efforts.

- Michael Spencer




More information about the llvm-dev mailing list