[LLVMdev] [lld] Linker script findings.

Sun Dec 30 20:53:30 PST 2012

Hi all, I have been investigating linker scripts and the functionality
needed to support them in lld. I have attached my findings about the
usage of ldscripts. My findings have been collected from:

- Reading all the GNU ld manual sections about linker scripts.
- Looking at the GNU ld and gold source code.
- Digging through a couple embedded programming tutorials.
- Reading through all of the linker scripts in the Linux kernel tree.
- Other random sources across the net that I'm forgetting about.

In particular, the second to last section (comprising about half the
document) describes all of the functionality that the LLD APIs will have
to expose to the ldscript language processor in order to link the
Linux kernel.

-- Sean Silva
-------------- next part --------------
==============
Linker Scripts
==============

This is a write-up of what I have been able to find out about linker
scripts and their use. I will describe:

1. GNU ld's default linker scripts.
2. Linker scripts in embedded toolchains.
3. All uses of linker scripts in the Linux kernel.

Linker Script Primer
====================

As far as I can tell, ldscripts usually have suffix `.lds` and more rarely
`.ld`.

In order to provide some context for what follows, here is 90% of what you
need to know about linker scripts:

SECTIONS {                                              /* (1) */
        __text_start = .;                               /* (2) */
        .text : {                                       /* (3) */
                *(.text);                               /* (4) */
                *(.text.*);
        }
        __text_end = .;
        __text_size = __text_end - __text_start;        /* (5) */
}

(1) The fundamental command in linker scripts is `SECTIONS`, in which you
define what input sections go into what output sections.
(2) The symbol `.` represents the "location counter", a special symbol that
indicates the "current address" that we are outputting into; this
assignment creates an absolute symbol `__text_start` in the output file
with the address of the location counter at that point.
(3) This begins the description of the output section `.text`.
(4) The basic format here is `<filespec>(<sectionspec>)`. `<filespec>`
selects input files, and `<sectionspec>` selects which sections of those
input files will be put into the output section. Shell-like wildcarding is
permitted in both parts. So this statement says to combine all `.text` and
`.text.*` sections from all input files into the `.text` section of the
output file. By far the most common output section description in linker
scripts is simply `foo : { *(foo) }`.
(5) C-like expressions can be performed on symbol values.
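
As a further illustration of the `<filespec>(<sectionspec>)` form in (4),
here is a small sketch. The file name `startup.o` and the section name
`.vectors` are made up for the example and are not taken from any real
script:

```ld
SECTIONS {
        .vectors : {
                /* only `.vectors`, and only from the (hypothetical) startup.o */
                startup.o(.vectors)
        }
        .rodata : {
                /* two section patterns in one spec, taken from all input files */
                *(.rodata .rodata.*)
        }
}
```

The first description names a single input file, while the second uses the
`*` wildcard for the file and lists two section patterns inside one pair of
parentheses.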

GNU ld default linker scripts
=============================

There is a peculiar set of linker scripts with suffixes `.x`, `.xn`, and
other `.x...` names: these are the linker scripts that GNU ld uses by
default for link jobs. On my machine, these are found in
`/usr/lib/ldscripts/` and there are quite a few of them. However, many
share the same filename excluding the extension. On my machine, I see

$ ls /usr/lib/ldscripts | sed 's/\.x[a-z]*$//' | sort | uniq -c
     13 elf32_x86_64
     13 elf_i386
     13 elf_k1om
     13 elf_l1om
     13 elf_x86_64
      5 i386linux

These appear to correspond to different "supported emulations" (the `-m`
option), since they align with the "Supported emulations" output of ld's
`-V` option. The `--verbose` option additionally prints the default linker
script (one of the `.x...` files) which ld selected for that job.

Each suffix corresponds to a different kind of "link job", selected by the
command line arguments given to ld. Each script indicates in a comment at
the top which command line arguments it is for. For example, `.xr` has at
the top:

        /* Script for ld -r: link without relocation */

These scripts are produced by `ld/genscripts.sh` in the binutils tree. This
script sources one of the files in `ld/emulparams/` which sets some
parameters; notably, it selects one of the templates in `ld/scripttempl/`.
This template is then filled in with the parameters once for each different
kind of "job". Here is a listing of the ones on my machine matching
`elf_x86_64.x*` (corresponding to `ld/emulparams/elf_x86_64.sh`):

        ==> elf_x86_64.x <==
        /* Default linker script, for normal executables */

        ==> elf_x86_64.xbn <==
        /* Script for -N: mix text and data on same page; don't align data */

        ==> elf_x86_64.xc <==
        /* Script for -z combreloc: combine and sort reloc sections */

        ==> elf_x86_64.xd <==
        /* Script for ld -pie: link position independent executable */

        ==> elf_x86_64.xdc <==
        /* Script for -pie -z combreloc: position independent executable, combine & sort relocs */

        ==> elf_x86_64.xdw <==
        /* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */

        ==> elf_x86_64.xn <==
        /* Script for -n: mix text and data on same page */

        ==> elf_x86_64.xr <==
        /* Script for ld -r: link without relocation */

        ==> elf_x86_64.xs <==
        /* Script for ld --shared: link shared library */

        ==> elf_x86_64.xsc <==
        /* Script for --shared -z combreloc: shared library, combine & sort relocs */

        ==> elf_x86_64.xsw <==
        /* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */

        ==> elf_x86_64.xu <==
        /* Script for ld -Ur: link w/out relocation, do create constructors */

        ==> elf_x86_64.xw <==
        /* Script for -z combreloc -z now -z relro: combine and sort reloc sections */

Since they are all instantiated from the same template, large parts of them
are quite similar; sometimes the diff is just a single line.

Linker scripts in embedded toolchains
=====================================

Linker scripts are important for embedded targets, since the memory map of
the device is fixed by the hardware, and so the linker can't just put
things wherever it wants.

The default linker scripts that are used for the GNU AVR toolchain that
ships with Arduino (found in `hardware/tools/avr/lib/avr/lib/ldscripts/` of
the arduino distribution that I downloaded) contain descriptions of the
memory maps:

MEMORY
{
  text      (rx)   : ORIGIN = 0, LENGTH = 8K
  data      (rw!x) : ORIGIN = 0x800060, LENGTH = 0
  eeprom    (rw!x) : ORIGIN = 0x810000, LENGTH = 64K
  fuse      (rw!x) : ORIGIN = 0x820000, LENGTH = 1K
  lock      (rw!x) : ORIGIN = 0x830000, LENGTH = 1K
  signature (rw!x) : ORIGIN = 0x840000, LENGTH = 1K
}

The format is largely self-explanatory: the linker has to respect these
boundaries when placing sections that are not explicitly given an address.
Presumably the linker will also give a useful diagnostic if the user has
requested something impossible, such as program code that does not fit in
the `text` region, which has `LENGTH = 8K`. The MEMORY command is
just a convenience to the person writing the linker script and enables
simpler (and less error-prone) ways of writing certain things.
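
As a sketch of what that convenience looks like in practice, an output
section can be assigned to a region with `> region`, and the built-in
`ORIGIN()` and `LENGTH()` functions expose the region's bounds to
expressions. Only the `text` region below comes from the AVR script above;
the SECTIONS body and the symbol `__text_free` are my own illustration:

```ld
MEMORY
{
        text (rx) : ORIGIN = 0, LENGTH = 8K
}

SECTIONS
{
        /* place .text in the `text` region; no explicit address is needed */
        .text : { *(.text) *(.text.*) } > text

        /* bytes still free in the region after .text (hypothetical symbol) */
        __text_free = ORIGIN(text) + LENGTH(text)
                      - (ADDR(.text) + SIZEOF(.text));
}
```

Without MEMORY, the script author would have to hard-code the same
addresses and do the bounds checking by hand.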

Also, on embedded targets the difference between load memory address
(LMA, `p_paddr`) and virtual memory address (VMA, `p_vaddr`) can be
significant. In many situations the entire program image is stored in
nonvolatile flash memory or ROM that is mapped at a particular address.
However, an initialized variable such as `int x = 3;` (at global scope)
will be expected to be in RAM by the program. This is a problem, since the
initialized data in the program image initially lies in ROM and is thus
not writable. A loader routine inside the program itself (typically its
startup code) must copy the relevant parts of the program image to the
addresses in RAM where the rest of the program expects them to be (or, for
bss, just zero out the correct range of addresses).

In order to accomplish this, it is necessary to be able to reason about
both the LMA and VMA. Linker scripts provide a variety of syntactic means
to do this, but it basically boils down to two capabilities:

1. Apply relocations to the output section as though it were at the VMA.
2. Actually place the output section at the LMA.
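
One of those syntactic means is the `AT>` region assignment. The following
is a minimal sketch, assuming a made-up device: the region names `flash`
and `ram`, their addresses and sizes, and the `__data_*` and `__bss_*`
symbols are all invented for the example:

```ld
MEMORY
{
        flash (rx)  : ORIGIN = 0x08000000, LENGTH = 64K
        ram   (rwx) : ORIGIN = 0x20000000, LENGTH = 8K
}

SECTIONS
{
        .text : { *(.text .text.*) } > flash

        /* VMA is in ram (relocations use it); LMA is in flash (the bytes
         * are stored there, right after .text) */
        .data : {
                __data_start = .;
                *(.data .data.*)
                __data_end = .;
        } > ram AT> flash

        /* where the startup code has to copy .data from */
        __data_load = LOADADDR(.data);

        .bss : {
                __bss_start = .;
                *(.bss .bss.*)
                *(COMMON)
                __bss_end = .;
        } > ram
}
```

The program's startup code would then copy `__data_end - __data_start`
bytes from `__data_load` to `__data_start`, and zero the range from
`__bss_start` to `__bss_end`.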

Analysis of Functionality used by Linux
=======================================

The version of Linux that I'm currently looking at is 36 commits past
v3.8-rc1 (commit ecccd12 to be exact).

Inside Linux, most linker scripts have suffix `.lds.S`, the `.S`
indicating that they need to be passed through the C preprocessor.

This is not as ugly as you may think at first. The C preprocessor is used
primarily as a means of sharing common constants and definitions between
the linker scripts, assembly code, and C code; it's not clear that there is
a better way to accomplish this.
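
To make this concrete, here is a sketch of what a preprocessed script can
look like. The macro names and values are hypothetical; in the kernel such
constants would normally come from a shared header via `#include` rather
than being defined in the script itself:

```ld
/* example.lds.S (hypothetical): run through cpp before ld sees it */
#define LOAD_ADDRESS    0x100000
#define PAGE_SIZE       0x1000

SECTIONS
{
        . = LOAD_ADDRESS;
        .text : { *(.text .text.*) }

        . = ALIGN(PAGE_SIZE);
        .data : { *(.data .data.*) }
}
```

After preprocessing, ld only ever sees the literal numbers, so none of this
requires any support from the linker itself.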

The totality of the linker scripts (and related files) I could find in Linux
consists of:
        scripts/module-common.lds
        include/asm-generic/vmlinux.lds.h
                - common preprocessor definitions used for the linker
                  scripts for kernel image ("vmlinux").
        arch/**/*.lds*
                - There are 71 of these total. Most of these
                  are related to linking the kernel image. The rest are
                  composed of mostly vDSO and bootloader stuff.

Description of used features in Linux
-------------------------------------

What follows is a description of the different features used by the
ldscripts in the Linux tree. The intent is basically "if we implement all
these features, then we can link a mainline Linux kernel".

Where possible, I will point out if a linker script command is effectively
equivalent to a particular commandline option, since such features will
surely need to share a common code path. Another reason is that Michael
seems to be enthusiastic about adding functionality needed for commandline
compatibility but has a strong aversion to linker scripts; hopefully, this
will help to more accurately reflect the amount of extra work entailed for
linker script support.

I will also attempt to describe the features from the perspective of what
functionality the LLD libraries must expose in order to implement the
feature. I will avoid discussion of things which are primarily relevant to
implementing the parts of the ldscript language processor that do not
directly interface with the LLD libraries (such as syntax).

/* Specifies the output format. Equivalent to `--oformat`. */
OUTPUT_FORMAT("elf64-alpha")
/* This variant is basically the same, except that the second entry
 * specifies the format to use if `-EB` is in effect, the third entry
 * specifies the format if `-EL`, and the first is the default if neither
 * is specified. */
OUTPUT_FORMAT("elf64-littleaarch64", "elf64-bigaarch64", "elf64-littleaarch64")

/* Selects the output machine architecture; the argument is one of the
 * architecture names known to binutils' BFD library. From the manpage and
 * the GNU ld manual, it isn't clear whether this corresponds to the `-A`
 * option. This is what gold's yyscript.y (linker script parser) has to say
 * about it:
 *
 *      /* Top level commands which we ignore.  The GNU linker uses these to
 *         select the output format, but we don't offer a choice.  Ignoring
 *         these is more-or-less OK since most scripts simply explicitly
 *         choose the default.  */
 *      ignore_cmd:
 *                OUTPUT_ARCH '(' string ')'
 *              ;
 */
OUTPUT_ARCH(hppa:hppa2.0w)

/* This specifies the entry symbol. Equivalent to `-e`/`--entry`. */
ENTRY(__start)

/* This defines the program headers that should be present in the output.
 * Output section descriptions inside the SECTIONS command can then specify
 * a specific segment that they be put in.
 * It is also possible to specify the flags that should be present on
 * the segment and that the ELF file header and/or the program headers
 * themselves should be included in the segment.
 * Although Linux does not use this, it is also possible to specify the
 * load memory address (I just mention this because it is probably going to
 * be easy to add this, and then PHDRS will be feature-complete).
 */
PHDRS { kernel PT_LOAD; note PT_NOTE; }

/* Assignments can appear as a command in their own right (outside of a
 * SECTIONS command). This is equivalent to `--defsym`.
 */
jiffies = jiffies_64;

The SECTIONS command contains the bulk of the functionality of linker
scripts. The ldscript primer above already gave an overview of the main
functionality needed, but here is a detailed list of the specific
capabilities that are needed:

- Must be able to create sections in the output.
- Must be able to filter input files and their sections with wildcards in
  order to selectively include them in an output section.
- Must be able to assign an output section to a specific segment (and
  control its LMA and VMA).
- Must be able to get the size of an output section.
- Must be able to discard sections from the output.
- Must be able to add arbitrary data to a section.
- Must be able to specify a byte pattern used to fill empty space between
  sections (e.g. due to alignment or a deliberate gap); this could perhaps
  be implemented in terms of "add arbitrary data to a section".
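
For reference, here is a sketch that exercises each capability in the list
above using GNU ld syntax. The segment name `text`, the addresses, the data
value and the symbol `__text_size` are arbitrary choices of mine, not taken
from any kernel script:

```ld
PHDRS { text PT_LOAD FLAGS(5); }        /* one loadable segment, flags R+X */

SECTIONS
{
        . = 0x100000;

        .text : {                       /* create an output section */
                *(.text .text.*)        /* filter inputs with wildcards */
                . = ALIGN(16);          /* gap, filled with the pattern below */
                LONG(0xdeadbeef)        /* add arbitrary data to the section */
        } :text =0x90909090             /* assign to a segment; fill pattern */

        __text_size = SIZEOF(.text);    /* size of an output section */

        /DISCARD/ : { *(.comment) }     /* discard sections from the output */
}
```

Each comment maps one line of the script back to one bullet, which gives a
rough idea of how small the required API surface is.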

As mentioned in the ldscript primer, arbitrary C-like expressions can be
evaluated over symbol values. This requires having access to the symbol
table of the in-progress output file (the expressions are evaluated
lazily). There are also some built-in functions which can return
information about output sections, such as their VMA and LMA.
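
A short sketch of those built-in functions, assuming a trivial two-section
layout. The addresses and every symbol name here are made up for the
example:

```ld
SECTIONS
{
        . = 0x10000;
        .text : { *(.text .text.*) }

        . = ALIGN(0x1000);
        .data : { *(.data .data.*) }

        __data_vma   = ADDR(.data);      /* VMA of the output section */
        __data_lma   = LOADADDR(.data);  /* LMA (same as VMA here) */
        __data_size  = SIZEOF(.data);    /* size in bytes */

        __image_end  = ALIGN(0x1000);    /* location counter, rounded up */
        __image_size = __image_end - ADDR(.text);
}
```

None of these values are known until the layout of the output sections has
been decided, which is why the expressions cannot be evaluated eagerly.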

We also need to be able to interpret the VERSION command, which basically
wraps a version script as would be passed to `--version-script`. The
features mentioned above as needed for SECTIONS are probably enough to
implement this, but it is probably worth reiterating this specific use
case. Effectively what the version script does is say "make these symbols
local", "make these symbols global", and "this version is the parent of
this other version" (even though the versions usually have some
human-discernible order, the programs that manipulate them treat them as
opaque strings, so the parent-child relationship must be explicitly
specified). This feature is primarily of interest in building the vDSO.
Also, looking beyond just the kernel, this feature is essential for
building DSOs in general.
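
For illustration, a VERSION command covering the three statements above
might look like the following. The version names `EXAMPLE_1.0` and
`EXAMPLE_2.0` and the symbols `foo` and `bar` are placeholders of my own:

```ld
VERSION
{
        EXAMPLE_1.0 {
                global:
                        foo;            /* "make these symbols global" */
                local:
                        *;              /* "make these symbols local" */
        };

        /* "this version is the parent of this other version" */
        EXAMPLE_2.0 {
                global:
                        bar;
        } EXAMPLE_1.0;
}
```

The text between the outer braces is exactly what would otherwise go into
the file passed to `--version-script`.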

Opportunity for good diagnostics
================================

In a few linker scripts in the kernel, there is a comment like this:

  /DISCARD/ : {
    /*
     * Discard any r/w data - this produces a link error if we have any,
     * which is required for PIC decompression.  Local data generates
     * GOTOFF relocations, which prevents it being relocated independently
     * of the text/got segments.
     */
    *(.data)
  }

It would be nice to give script authors a direct way to express the check
that they want here (in this case, "no r/w data may be present") and to
give a nice explanatory diagnostic if it fails.
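
GNU ld's existing `ASSERT(exp, message)` command gets part of the way
there. The following is only a sketch of how the check could be written
with it, not what the kernel scripts actually do; the symbols
`__rw_data_start` and `__rw_data_end` and the message text are
hypothetical:

```ld
SECTIONS
{
        .data : {
                __rw_data_start = .;
                *(.data)
                __rw_data_end = .;
        }
}

/* fail the link with an explicit message if any r/w data was collected */
ASSERT(__rw_data_end == __rw_data_start,
       "r/w data is not allowed here (required for PIC decompression)")
```

This still only reports that something is wrong; a better diagnostic would
also name the input files and sections that contributed the offending data.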

Another place for good diagnostics is when a script has failed to assign a
given input section to an output section, causing the input section to
become "orphaned". The linker can then place it pretty much wherever it
wants, potentially with disastrous consequences. It would be nice to warn
on this.