[llvm-dev] [LLD] Linker Relaxation

Wed Jul 12 00:26:51 PDT 2017

Hi,

On Wed, Jul 12, 2017 at 2:21 AM, Rui Ueyama via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Thanks, Bruce. This is a very interesting optimization.
>
> lld doesn't currently have code to support that kind of code shrinking
> optimization, but we can definitely add it. It seems that essentially we
> need to iterate over all relocations while rewriting instructions until a
> convergence is obtained. But the point is how to do it efficiently --
> link speed really matters. I can't come up with an algorithm to parallelize
> it. Do you have any idea?
>
> In order to shrink instructions, all address references must be explicitly
> represented as relocations even if they are in the same section. I think
> that means object files for RISC-V have many more relocations than the other
> architectures. Is this correct?

Indeed. The RISC-V assembler needs to emit relocations for PC-relative
offsets, even within a single section; otherwise those offsets become
incorrect once relaxation changes the distance between instructions.
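
To illustrate why an offset resolved at assembly time goes stale, here
is a minimal Python sketch. The addresses, the deleted span, and the
helper name shifted() are all made up for the example; this is not lld
code.

```python
def shifted(addr, deletions):
    """Address of `addr` after removing each (offset, nbytes) span.

    Assumes `addr` never falls inside a deleted span.
    """
    return addr - sum(n for off, n in deletions if off < addr)

# Hypothetical layout: a branch at 0x0 that targets 0x10.
branch, target = 0x0, 0x10
# Hypothetical relaxation: 4 bytes are removed at offset 0x4.
deletions = [(0x4, 4)]

stale = target - branch  # offset the assembler would have baked in
fresh = shifted(target, deletions) - shifted(branch, deletions)
print(hex(stale), hex(fresh))  # prints: 0x10 0xc
```

Without a relocation on the branch, the linker has no record that the
0x10 needs to become 0xc.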

On Wed, Jul 12, 2017 at 2:27 AM, Rui Ueyama via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> By the way, since this is an optional code relaxation, we can think about it
> later. The first thing I would do is to add RISC-V support to lld without
> code shrinking relaxations, which I believe is doable by at most a few
> hundred lines of code.

Yes, we have a working RISC-V target in lld now (with relaxation), and
it is passing our internal tests. Our iterated relaxation currently
makes a copy of each input section, loops over each copy to process
relaxations, and then adjusts symbol addresses and relocation entries
accordingly, just before the sections are written out. This works but
isn't optimal. Since we intend to contribute this target upstream
later on, we'd like to discuss how this should be handled properly.
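
For discussion, here is a toy Python model of iterating to a fixed
point. It is only a sketch and not our lld implementation: the Item
type, the LONG/SHORT sizes, and the tiny REACH value are hypothetical
and chosen so that a second pass is needed.

```python
from dataclasses import dataclass

LONG, SHORT = 8, 4  # hypothetical sizes of the long and short call forms
REACH = 16          # hypothetical reach of the short form (tiny for the demo)

@dataclass
class Item:
    kind: str       # "call", "label", or "pad"
    name: str = ""  # label name, or the label a call targets
    size: int = 0

def layout(items):
    """Return (address of each item, address of each label)."""
    addrs, labels, pc = [], {}, 0
    for it in items:
        addrs.append(pc)
        if it.kind == "label":
            labels[it.name] = pc
        pc += it.size
    return addrs, labels

def relax(items):
    """Shrink calls until nothing changes; return the number of passes
    that changed something."""
    passes = 0
    while True:
        addrs, labels = layout(items)  # addresses as of the previous pass
        changed = False
        for it, addr in zip(items, addrs):
            if it.kind == "call" and it.size == LONG:
                if abs(labels[it.name] - addr) <= REACH:
                    it.size = SHORT
                    changed = True
        if not changed:
            return passes
        passes += 1

items = [
    Item("call", "far", LONG),   # 20 bytes from "far": out of reach at first
    Item("call", "near", LONG),  # shrinks in pass 1
    Item("label", "near"),
    Item("pad", size=4),
    Item("label", "far"),
]
passes = relax(items)
total = sum(it.size for it in items)
```

The first call only comes into reach after the second one has shrunk,
which is why a single pass over the relocations is not enough in
general.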

Note that RISC-V also handles alignment as part of relaxation, so it
isn't really optional. For example:

_start:
    mv      a0, a0
    .p2align 2
    li      a0, 0

The assembler inserts 3 bytes of padding (note: this behavior isn't
merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):

00000000 <_start>:
   0:   852a                    mv      a0,a0
   2:   00 01 00                        # R_RISCV_ALIGN
                        2: R_RISCV_ALIGN        *ABS*+0x3
   5:   4501                    li      a0,0

The linker then removes 1 byte of padding to reach the desired alignment:

00010054 <_start>:
   10054:       852a                    mv      a0,a0
   10056:       0001                    nop
   10058:       4501                    li      a0,0

This is effectively a code-shrinking step, and it is mandatory: RISC-V
instructions must be 2-byte aligned, so the 3 bytes of padding cannot
simply be left in place. Therefore lld must be able to accommodate
changes to the contents of an input section.
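
The arithmetic for the example above can be sketched in Python as
follows. The function name and its parameters are hypothetical, and
the alignment is passed in explicitly (4, from .p2align 2) rather than
derived from the relocation.

```python
def padding_to_remove(pad_addr, pad_size, alignment):
    """Bytes to delete from the padding at pad_addr so that the next
    instruction starts on an `alignment`-byte boundary."""
    needed = -pad_addr % alignment  # bytes required to reach the boundary
    if needed > pad_size:
        raise ValueError("assembler did not reserve enough padding")
    return pad_size - needed

# Padding ends up at 0x10056 after layout; the assembler reserved 3 bytes.
removed = padding_to_remove(0x10056, 3, 4)
print(removed)  # prints: 1
```

Two bytes of padding remain, which is the single 2-byte nop at 0x10056
in the linked output.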

Chih-Mao Chen (PkmX)
Software R&D, Andes Technology