[LLVMdev] LLD improvement plan

Fri May 1 18:19:52 PDT 2015

I am at the airport waiting to go on vacation, but I must say I am
extremely happy to see this happen!

I agree with the proposed direction and steps:

Implement section-based linking for COFF.

Use that for ELF.

If it makes sense, use it for Mach-O.
On May 1, 2015 3:32 PM, "Rui Ueyama" <ruiu at google.com> wrote:

> Hi guys,
>
> After working on LLD for a long time, I think I have found a few things
> that we should improve in the LLD design, for both ease of development
> and runtime performance. I would like to get feedback on this proposal.
> Thanks!
>
> *Problems with the current LLD architecture*
>
> The current LLD architecture has, in my opinion, two issues.
>
> *The atom model is not the best model for some architectures*
>
> The atom model makes sense only for Mach-O, but it’s used everywhere. I
> guess that we originally expected to be able to model the linker’s
> behavior beautifully using the atom model, because the atom model seemed
> like a superset of the section model. Although it *can* model it, it
> turned out to be not necessarily a natural or efficient model for ELF or
> PE/COFF, on which section-based linking is expected. On ELF or PE/COFF,
> sections are the units of atomic data. We divide a section into smaller
> “atoms” and then restore the original data layout later to preserve the
> section’s atomicity. That complicates the linker internals. It also slows
> down the linker because of the overhead of creating and manipulating
> atoms.
>
> In addition, since section-based linking is expected on these
> architectures, some linker features are defined in terms of sections. An
> example is “select largest section” in PE/COFF. In the atom model, we
> don’t have a notion of sections at all, so we had to simulate such
> features using atoms in tricky ways.
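To make the "select largest section" example concrete, here is a minimal Python sketch of the rule (PE/COFF names this COMDAT selection type IMAGE_COMDAT_SELECT_LARGEST): among sections that carry the same COMDAT symbol, keep the one with the most bytes. The `Section` class and the `select_largest` function are hypothetical names for illustration, not LLD code, and keeping the first section on a tie is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical, simplified stand-in for an input section.
    file: str         # object file the section came from
    comdat_name: str  # COMDAT symbol that identifies duplicates
    data: bytes       # raw section contents


def select_largest(sections):
    """Keep one section per COMDAT name: the largest one.

    Ties keep the section seen first (an assumption of this sketch).
    """
    chosen = {}
    for sec in sections:
        best = chosen.get(sec.comdat_name)
        if best is None or len(sec.data) > len(best.data):
            chosen[sec.comdat_name] = sec
    return chosen
```

Because the rule compares whole sections, it needs no notion of atoms; in an atom-only model the section sizes would first have to be reconstructed from the atoms.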
>
> *One symbol resolution model doesn’t fit all*
>
> The symbol resolution semantics are not the same on the three
> architectures (ELF, Mach-O and PE/COFF), but we have only one "core"
> linker for symbol resolution. The core linker implements the Unix linker
> semantics: the linker visits one file at a time until all undefined
> symbols are resolved. For archive files with circular dependencies, you
> can group them to tell the linker to visit them more than once. This is
> not the only model for a linker. It’s neither the simplest nor the
> fastest. It’s just that the Unix linker semantics are designed this way,
> and we all follow them for compatibility.
>
> For PE/COFF, the linker semantics are different. The order of files on
> the command line doesn’t matter. The linker scans all files first to
> create a map from symbols to files, and then uses the map to resolve all
> undefined symbols. The PE/COFF semantics are currently simulated using
> the Unix linker semantics and groups. That makes the linker inefficient
> because of the overhead of visiting archive files again and again. It
> also makes the code bloated and awkward. In short, we generalize too
> much, and we share code too much.
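The difference between the two resolution models can be sketched in a few lines of Python. This is a toy model under stated assumptions, not LLD code: `Obj`, `Archive`, `link_unix` and `link_with_symbol_map` are hypothetical names, symbols are plain strings, and the Unix variant rescans the members of a single archive until nothing new is pulled in but never revisits an earlier archive (no groups).

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    defines: frozenset = frozenset()  # symbols this object defines
    uses: frozenset = frozenset()     # symbols this object references


@dataclass
class Archive:
    members: list  # list of Obj


def link_unix(inputs):
    """Visit inputs in command-line order, one file at a time."""
    loaded, defined, undefined = [], set(), set()

    def add(obj):
        loaded.append(obj.name)
        defined.update(obj.defines)
        undefined.update(obj.uses)
        undefined.difference_update(defined)

    for item in inputs:
        if isinstance(item, Obj):
            add(item)
            continue
        progress = True
        while progress:  # rescan this archive only
            progress = False
            for m in item.members:
                if m.name not in loaded and m.defines & undefined:
                    add(m)
                    progress = True
    return loaded, undefined


def link_with_symbol_map(inputs):
    """Index all archive members first, then resolve via the map."""
    lazy, worklist, defined = {}, [], set()
    for item in inputs:
        if isinstance(item, Archive):
            for m in item.members:
                for sym in m.defines:
                    lazy.setdefault(sym, m)
        else:
            worklist.append(item)
            defined.update(item.defines)
    loaded, undefined = [], set()
    queued = {obj.name for obj in worklist}
    while worklist:
        obj = worklist.pop(0)
        loaded.append(obj.name)
        defined.update(obj.defines)
        undefined.update(obj.uses)
        for sym in obj.uses:
            m = lazy.get(sym)
            if sym not in defined and m is not None and m.name not in queued:
                queued.add(m.name)
                worklist.append(m)
    return loaded, undefined - defined
```

With `main.o` using `f`, archive A defining `g`, and archive B defining `f` but using `g`, the input order `main.o, A, B` leaves `g` undefined in `link_unix` (A was already passed when `g` became needed), while `link_with_symbol_map` resolves everything regardless of order.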
>
> *Proposal*
>
>    1. Re-architect the linker based on the section model where it’s
>    appropriate.
>    2. Stop simulating different linker semantics using the Unix model.
>    Instead, directly implement the native behavior.
>
> When it’s done, the atom model will be used only for Mach-O. The other two
> will be built on the section model. PE/COFF will have a different "core"
> linker than Unix’s. I expect this will simplify the design and also
> improve the linker’s performance (achieving better performance is probably
> the best way to convince people to try LLD).
>
> I don’t think we can gradually move from the atom model to the section
> model, because atoms are everywhere. The two models are so different that
> we cannot mix them together in one place. Although we can reuse the design
> and the outline of the existing code, this is going to be more like a
> major rewrite than an update. So I propose developing the section-based
> linkers as new "ports" of LLD. I plan to start with the PE/COFF port
> because I’m familiar with the code base and the amount of code is smaller
> than in the ELF port. Also, the fact that the ELF port is developed and
> maintained by many developers makes porting it harder than PE/COFF, which
> is written and maintained only by me. Thus, I’m going to use PE/COFF as an
> experimental platform to see how it works. Here is a plan.
>
>    1. Create a section-based PE/COFF linker backend as a new port
>    2. If everything is fine, do the same thing for ELF. We may want to
>    move common code for a section-based linker out of the new PE/COFF port to
>    share it with ELF.
>    3. Move the library for the atom model to the sub-directory for the
>    Mach-O port.
>
> The resulting linker will share less code between ports. That’s not
> necessarily a bad thing -- we actually think it’s a good thing because in
> order to share code we currently have too many workarounds. This change
> should fix the balance so that we get (1) shared code that’s naturally able
> to be shared by multiple ports, and (2) simpler, faster code.
>
> *Work Estimation*
>
> It’s hard to tell, but I can probably create a PE/COFF linker in a few
> weeks that works reasonably well and is ready for code review as a first
> set of patches. I have already built a complete linker for Windows, so the
> hardest part (understanding it) is already done. Once it’s done, I can
> give a better estimate for ELF.
>
> *Caveat*
>
> *Why not define a section as an atom and keep using the atom model?*
>
> If we did this, we would have to allow atoms to have more than one name.
> Each name would have an offset in the atom (to represent symbols whose
> offset from the section start is not zero). But we would still need to
> copy section attributes to each atom. The resulting model no longer looks
> like the atom model, but like a mix of the atom model and the section
> model, and that comes with the cost of both designs. I think it’s too
> complicated.
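As a rough picture of what such a hybrid would look like, here is a minimal Python sketch of an atom that carries several names, each with an offset, plus a copy of the section attributes. `NamedOffsetAtom` and its methods are hypothetical names invented for this illustration; nothing here is taken from the LLD sources.

```python
from dataclasses import dataclass, field


@dataclass
class NamedOffsetAtom:
    """Hypothetical 'section as an atom': one blob, many names."""
    content: bytes
    attributes: frozenset  # copied section flags, e.g. {"read", "exec"}
    names: dict = field(default_factory=dict)  # symbol name -> offset

    def define(self, symbol, offset):
        # A symbol may point anywhere inside the blob, not only at offset 0.
        if not 0 <= offset <= len(self.content):
            raise ValueError(f"offset {offset} is outside the atom")
        self.names[symbol] = offset

    def address_of(self, symbol, base_address):
        # Final address = where the atom is placed + the symbol's offset.
        return base_address + self.names[symbol]
```

Once an atom needs a name table and section attributes, it is essentially a section under a different name, which is the point of the caveat.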
>
> *Notes*
> We want to make sure there are no existing LLD users who depend on the
> atom model for ELF, or, if there are such users, we want to come up with a
> transition path for them.
>