[LLVMdev] LLD improvement plan

Rui Ueyama ruiu at google.com
Mon May 4 11:18:19 PDT 2015


On Mon, May 4, 2015 at 1:29 AM, James Courtier-Dutton <
james.dutton at gmail.com> wrote:

> On 1 May 2015 at 20:31, Rui Ueyama <ruiu at google.com> wrote:
> >
> > One symbol resolution model doesn’t fit all
> >
> > The symbol resolution semantics are not the same on three architectures
> > (ELF, Mach-O and PE/COFF), but we have only one "core" linker for the
> > symbol resolution. The core linker implements the Unix linker semantics;
> > the linker visits a file at a time until all undefined symbols are
> > resolved. For archive files having circular dependencies, you can group
> > them to tell the linker to visit them more than once. This is not the
> > only model to create a linker. It’s not the simplest nor fastest. It’s
> > just that the Unix linker semantics is designed this way, and we all
> > follow for compatibility. For PE/COFF, the linker semantics are
> > different. The order of files in the command line doesn’t matter. The
> > linker scans all files first to create a map from symbols to files, and
> > uses the map to resolve all undefined symbols. The PE/COFF semantics are
> > currently simulated using the Unix linker semantics and groups. That made
> > the linker inefficient because of the overhead to visit archive files
> > again and again. Also it made the code bloated and awkward. In short, we
> > generalize too much, and we share code too much.
> >
>
> Why can't LLD be free to implement a resolving algorithm that performs
> better?
> The PE/COFF method you describe seems more efficient than the existing
> ELF method.
> What is stopping LLD from using the PE/COFF method for ELF? It could
> also do further optimizations such as caching the resolved symbols.
> To me, the existing algorithms read as ELF == Full table scan, PE/COFF
> == Indexed.
>

The two semantics are not compatible. The results of the two are not always
the same.

For example, this is why we have to pass -lc after object files instead of
at the beginning of the command line. "ld -lc foo.o" would just skip libc
because when it visits the library, there are no undefined symbols to be
resolved. foo.o would then add undefined symbols that could have been
resolved using libc, but it's too late. The link would fail. This is how
linkers on Unix work. Other behavioral differences follow from this one,
so we cannot change it without breaking compatibility.
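
To make the order dependence concrete, here is a minimal sketch in Python of
the two models. It is a toy, not LLD code: the Input record and the function
names link_unix and link_indexed are made up for illustration, each archive
is treated as a single unit (real linkers pull in individual archive
members), and groups are left out.

```python
from collections import namedtuple

# Hypothetical model of a linker input: the symbols it defines and the
# symbols it references. An archive is simplified to a single unit.
Input = namedtuple("Input", "name is_archive defines needs")


def link_unix(inputs):
    """Unix-style model: visit each input once, in command-line order."""
    defined, undefined = set(), set()
    for f in inputs:
        # An archive is only pulled in if it defines a symbol that is
        # undefined at the moment the linker visits it.
        if f.is_archive and not (f.defines & undefined):
            continue
        defined |= f.defines
        undefined |= f.needs
        undefined -= defined
    return undefined  # symbols left unresolved


def link_indexed(inputs):
    """PE/COFF-style model: index all symbols first, then resolve."""
    index = {}
    for f in inputs:
        for sym in f.defines:
            index.setdefault(sym, f)  # first definition wins in this toy

    defined, undefined, loaded = set(), set(), set()
    work = [f for f in inputs if not f.is_archive]  # objects always load
    while work:
        f = work.pop()
        if f.name in loaded:
            continue
        loaded.add(f.name)
        defined |= f.defines
        for sym in f.needs:
            provider = index.get(sym)
            if provider is None:
                undefined.add(sym)
            elif provider.name not in loaded:
                work.append(provider)
    return undefined - defined


# Toy inputs mirroring the "-lc foo.o" example.
libc = Input("libc.a", True, frozenset({"printf"}), frozenset())
foo = Input("foo.o", False, frozenset({"main"}), frozenset({"printf"}))

print(link_unix([libc, foo]))     # library first: printf stays undefined
print(link_unix([foo, libc]))     # object first: everything resolves
print(link_indexed([libc, foo]))  # order does not matter in this model
```

In this toy, only link_unix gives a different answer when the two inputs
are swapped, which is the incompatibility described above.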

> Also, could some of the symbol resolution be done at compile time?
> E.g. If I include stdio.h, I know which link time library that is
> associated with, so I can resolve those symbols at compile time.
> Maybe we could store that information in the pre-compiled headers file
> format, and subsequently in the .o files.
> This would then leave far fewer symbols to resolve at link time.
>

You can link against an alternative libc, for example, so that's not
usually doable.