[llvm] r211315 - Use lib/LTO directly in the gold plugin.

Tue Jul 1 13:39:55 PDT 2014

On Jun 19, 2014, at 6:05 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> On 19 June 2014 19:28, Nick Kledzik <kledzik at apple.com> wrote:
>> Rafael,
>> 
>> What are your thoughts on how LTO support will be added to lld?
> 
> They are fuzzy with regards to the actual implementation. Probably the
> only strong opinion so far is that tools/lto should not be used.
> 
> What I do have is a list of desirable properties that seem possible
> with gold+plugin. I want to experiment with that to get a better
> feeling of what an implementation in lld should look like. Have you
> seen the talk on the last dev meeting? I went into some detail there,
> but some of the nice things I think we can get are
> 
> * Don't read the full bitcode until all of symbol resolution is done.
I don’t know enough about the bitcode format to know how lazy the expansion can
be, especially for the references to undefined symbols.
Another approach would be to add a new optional chunk to the start of bitcode files,
generated only in the -flto compiler mode.  The new chunk is just a list of the
symbols defined and referenced by the bitcode.  The linker parses only that
optional chunk to get everything it needs for symbol resolution.
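
A rough sketch of the optional-chunk idea, in Python for brevity. The layout
here (a "SYMS" tag followed by length-prefixed name lists) is made up for
illustration and is not an existing bitcode block; the point is only that the
linker can stop reading once it has the two lists.

```python
import struct

MAGIC = b"SYMS"  # hypothetical tag, not a real bitcode block ID


def write_summary(defined, referenced):
    """Pack the two name lists into one length-prefixed chunk."""
    body = b""
    for names in (defined, referenced):
        body += struct.pack("<I", len(names))
        for name in names:
            raw = name.encode("utf-8")
            body += struct.pack("<I", len(raw)) + raw
    return MAGIC + struct.pack("<I", len(body)) + body


def read_summary(blob):
    """Return (defined, referenced, payload_offset), or None if the chunk is absent."""
    if blob[:4] != MAGIC:
        return None
    (size,) = struct.unpack_from("<I", blob, 4)
    pos = 8
    lists = []
    for _ in range(2):
        (count,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        names = []
        for _ in range(count):
            (n,) = struct.unpack_from("<I", blob, pos)
            pos += 4
            names.append(blob[pos:pos + n].decode("utf-8"))
            pos += n
        lists.append(names)
    # Everything from payload_offset on is the real bitcode, left unparsed.
    return lists[0], lists[1], 8 + size


payload = b"<rest of bitcode, never touched here>"
file_bytes = write_summary(["_foo"], ["_bar"]) + payload
defined, referenced, payload_offset = read_summary(file_bytes)
```

Because the chunk is optional, a file without it makes read_summary return
None, and the linker would fall back to whatever it does today.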

> * Drop duplicated comdats/weak symbols even before passing the modules
> to lib/Linker.
It seems that to do this, you need to have already parsed the bitcode into functions
and then simply not add some of them to the merged module.  If we had the
symbol info chunk I described above, could we have a new variant of the
bitcode reader that takes a list of names not to extract?

> * Have at most 2 loaded bitcode files at once.
I can see wanting to do that to reduce memory growth.  But that is somewhat
at odds with the current parallel reading in lld of elf/mach-o/coff files. If we
had the symbol chunk, lld could do all its work on just proxy atoms based
on the symbol chunk.  Then, after it has been decided what to keep, lld can
serially expand each bitcode file, limited to the parts that are actually needed,
and merge that into the main module.
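
The two phases could look roughly like this. It is a sketch in Python, not
lld code: plan(), merge_serially() and the load callback are hypothetical
names, a "module" is just a dict of name to body, and "first definition wins"
stands in for real weak/comdat resolution.

```python
def plan(summaries):
    """Phase 1: pick one defining file per symbol, using only the summaries."""
    owner = {}
    for path, (defined, _referenced) in summaries.items():
        for name in defined:
            owner.setdefault(name, path)  # first definition wins, later ones dropped
    keep = {path: set() for path in summaries}
    for name, path in owner.items():
        keep[path].add(name)
    return keep


def merge_serially(keep, load):
    """Phase 2: expand one file at a time, copying only the kept definitions."""
    merged = {}
    for path, names in keep.items():
        if not names:
            continue  # nothing needed from this file, so it is never expanded
        module = load(path)  # the only expanded input alive at this point
        for name in names:
            merged[name] = module[name]
        del module  # drop it before the next file is expanded
    return merged


# Stand-in inputs: summaries as the symbol chunk would provide them.
summaries = {
    "a.bc": (["_foo", "_inline"], ["_bar"]),
    "b.bc": (["_inline"], []),
}
bodies = {
    "a.bc": {"_foo": "foo body", "_inline": "inline body from a"},
    "b.bc": {"_inline": "inline body from b"},
}
loaded = []


def load(path):
    loaded.append(path)  # record which files were actually expanded
    return bodies[path]


merged = merge_serially(plan(summaries), load)
```

In this toy run b.bc is never expanded at all, since its only definition is a
duplicate that was dropped during phase 1.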

For ease of adoption, the ld64 model has always been that you can mix and
match -flto files with non-lto files (mach-o).  But this has made the linker’s job
much more difficult.  For instance, suppose you have a bitcode file which defines
a bunch of functions.  One of the functions is _foo, and _foo calls _bar.  In fact,
it is the only use of _bar in the bitcode.  When ld64 reads the bitcode, it sees
the use of _bar, so the linker looks and finds _bar defined in a mach-o archive,
so the linker loads that member, which in turn drags in more stuff.  Now
when libLTO code generation happens, it turns out that _foo is not used, so
libLTO removes _foo.  But now the linker has _bar (and everything it dragged in),
which is unnecessary, so the linker has to do another pass at dead code
stripping.  This is all because libLTO provides a very coarse-grained view of the
contents of a bitcode file. But to get a finer-grained view, the bitcode needs to be
fully parsed early on, which slows down the link…
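
To make the _foo/_bar case concrete, here is a toy model in Python. Only _foo
and _bar come from the example above; _main and _baz are invented so there is
a live root and something for _bar to drag in. The call graphs are plain
dicts, not anything libLTO exposes.

```python
def reachable(roots, calls):
    """Names reachable from roots by following the call graph."""
    seen, work = set(), list(roots)
    while work:
        name = work.pop()
        if name not in seen:
            seen.add(name)
            work.extend(calls.get(name, ()))
    return seen


# Per-function references inside the bitcode file (known only after a full parse).
bitcode_calls = {"_main": [], "_foo": ["_bar"]}
# References among the mach-o archive members.
archive_calls = {"_bar": ["_baz"], "_baz": []}

# Coarse view: every undefined symbol of the file counts as needed.
coarse_undefs = {
    callee
    for callees in bitcode_calls.values()
    for callee in callees
    if callee not in bitcode_calls
}
loaded_coarse = reachable(coarse_undefs, archive_calls)

# Fine view: only references made by live functions count.
live = reachable(["_main"], bitcode_calls)
fine_undefs = {
    callee
    for func in live
    for callee in bitcode_calls[func]
    if callee not in bitcode_calls
}
loaded_fine = reachable(fine_undefs, archive_calls)

# What the extra dead-strip pass has to throw away again.
wasted = loaded_coarse - loaded_fine
```

With the coarse view the linker loads both _bar and _baz and strips them
later; with the fine view it loads neither, but only because it had the
per-function edges up front.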

-Nick