[rfc][gold plugin] Fix pr19901

Tue Aug 5 16:43:28 PDT 2014

On 9 July 2014 11:29, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> I am going on vacation tonight, so I don't think I will have time to
> finish this until I get back (20th), but I wanted to send out a
> preview of the patch so that others can take a look.
>
> The fundamental issue with that bug is the api mismatch between gold
> and our own LTO api. The gold api talks about "symbol foo in file
> bar". Our api talks about "symbol foo in the merged module".
>
> In that particular testcase, what happens is that gold asks us to keep
> a particular linkonce_odr symbol, but the IR linker doesn't copy it to
> the merged module and we never have a chance to ask lib/LTO to keep
> it.
>

Why is that not the bug? The IR linker can't assume that all callers are
visible; it only knows that it is merging two modules. A flag that tells
the linker it can see the whole program, and may therefore drop linkonce
symbols, sounds like a nice optimization, but dropping them is not correct
in general.

> The fix, I think, is to have a more direct implementation of the gold
> api. If it asks us to keep a symbol, we change the linkage so it is
> not linkonce. If it says we can drop a symbol, we do so. All of this
> before we even send the module to lib/Linker.
>
> Another advantage is that we can drop some linkonce_odr symbols from
> the symbol table without having to read the entire module upfront,
> only after symbol resolution is done.
>
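
To make the proposal quoted above concrete, here is a minimal sketch of
"apply the linker's per-file verdicts before the module reaches
lib/Linker". It is a toy model only: Linkage, Resolution, Global, Module
and applyResolutions are hypothetical stand-ins, not LLVM or gold plugin
types, and promoting linkonce_odr to weak_odr for kept symbols is my
assumption about what "change the linkage so it is not linkonce" would
mean in practice.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration; these are NOT LLVM or gold types.
enum class Linkage { External, LinkOnceODR, WeakODR };

// Hypothetical verdict the linker gives for "symbol foo in file bar".
enum class Resolution { Keep, Drop };

struct Global {
  std::string name;
  Linkage linkage;
};

struct Module {
  std::string file;             // the input file this module came from
  std::vector<Global> globals;
};

// Verdicts are keyed by (file, symbol), matching how the linker talks.
using Resolutions =
    std::map<std::pair<std::string, std::string>, Resolution>;

// Trim and adjust one module before it is handed to the IR linker.
void applyResolutions(Module &m, const Resolutions &res) {
  std::vector<Global> kept;
  for (Global g : m.globals) {
    auto it = res.find({m.file, g.name});
    if (it != res.end() && it->second == Resolution::Drop)
      continue;                       // the linker said we can drop it
    if (it != res.end() && it->second == Resolution::Keep &&
        g.linkage == Linkage::LinkOnceODR)
      g.linkage = Linkage::WeakODR;   // assumption: no longer discardable
    kept.push_back(g);                // no verdict: leave it untouched
  }
  m.globals = std::move(kept);
}
```

The point of keying on (file, symbol) is that the decision is taken per
input file, so nothing depends on what the merged module ends up
containing.
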
> The attached patch is quite a hack. The final intention is to
> refactor lib/LTO to expose the API we need, not to copy and paste code
> as the patch does :-)
>
> The patch is also missing some option handling (setting codegen
> attributes for example) and lto logic (don't internalize memcpy for
> example).
>
> Some interesting properties:
>
> * During symbol resolution we use a temporary LLVMContext and do lazy
> module loading. This allows us to keep the minimum possible amount of
> allocated memory around. This should also allow as much parallelism as
> we want, since there is no shared context.
>
> * During codegen, we can use a plain Module and trim it before
> linking. This will need some small changes to handle non-elf mangling,
> but I don't expect it to be too hard.
>
> * Since we know what to do with each global, it should be possible to make
> the reading parallel, but here LLVMContext synchronization would be
> an issue.
>
> A similar organization is hopefully possible in lld.
>
> I just coded enough for it to bootstrap with lto enabled. Next items on my
> list:
>
> * Vacations :-)
> * Feature parity and LTO refactoring to avoid code duplication. With
> this it should already be a strict improvement.
> * Add support for COMDATs now that we have them in the IR. This should
> allow us to use gcc style comdats for c++ constructors and
> destructors.
> * Try to support bfd ld with COFF, as a testcase where mangling is
> relevant.
> * Try to improve how we represent lazy function bodies to make this
> more efficient.
> * Figure out a better interface with lib/Linker, since right now it is
> "fighting" with the plugin to see who is in charge of symbol
> resolution.
> * Implement lazy loading of metadata.
>
> Cheers,
> Rafael
>