[llvm-commits] [patch][gold plugin] Don't internalize symbols in objects that we will use as pass-through

Devang Patel devang.patel at gmail.com
Tue Jun 22 22:35:08 PDT 2010


On Tue, Jun 22, 2010 at 4:22 PM, Cary Coutant <ccoutant at google.com> wrote:
>> The last remaining issue that I know of is that gold can ask us to
>> load a file because it is referenced only from the IR. We then
>> internalize the symbols in that file and run codegen. Codegen can then
>> create new undefined references to symbols in those files, which
>> forces gold to fetch them again (from the native code this time).
>
> I'm not sure I understand the scenario here. As I understand it, you
> have libgcc and libc in archive form, and full of IR objects. You have
> a reference to a symbol in one of those libraries in your IR code, so
> gold is loading a .o from the archive, and your plugin is claiming it.
> During LTO, you determine you don't actually need the symbol (because
> you ended up inlining it, perhaps, and it was IRONLY), so you drop the
> definition only to have later codegen generate a call to that routine,
> assuming that it's a standard library routine that can be found in
> object form. Is this about right?
>
>> There are a number of problems with this:
>> *) Gold doesn't implement this all that well. It still has in its
>> symbol table that it loaded a file defining that symbol.
>
> What's still in the symbol table is a placeholder symbol, though. You
> mean that it won't load a definition for that symbol from an archive
> of real objects? That could probably be fixed, although it seems like
> the compiler really ought to be responsible for providing the
> definition.
>
>> *) We would end up with multiple copies of some symbols.
>
> How? If you have references to A and B in the IR, and you define A in
> a replacement file, but then search a real archive where A and B are
> both defined in the same object, yes, you could get a multiple
> definition. If this is the case, you should be breaking up the archive
> library into finer-grained objects.
>
>> *) LLVM can use smaller chunks of files than gold. Consider the case
>> of a file defining functions foo and bar. Function foo is used from
>> ELF, and so we don't internalize it. Function bar is not used at that
>> time and we drop it. Now codegen introduces an undefined reference to
>> bar. What should gold do? Bringing in that file will fail because we
>> will have two visible definitions of foo. Not doing so will fail
>> because there is nowhere else to find bar.
>
> Right. What should gold do? Ideally, LTO would be able to anticipate
> any low-level routines needed by the late stages of codegen, and
> provide definitions of those routines in the replacement files it adds
> to the link.

... or ... the LLVM optimizer could refrain from optimizing away any
symbol to which LLVM codegen may introduce a reference (how to arrange
this can be considered an LLVM-internal detail). The traditional linker
can dead-strip such symbols later on if the LLVM codegen phase does not
end up referencing them.
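
To make the suggestion concrete, here is a rough sketch of the decision
I have in mind, written as a small Python model rather than against the
real LLVM or gold plugin interfaces. The function name, its parameters,
and the contents of CODEGEN_RUNTIME_SYMBOLS are all made up for
illustration; the real set of symbols would depend on the target and on
what codegen can emit calls to.

```python
# Illustrative only: a few library routines that codegen can emit calls
# to. The real list is target-dependent and is an assumption here.
CODEGEN_RUNTIME_SYMBOLS = frozenset({"memcpy", "memset", "__udivdi3"})


def symbols_to_internalize(defined, referenced_from_native,
                           codegen_may_reference=CODEGEN_RUNTIME_SYMBOLS):
    """Return the defined symbols that are safe to internalize.

    A symbol stays externally visible if native (ELF) code refers to it,
    or if codegen might introduce a reference to it later. Everything
    else can be internalized and optimized away as usual.
    """
    defined = set(defined)
    keep_visible = set(referenced_from_native) | (defined & set(codegen_may_reference))
    return defined - keep_visible


if __name__ == "__main__":
    defined = {"foo", "bar", "memcpy", "helper"}
    # foo is used from ELF; memcpy might be called by late codegen.
    print(sorted(symbols_to_internalize(defined, {"foo"})))
```

With this arrangement memcpy survives the optimizer even when no IR
refers to it, and the linker remains free to dead-strip it if codegen
never emits the call.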

> Adding a low-level library as a catch-all replacement
> file at the end of the link is just a hack, but to make that work, you
> need to have such symbols each in their own object file, so gold can
> load just what it needs without being forced to load symbols it
> doesn't need.
>
>> The best solution I could find is to disable internalize for any
>> functions defined in a library that is passed through. The attached
>> patch does this. It can be optimized, but I am not sure if that is
>> worth it, since normally there are only two libraries being passed
>> through (libgcc and libc).
>
> By "pass through," I assume you mean those low-level libraries that
> provide the functions that might be called by code generated late
> (after LTO analysis). Does this mean that gold first sees these
> libraries as IR files, then the plugin turns around and adds the
> "real" ELF equivalents as replacement files in order to catch these
> references introduced by late codegen? If you decline to internalize
> these symbols during LTO analysis, what's the point of providing them
> as IR files in the first place? (Maybe I'm not clear on what
> "internalize" means.)
>
> -cary


-
Devang


