[patch] First step to fix pr11866 during LTO

Rafael Espíndola rafael.espindola at gmail.com
Thu Sep 5 17:16:36 PDT 2013


> Ok.  I think I finally see what you are getting at.   The normal rule when building shared libraries (.dylib) is that all external symbols (not hidden or static) must be kept.  For instance, the darwin linker will not dead strip away an unused external function in a dylib (but it would dead strip it away in a main executable) because there might be some dynamic client of that symbol.
>
> When that rule is applied to LTO, it creates two cases when the linker calls lto_codegen_add_must_preserve_symbol():
> 1) the symbol is referenced by native code or the command line, or
> 2) a dylib is being built and the symbol is external.
> You want to differentiate those two cases, by continuing to call lto_codegen_add_must_preserve_symbol() for the first case and calling lto_codegen_add_symtab_symbol() instead for the second.

Correct.
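
To make the two cases concrete, here is a minimal sketch of the
linker-side decision. It is only a model: LinkerSymbol, Request and
classify are hypothetical names invented for illustration, not part of
libLTO or any linker. MustPreserve stands for a call to
lto_codegen_add_must_preserve_symbol() and SymtabOnly for a call to
the proposed lto_codegen_add_symtab_symbol().

```cpp
#include <string>

// Hypothetical model of what the linker knows about one symbol.
struct LinkerSymbol {
    std::string name;
    bool referencedOutsideLTO; // used by native code or named on the command line
    bool external;             // not hidden and not static
};

// Which call (if any) the linker would make for a symbol.
enum class Request { None, MustPreserve, SymtabOnly };

Request classify(const LinkerSymbol &sym, bool buildingDylib) {
    // Case 1: something outside the LTO unit really needs the symbol.
    if (sym.referencedOutsideLTO)
        return Request::MustPreserve;
    // Case 2: the symbol is only kept because it goes to the symbol table.
    if (buildingDylib && sym.external)
        return Request::SymtabOnly;
    // Otherwise the linker says nothing and LLVM is free to internalize.
    return Request::None;
}
```

Case 1 takes priority, so a symbol that is both referenced from native
code and external in a dylib is still reported as must-preserve.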

> But calling lto_codegen_add_must_preserve_symbol on all external symbols when linking dylibs has always been expensive (lots of calls and LTO must then map the string name to a Value object).  Wouldn’t it be more efficient to just have one new function called just once:
>
>     void lto_codegen_preserve_global_symbols();
>
> and have the linker stop calling lto_codegen_add_must_preserve_symbol() in case 2.   The LTO engine would then use that bit to do what you are doing for all external linkonce_odr symbols.

Yes, I think it would work. The semantics would be

* If lto_codegen_preserve_global_symbols was not called, there is no
symbol table to worry about (executable and no -export-dynamic). LLVM
should keep only the symbols passed to
lto_codegen_add_must_preserve_symbol and internalize everything else.
* If lto_codegen_preserve_global_symbols is called, LLVM knows that
the globals are going to a symbol table. It must preserve everything
passed to lto_codegen_add_must_preserve_symbol as before. It can
internalize globals only if it knows they can be dropped from the
symbol table.
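
A rough sketch of those semantics, as a self-contained model rather
than actual LLVM code. Global, CodegenState and canInternalize are
hypothetical names, and droppableFromSymtab is a stand-in for whatever
analysis decides that a global can leave the symbol table (for example
a linkonce_odr whose address is not relevant); that analysis is assumed
here, not implemented.

```cpp
#include <set>
#include <string>

// Hypothetical model of a global as seen by the internalize step.
struct Global {
    std::string name;
    bool droppableFromSymtab; // assumption: result of a separate analysis
};

// Hypothetical state accumulated from the linker's calls.
struct CodegenState {
    std::set<std::string> mustPreserve;  // lto_codegen_add_must_preserve_symbol
    bool preserveGlobalSymbols = false;  // proposed lto_codegen_preserve_global_symbols
};

bool canInternalize(const Global &g, const CodegenState &st) {
    // Explicitly preserved symbols are always kept.
    if (st.mustPreserve.count(g.name))
        return false;
    // No symbol table to worry about: internalize everything else.
    if (!st.preserveGlobalSymbols)
        return true;
    // Going to a symbol table: internalize only what can be dropped from it.
    return g.droppableFromSymtab;
}
```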

I think I still like the option where the linker provides a full list,
for a few reasons:

* The information is easy to get in the gold plugin. I am not sure how
easy it is to find out whether export-dynamic is in effect, for example.
* I can't think of a case right now, but if for some reason the linker
knows that a global can be internalized, it can just not call any
function on it.
* Both the linker and LLVM will be figuring out that they can
"internalize" hidden functions.
* Internalize has a std::set<std::string>, so there we have lower
hanging fruit if internalize itself is too slow right now.

>
> Alternately, could the linker just stop calling lto_codegen_add_must_preserve_symbol() on weak external symbols when building dylibs?

No, check the case where the only difference is the address being
relevant or not. "bar" has to be in the symbol table. There is also
plain __attribute__((weak)); we have to keep those too, but to the
linker they look the same as a C++ inline function.

Cheers,
Rafael



