[LLVMdev] [RFC] Linkage of user-supplied library functions in LTO

Duncan P. N. Exon Smith dexonsmith at apple.com
Mon Mar 10 09:49:39 PDT 2014


On Mar 8, 2014, at 3:43 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> I believe it doesn't matter if the symbols in sections are internal or external---that only matters for symbol resolution.

Given this...

> I've read the original thread (the 3 emails), and I'm still not sure what the purpose of internalization is in the context of user-provided library functions.

...I’m not sure there is a point.

The general idea is: unless the linker has told us to preserve a symbol,
internalize it, exposing it to other optimizations (like -globalopt).
However, for library functions, this breaks down because later passes
insert calls (e.g., -instcombine converts printf => puts, and
-codegenprepare converts llvm.memcpy => memcpy).  So we add them to
@llvm.compiler.used to protect them temporarily.
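
To make the policy concrete, here is a minimal sketch of it as a toy
model.  Everything in it (ToySymbol, ToyModule, internalizeToy) is a
hypothetical stand-in invented for illustration, not LLVM's types or the
actual -internalize code.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins; these are NOT LLVM classes.
enum class Linkage { External, Internal };

struct ToySymbol {
  std::string name;
  Linkage linkage;
};

struct ToyModule {
  std::vector<ToySymbol> symbols;
  std::set<std::string> compilerUsed;  // models @llvm.compiler.used
};

// Internalize everything the linker did not ask us to preserve.  Library
// functions are internalized too, but they are also recorded in
// compilerUsed so that a later pass inserting a call still finds them.
void internalizeToy(ToyModule &m,
                    const std::set<std::string> &mustPreserve,
                    const std::set<std::string> &libraryFunctions) {
  for (ToySymbol &s : m.symbols) {
    assert(!s.name.empty() && "toy model only handles named symbols");
    if (mustPreserve.count(s.name))
      continue;  // the linker needs this symbol; leave it external
    s.linkage = Linkage::Internal;
    if (libraryFunctions.count(s.name))
      m.compilerUsed.insert(s.name);  // temporary protection
  }
}
```

In this model a user-supplied puts ends up with internal linkage *and*
in compilerUsed, which is the combination the rest of this mail is about.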

If

  - the linker (e.g., /bin/ld) will delete unreferenced symbols (through
    -dead_strip, etc.) only if they have local linkage, or

  - LTO has a pass that will delete unreferenced symbols with local
    linkage *after* @llvm.compiler.used gets dropped (maybe we can add
    this),

then there’s a point.
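
A rough sketch of what the second option might look like, again as a toy
model.  ToyGlobal and dropUsedAndStrip are hypothetical names, and the
use counts are assumed to be given; whether LTO should have such a pass
is exactly the open question.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a global; NOT an LLVM class.
struct ToyGlobal {
  std::string name;
  bool hasLocalLinkage;
  int useCount;  // references from elsewhere in the module (assumed known)
};

// Drop the temporary protection, then delete local symbols that nothing
// references.  Returns the names that were removed.
std::vector<std::string> dropUsedAndStrip(std::vector<ToyGlobal> &globals,
                                          std::set<std::string> &compilerUsed) {
  compilerUsed.clear();  // models dropping @llvm.compiler.used
  std::vector<std::string> removed;
  std::vector<ToyGlobal> kept;
  for (const ToyGlobal &g : globals) {
    assert(g.useCount >= 0);
    if (g.hasLocalLinkage && g.useCount == 0)
      removed.push_back(g.name);  // unreferenced and local: safe to delete
    else
      kept.push_back(g);          // referenced, or visible to the linker
  }
  globals.swap(kept);
  return removed;
}
```

An unreferenced symbol with external linkage survives here, which is why
internalizing first would matter.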

> If the output of LTO is one giant object file, it could make some sense (since the assembler could potentially do the "symbol resolution").

The output of LTO *is* one giant object file, but the linker (e.g.,
/bin/ld) may be linking it with other object files.

> Otherwise the problem is in telling the "ld" which definition of "printf" it needs to pick up,

In the LTO API, the linker should call
lto_codegen_add_must_preserve_symbol() on symbols it expects to come out
the other side.  Basically, the user of LTO decides which version of
printf to pick up.  If there are any calls to printf from outside the
bitcode and the linker is using the one in the bitcode, then the one in
the bitcode won’t be internalized.
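
As a sketch of the linker's side of that contract (a toy model, not how
any particular linker does it): ToyResolution and symbolsToPreserve are
hypothetical, and a real linker would also account for entry points,
exported symbols, and so on.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of what the linker knows after symbol resolution.
struct ToyResolution {
  // Symbols whose chosen definition lives in the bitcode.
  std::set<std::string> definedInBitcode;
  // Symbols referenced from outside the bitcode (e.g., native objects).
  std::set<std::string> referencedOutside;
};

// Names the linker would pass, one at a time, to
// lto_codegen_add_must_preserve_symbol() before asking LTO to compile.
std::vector<std::string> symbolsToPreserve(const ToyResolution &r) {
  std::vector<std::string> out;
  for (const std::string &name : r.referencedOutside) {
    assert(!name.empty());
    if (r.definedInBitcode.count(name))
      out.push_back(name);  // outside code depends on the bitcode's copy
  }
  return out;
}
```

If printf is defined in the bitcode and referenced from a native object,
it shows up in the result and so would not be internalized.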

> or asking the user not to link the program with libc (a bit of a questionable request).

A common case for user-supplied library functions is that users cannot
link against libc, so they supply their own.  This shouldn’t be the only
supported case, though.

> How are optimizations "incorrectly modifying" (user-provided) library functions?

The current problem is that -instcombine will rename the function through
Module::getOrInsertFunction().  getOrInsertFunction() chooses this path
because the function has local linkage.  However, the function is a
member of @llvm.compiler.used, so it shouldn’t really be modified.

I think in the normal case (non-LTO, where -internalize hasn’t run),
Module::getOrInsertFunction() *should* take this path for functions that
have local linkage.  And it’s not trivial to check for membership in
@llvm.compiler.used.
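
For anyone who hasn't looked at that path, here is a simplified toy
model of the renaming behaviour described above.  It is not the LLVM
source: ToyFunction, ToyFunctionTable, and the ".<n>" suffix scheme are
all made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a function; NOT an LLVM class.
struct ToyFunction {
  int id;                // identity, so renames can be tracked
  bool hasLocalLinkage;
  bool isDeclaration;
};

struct ToyFunctionTable {
  std::map<std::string, ToyFunction> byName;
  int nextId = 0;

  void define(const std::string &name, bool local) {
    byName[name] = ToyFunction{nextId++, local, false};
  }

  // First unused name of the form "<base>.<n>" (suffix scheme is made up).
  std::string freshName(const std::string &base) const {
    for (int n = 1;; ++n) {
      std::string candidate = base + "." + std::to_string(n);
      if (!byName.count(candidate))
        return candidate;
    }
  }

  // Models the path discussed above: an existing function with local
  // linkage is moved out of the way, and a fresh external declaration
  // takes over the requested name.
  ToyFunction &getOrInsert(const std::string &name) {
    assert(!name.empty());
    auto it = byName.find(name);
    if (it != byName.end() && it->second.hasLocalLinkage) {
      ToyFunction local = it->second;
      byName.erase(it);
      byName[freshName(name)] = local;  // the local definition is renamed
      it = byName.end();
    }
    if (it == byName.end())
      it = byName.emplace(name, ToyFunction{nextId++, false, true}).first;
    return it->second;
  }
};
```

Note that nothing in getOrInsert consults a compiler.used set, so an
internalized user-supplied puts gets renamed and the inserted call binds
to a new declaration instead.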


