[PATCH] D22356: [ThinLTO] Perform conservative weak/linkonce resolution in distributed backend case

Mon Jul 18 21:52:16 PDT 2016

> On Jul 18, 2016, at 9:24 PM, Xinliang David Li <davidxl at google.com> wrote:
> 
> > We use --start-lib/--end-lib internally instead of regular objects. So it is not a matter of justification, it is a matter of keeping that working.
> 
> 
> I don't believe this is relevant: the fact that the first link is taking libraries as an input does not make it a compelling case to use them for the second link. Static libraries or start-lib/end-lib are a specific semantic model, and I believe it is just wrong to pass them to the final link.
> 
> The reason is that the first link is performing linker resolution: this decision process carries specific semantics with archives. After this resolution and the ThinLTO process, I see no reason right now to repeat this process.
> 
> It is possible that this is because I have a different mental model of static archives right now. AFAIK, the semantic difference between plain objects and archives is that an object in an archive is loaded and selected by the linker only if one of the symbols it defines is referenced.
> 
> Agree.
>  
> 
> Keeping the linker semantics with ThinLTO means that the objects and symbols selected during the first link should be the "source of truth": i.e. we don't want a different linker resolution during the second link. Every object that was selected for the first link should be included in the second link (hence it is wrong to use --start-lib/--end-lib).
> 
> 
> 
> However, I don't see it being wrong to allow the second link to do a final linker resolution.

Well, let me try to convince you (or maybe you’ll point out some flaw in my reasoning).
I wrote a simple example to illustrate my thinking, using a common pattern in llvm: a main() in A.cpp and a component that auto-registers itself (like an llvm pass, for instance) in B.cpp.

$ cat A.cpp 
#include <stdio.h>
void registration(const char *component) {
	printf("Registering %s\n", component);
}
void initB(); // defined in B.cpp
int main() {
	initB();
}
$ cat B.cpp 
void registration(const char *component);
static bool AutoRegisterThisFileInGlobalRegistry = [] { registration(__FILE__); return true; }();
void initB() { /* whatever */ } // called from A.cpp
$

Let’s build it with ThinLTO and run it:

$ clang -std=c++11 -c -flto=thin A.cpp 
$ clang -std=c++11 -c -flto=thin B.cpp 
$ ar -rs B.a B.o
$ clang -flto=thin A.o B.a -o first_link
$ ./first_link
Registering B.cpp
$

Great, our global registration in the static archive works as expected.

Now to simulate the second link, I need to get the object file after ThinLTO processing during the first link:

$ clang -flto=thin A.o B.a -o first_link -Wl,-save-temps
$ file first_link.0.thinlto.o first_link.1.thinlto.o 
first_link.0.thinlto.o: Mach-O 64-bit object x86_64
first_link.1.thinlto.o: Mach-O 64-bit object x86_64
$

Let's see the two options for the second link in practice, first by passing these objects directly to the linker:

$ clang -std=c++11 -flto=thin -o second_link first_link.0.thinlto.o first_link.1.thinlto.o 
$ ./second_link 
Registering B.cpp
$

Same result, great! 
Now what about packaging the second object (corresponding to B.o) in a library and performing the second link with that archive:

$ ar -rs B.after_thinlto.a  first_link.1.thinlto.o 
$ clang -std=c++11 -flto=thin -o second_link_with_archive first_link.0.thinlto.o B.after_thinlto.a
$ ./second_link_with_archive
$

Too bad, we broke the registration.
Because initB() was inlined into main, we lost the reference to B.o and it is not loaded anymore.

> The fact is that there is one single source of truth as you said -- a program can be linked from different sets of object files with different optimization levels (e.g., O0 and O2 differ due to inlining differences).
> 
> Of course, to enable link resolution in the second link, some of the first link's decisions need to be undone -- Teresa's patch is essentially doing that.
> 
>  
> Also, the distributed build system probably needs to handle the case where an object in the archive was not selected to be part of the link at all, won't be processed by ThinLTO, and there won't be any object to pass to the final link. I'm not sure how you're handling this with gold right now though.
> 
> No native object file will be generated for such an object in this case, so it has no use for the final link?

The point I was trying to address is that somehow the build system has to adapt the second linker invocation and can’t just reuse the exact same invocation as the first one.

— 
Mehdi