[PATCH] D22356: [ThinLTO] Perform conservative weak/linkonce resolution in distributed backend case

Mon Jul 18 22:40:32 PDT 2016

Good example.  I think I am convinced :)

David

On Mon, Jul 18, 2016 at 9:52 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Jul 18, 2016, at 9:24 PM, Xinliang David Li <davidxl at google.com> wrote:
>
> > We use --start-lib/--end-lib internally instead of regular objects. So
>> it is not a matter of justification, it is a matter of keeping that working.
>>
>>
>> I don't believe this is relevant: the fact that the first link is taking
>> libraries as an input does not make it a compelling case to use them for
>> the second link. Static libraries or start-lib/end-lib are a specific
>> semantic model, and I believe it is just wrong to pass them to the final
>> link.
>>
>> The reason is that the first link is performing linker resolution: this
>> decision process carries some specific semantics with archives. After this
>> resolution and the ThinLTO process, there is no reason that makes sense to
>> me right now to repeat this process.
>>
>> It is possible that it is because I have a different mental model of
>> static archives right now. AFAIK, the semantic difference between plain
>> objects and archives is that an object defined in an archive is loaded and
>> selected by the linker only if one of the symbols it defines is referenced.
>>
>
> Agree.
>
>
>>
>> Keeping the linker semantic with ThinLTO means that the objects and
>> symbols selected during the first link should be the "source of truth":
>> i.e. we don't want a different linker resolution during the second link.
>> Every object that was selected for the first link should be included in
>> the second link (hence it is wrong to use --start-lib/--end-lib).
>>
>>
>
> However, I don't see it being wrong to allow second link to do a final
> linker resolution.
>
>
> Well, let me try to convince you (or maybe you'll point out some flaw in my
> reasoning).
> I wrote a simple example to illustrate my thought, taking a common pattern
> in llvm: a main() in A.cpp and some component that auto-registers
> itself (like an llvm pass for instance) in B.cpp.
>
>
> $ cat A.cpp
> #include <stdio.h>
> void registration(const char *component) {
>   printf("Registering %s\n", component);
> }
> void initB(); // defined in B.cpp
> int main() {
>   initB();
> }
> $ cat B.cpp
> void registration(const char *component);
> static bool AutoRegisterThisFileInGlobalRegistry = []
>   { registration(__FILE__); return true; }();
> void initB() { /* whatever */ } // called from A.cpp
> $
>
>
> Let’s build it with ThinLTO and run it:
>
> $ clang -std=c++11 -c -flto=thin A.cpp
> $ clang -std=c++11 -c -flto=thin B.cpp
> $ ar -rs B.a B.o
> $ clang -flto=thin A.o B.a -o first_link
> $ ./first_link
> Registering B.cpp
> $
>
> Great, our global registration in the static archive works as expected.
>
> Now to simulate the second link, I need to get the object file after
> ThinLTO processing during the first link:
>
> $ clang -flto=thin A.o B.a -o first_link -Wl,-save-temps
> $ file first_link.0.thinlto.o first_link.1.thinlto.o
> first_link.0.thinlto.o: Mach-O 64-bit object x86_64
> first_link.1.thinlto.o: Mach-O 64-bit object x86_64
> $
>
> Let's see the two options in practice for the second link, first by passing
> these objects directly to the linker:
>
>
> $ clang -std=c++11 -flto=thin -o second_link \
>     first_link.0.thinlto.o first_link.1.thinlto.o
> $ ./second_link
> Registering B.cpp
> $
>
> Same result, great!
> Now what about packaging the second object (corresponding to B.o) in a
> library and performing the second link:
>
> $ ar -rs B.after_thinlto.a  first_link.1.thinlto.o
> $ clang -std=c++11 -flto=thin -o second_link_with_archive \
>     first_link.0.thinlto.o B.after_thinlto.a
> $ ./second_link_with_archive
> $
>
> Too bad, we broke the registration.
> Because initB() was inlined into main, we lost the reference to B.o and it
> is not loaded anymore.
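
The effect above can be reproduced with a toy model of archive semantics. The sketch below is only an illustration under stated assumptions: `ObjectFile`, `selectObjects`, `exampleInputs` and `checkExample` are hypothetical names, not part of any linker, and the symbol sets are hand-written from the A.cpp/B.cpp example rather than read from real object files. It assumes plain objects are always loaded and archive members are loaded only when they define a symbol that is still undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of one link input (not a real linker API).
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;    // symbols this object defines
    std::set<std::string> undefined;  // symbols this object references
    bool inArchive;                   // true: loaded lazily, like an archive member
};

// Returns the names of the objects that end up in the link.
// Plain objects are always loaded; archive members are loaded only if they
// define a symbol that is referenced and not yet defined.
std::vector<std::string> selectObjects(const std::vector<ObjectFile>& inputs) {
    std::set<std::string> defined, wanted;
    std::vector<bool> loaded(inputs.size(), false);

    auto load = [&](std::size_t i) {
        loaded[i] = true;
        defined.insert(inputs[i].defined.begin(), inputs[i].defined.end());
        wanted.insert(inputs[i].undefined.begin(), inputs[i].undefined.end());
    };

    // Step 1: every plain object is part of the link unconditionally.
    for (std::size_t i = 0; i < inputs.size(); ++i)
        if (!inputs[i].inArchive) load(i);

    // Step 2: pull in archive members until no new member is needed.
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < inputs.size(); ++i) {
            if (loaded[i]) continue;
            for (const auto& sym : inputs[i].defined) {
                if (wanted.count(sym) && !defined.count(sym)) {
                    load(i);
                    changed = true;
                    break;
                }
            }
        }
    }

    std::vector<std::string> result;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        if (loaded[i]) result.push_back(inputs[i].name);
    return result;
}

// Hand-written symbol tables for the A.cpp/B.cpp example (an assumption of
// this sketch). initBInlined models the state after ThinLTO inlined initB()
// into main(), which removes A's reference to initB.
std::vector<ObjectFile> exampleInputs(bool initBInlined, bool packBInArchive) {
    ObjectFile a{"A.o", {"main", "registration"}, {"printf"}, false};
    if (!initBInlined) a.undefined.insert("initB");
    ObjectFile b{"B.o", {"initB"}, {"registration"}, packBInArchive};
    return {a, b};
}

// Usage: the three situations discussed in this thread.
void checkExample() {
    // First link: B.o is in an archive but initB is still referenced.
    assert(selectObjects(exampleInputs(false, true)).size() == 2);
    // Second link with an archive: the reference is gone, B.o is dropped.
    assert(selectObjects(exampleInputs(true, true)).size() == 1);
    // Second link with plain objects: B.o is kept regardless.
    assert(selectObjects(exampleInputs(true, false)).size() == 2);
}
```

In this model the static initializer in B.o never creates a reference from A.o to B.o, so once the `initB` reference disappears nothing pulls the archive member in, which matches the lost registration in the archive case.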
>
>
>
>
> The fact is that there is one single source of truth as you said -- a
> program can be linked from different sets of object files with different
> optimization levels (e.g., O0 and O2 differ due to inlining differences).
>
> Of course, to enable link resolution in the second link, some of the first
> link's decisions need to be undone -- Teresa's patch is essentially doing
> that.
>
>
>
>> Also, the distributed build system probably needs to handle the case
>> where an object in the archive was not selected to be part of the link at
>> all, won't be processed by ThinLTO, and there won't be any object to pass
>> to the final link. I'm not sure how you're handling this with gold right
>> now though.
>>
>
> No native object file will be generated for such an object in this case, so
> it has no use for the final link?
>
>
> The point I was trying to address is that somehow the build system has to
> adapt the second linker invocation and can’t just reuse the exact same
> invocation as the first one.
>
> —
> Mehdi
>