[PATCH] D22356: [ThinLTO] Perform conservative weak/linkonce resolution in distributed backend case

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 19 06:12:14 PDT 2016


On Mon, Jul 18, 2016 at 10:40 PM, Xinliang David Li <davidxl at google.com>
wrote:

> Good example.  I think I am convinced :)
>

Yes, I can't think of another way to get this case to work without being
far more conservative about importing, unfortunately. Ok, let me figure out
what the best way is to transmit this info on to the final link through the
build system. Probably through a file that can be directly used with the @
linker option in place of input files.

Since the gold-plugin sees all bitcode files passed to gold, I need to
figure out how to determine which are included in the final link. Aha -
looks like eugenis recently added a way to do this to both gold and the
gold-plugin. More on that in my next reply...
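
As a minimal sketch of the file idea above (not the actual patch): assuming the build system already knows which native objects the first link selected, it could write them out one path per line and hand that file to the final link with @. The helper name make_response_file and the file names in the note below are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: keep only the inputs that the first link selected,
// preserving their original command-line order, one path per line.
// Assumes paths contain no whitespace; otherwise they would need quoting.
std::string make_response_file(const std::vector<std::string>& inputs,
                               const std::set<std::string>& selected) {
    std::string out;
    for (const std::string& path : inputs) {
        if (selected.count(path)) {
            out += path;
            out += '\n';
        }
    }
    return out;
}
```

The returned text would be written to a file such as final_link.rsp (an illustrative name) and passed to the linker as @final_link.rsp in place of the original input list.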


> David
>
> On Mon, Jul 18, 2016 at 9:52 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Jul 18, 2016, at 9:24 PM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>>> We use --start-lib/--end-lib internally instead of regular objects. So
>>> it is not a matter of justification, it is a matter of keeping that working.
>>>
>>>
>>> I don't believe this is relevant: the fact that the first link is taking
>>> libraries as an input does not make it a compelling case to use them for
>>> the second link. Static libraries or start-lib/end-lib are a specific
>>> semantic model, and I believe it is just wrong to pass them to the final
>>> link.
>>>
>>> The reason is that the first link performs linker resolution, and this
>>> decision process carries specific semantics for archives. After this
>>> resolution and the ThinLTO process, I currently see no reason to repeat
>>> this process.
>>>
>>> It is possible that it is because I have a different mental model of
>>> static archives right now. AFAIK, the semantic difference between plain
>>> objects and archives is that an object contained in an archive is loaded
>>> and selected by the linker only if one of the symbols it defines is
>>> referenced.
>>>
>>
>> Agree.
>>
>>
>>>
>>> Keeping the linker semantics with ThinLTO means that the objects and
>>> symbols selected during the first link should be the "source of truth":
>>> i.e. we don't want a different linker resolution during the second link.
>>> Every object that was selected for the first link should be included in
>>> the second link (hence it is wrong to use --start-lib/--end-lib).
>>>
>>>
>>
>> However, I don't see it being wrong to allow second link to do a final
>> linker resolution.
>>
>>
>> Well, let me try to convince you (or maybe you’ll point out some flaw in
>> my reasoning).
>> I wrote a simple example to illustrate my thought, taking a common
>> pattern in llvm: a main() in A.cpp and some component that
>> auto-registers itself (like an llvm pass for instance) in B.cpp.
>>
>>
>> $ cat A.cpp
>> #include <stdio.h>
>> void registration(const char *component) {
>>   printf("Registering %s\n", component);
>> }
>> void initB(); // defined in B.cpp
>> int main() {
>>   initB();
>> }
>> $ cat B.cpp
>> void registration(const char *component);
>> static bool AutoRegisterThisFileInGlobalRegistry = [] {
>>   registration(__FILE__);
>>   return true;
>> }();
>> void initB() { /* whatever */ } // called from A.cpp
>> $
>>
>>
>> Let’s build it with ThinLTO and run it:
>>
>> $ clang -std=c++11 -c -flto=thin A.cpp
>> $ clang -std=c++11 -c -flto=thin B.cpp
>> $ ar -rs B.a B.o
>> $ clang -flto=thin A.o B.a -o first_link
>> $ ./first_link
>> Registering B.cpp
>> $
>>
>> Great, our global registration in the static archive works as expected.
>>
>> Now to simulate the second link, I need to get the object file after
>> ThinLTO processing during the first link:
>>
>> $ clang -flto=thin A.o B.a -o first_link -Wl,-save-temps
>> $ file first_link.0.thinlto.o first_link.1.thinlto.o
>> first_link.0.thinlto.o: Mach-O 64-bit object x86_64
>> first_link.1.thinlto.o: Mach-O 64-bit object x86_64
>> $
>>
>> Let's see the two options in practice for the second link, first by
>> passing these objects directly to the linker:
>>
>>
>> $ clang -std=c++11 -flto=thin -o second_link \
>>     first_link.0.thinlto.o first_link.1.thinlto.o
>> $ ./second_link
>> Registering B.cpp
>> $
>>
>> Same result, great!
>> Now what about packaging the second object (corresponding to B.o) in a
>> library and performing the second link:
>>
>> $ ar -rs B.after_thinlto.a  first_link.1.thinlto.o
>> $ clang -std=c++11 -flto=thin -o second_link_with_archive \
>>     first_link.0.thinlto.o B.after_thinlto.a
>> $ ./second_link_with_archive
>> $
>>
>> Too bad, we broke the registration.
>> Because initB() was inlined into main, we lost the reference to B.o and
>> it is not loaded anymore.
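
The archive behaviour described above can be modeled with a toy resolver. This is only a sketch under simplifying assumptions, not how gold or ld64 is implemented: each object is reduced to the symbols it defines and references, libc symbols are ignored, and the names Object, resolve, first_link and second_link are made up for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model: an object file is a name plus defined and referenced symbols.
struct Object {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

// Plain objects are always loaded. An archive member is loaded only if it
// defines a symbol that is referenced but not yet defined; repeat until no
// further member is pulled in.
std::vector<std::string> resolve(const std::vector<Object>& objects,
                                 const std::vector<Object>& archive) {
    std::vector<std::string> loaded;
    std::set<std::string> defined, wanted, taken;
    auto load = [&](const Object& o) {
        loaded.push_back(o.name);
        defined.insert(o.defined.begin(), o.defined.end());
        wanted.insert(o.undefined.begin(), o.undefined.end());
    };
    for (const Object& o : objects) load(o);

    bool changed = true;
    while (changed) {
        changed = false;
        for (const Object& m : archive) {
            if (taken.count(m.name)) continue;
            for (const std::string& s : m.defined) {
                if (wanted.count(s) && !defined.count(s)) {
                    taken.insert(m.name);
                    load(m);
                    changed = true;
                    break;
                }
            }
        }
    }
    return loaded;
}

// First link: main() in A.o still references initB(), B.o sits in B.a.
std::vector<std::string> first_link() {
    Object a{"A.o", {"main", "registration"}, {"initB"}};
    Object b{"B.o", {"initB"}, {"registration"}};
    return resolve({a}, {b});
}

// Second link: initB() was inlined into main(), so the reference is gone.
std::vector<std::string> second_link(bool b_in_archive) {
    Object a{"A.thinlto.o", {"main", "registration"}, {}};
    Object b{"B.thinlto.o", {"initB"}, {"registration"}};
    if (b_in_archive) return resolve({a}, {b});
    return resolve({a, b}, {});
}
```

In this model first_link() loads both A.o and B.o, second_link(false) loads both post-ThinLTO objects, and second_link(true) loads only A.thinlto.o, so the static initializer that performs the registration is never linked in.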
>>
>>
>>
>>
>> The fact is that there is one single source of truth as you said -- a
>> program can be linked from different sets of object files at different
>> optimization levels (e.g., O0 and O2 differ due to inlining differences).
>>
>> Of course, to enable link resolution in the second link, some of the
>> first link's decisions need to be undone -- Teresa's patch is essentially
>> doing that.
>>
>>
>>
>>> Also, the distributed build system probably needs to handle the case
>>> where an object in the archive was not selected to be part of the link at
>>> all, won't be processed by ThinLTO, and there won't be any object to pass
>>> to the final link. I'm not sure how you're handling this with gold right
>>> now though.
>>>
>>
>> No native object file will be generated for such an object in this case,
>> so it has no use for the final link?
>>
>>
>> The point I was trying to address is that somehow the build system has to
>> adapt the second linker invocation and can’t just reuse the exact same
>> invocation as the first one.
>>
>> Mehdi
>>
>
>


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413