[llvm-dev] Running distributed thinLTO without thin archives.

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 28 12:51:03 PDT 2019

Hi Tanoy,

Sorry for the slow response. I haven't thought through what would need to
be done here very closely, but here are a couple of thoughts.
The module identifier for each constituent object would need to be both
unique (we currently generate it from the name of the archive plus the
offset in the archive plus the name of the source file, IIRC) and a correct
path to the extracted bitcode object used in the post-thinlink backend
invocation. That is needed so that the thin link can write out each
distributed index file with a filename that gets consumed by the associated
backend invocation (passed to -fthinlto-index=), and so that the module
paths emitted in those index files correctly identify where we can import
functions from. Since the bitcode objects need to be extracted for the
corresponding backend clang invocations, a couple of possibilities come to
mind:
1) Do it outside the compiler/linker: wrap the whole thing in a script that
does the extraction, invokes the link with extracted constituents
surrounded by --start-lib/--end-lib pairs, invokes each backend through some
parallel or distributed mechanism, and then invokes the final link; or
2) Add support to pass some kind of mapping file into LTO that maps from
each archive constituent to the extracted filename including path that the
corresponding ThinLTO backend clang invocation will use, and have LTO set
the module identifiers accordingly so that everything "just works" (in
theory). We already support some munging of these names (see
the thinlto-object-suffix-replace plugin option in either gold-plugin.cpp
or in lld), but what you need here is a bit more complicated than a simple
suffix change. But since there is already support for adjusting the name,
it might not be too bad to add this support.
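
A minimal dry-run sketch of option 1, under some assumptions: it only prints
the commands for each stage rather than running them, the clang flags are
copied from the commands quoted further down in this thread, and the names
plan_distributed_thinlto, extracted/, native/, native.objects, a.o and b.o
are hypothetical. It also assumes the archive sits in the current directory
and that every member has a unique name; an archive whose members share a
name (as lib/lib.o and src/lib.o likely do once archived) would need extra
handling during extraction.

```shell
#!/bin/sh
# Dry-run sketch of a wrapper for distributed ThinLTO with a regular archive.
# Nothing is executed: each stage is printed so it can be handed to a build
# system or a distributed executor. All names here are illustrative.

plan_distributed_thinlto() {
    archive=$1; main_obj=$2; shift 2   # remaining arguments: member names
    members=""
    for m in "$@"; do members="$members extracted/$m"; done

    # 1. Extract the members so each bitcode object exists as a real file.
    echo "mkdir -p extracted native/extracted"
    echo "(cd extracted && llvm-ar x ../$archive)"

    # 2. Thin link: the extracted members stand in for the archive.
    echo "clang++ -flto=thin -o index -O3 -Wl,-plugin-opt,thinlto-index-only=thinlto.objects -Wl,-plugin-opt,thinlto-emit-imports-files $main_obj --start-lib$members --end-lib"

    # 3. One backend compile per bitcode file. These are independent of each
    #    other, so they can run in parallel or on remote machines.
    #    ($members is deliberately unquoted: names are assumed space-free.)
    for obj in "$main_obj" $members; do
        echo "clang++ -c -x ir $obj -O3 -flto=thin -o native/$obj -fthinlto-index=$obj.thinlto.bc"
    done

    # 4. Final link: reuse the linker's file selection and order, with each
    #    bitcode name rewritten to the matching native object under native/.
    echo "sed 's|^|native/|' thinlto.objects > native.objects"
    echo "clang++ -o thinlto @native.objects"
}

plan_distributed_thinlto lib.a main.o a.o b.o
```

Any inputs to the thin link that were already native would still have to be
added to the final link separately, since thinlto.objects only lists the
files that were originally bitcode.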

Hope that helps,
Teresa

On Tue, Jun 18, 2019 at 2:27 PM Tanoy Sinha <tsinha at gmail.com> wrote:

> Thanks very much!
> One more question:
> If I wanted to implement archive support for distributed ThinLTO, what all
> would I need to do?
> I know I need to pull out the bitcode module during the optimizer step,
> which I've looked at.
> What else would I need to change?
> On Tue, Jun 18, 2019 at 4:58 PM Teresa Johnson <tejohnson at google.com>
> wrote:
>> On Tue, Jun 18, 2019 at 1:40 PM Tanoy Sinha <tsinha at gmail.com> wrote:
>>> Thanks!
>>> Question about the final link step:
>>> Do I provide all the object files to the link step, i.e. something like:
>>> clang++ -o thinlto main-native.o lib/lib-native.o src/lib-native.o
>>> Do I need to provide --start-lib markers on that final link step as well?
>> No and No. After the thin link the linker has already done its symbol
>> resolution, and using --start-lib/--end-lib in the final link can muck with
>> that. However, the list of files the linker selected in the right order is
>> emitted in the argument given to thinlto-index-only (thinlto.objects in
>> your case below). You can pass that to the native link via the "@" option:
>>   clang++ -o thinlto @thinlto.objects
>> However, you need to deal with the fact that this file contains the
>> original bitcode names (not the names you gave it in your backend step,
>> like main-native.o, etc.).
>> There are 2 options for correcting the names:
>> 1) Manually rename in thinlto.objects
>> 2) Use the thinlto-prefix-replace=oldprefix;newprefix plugin option to
>> replace the old path prefix of the input bitcode files with a new path
>> prefix. In your case the old prefix is "", so you could do something like
>> "-Wl,-plugin-opt,thinlto-prefix-replace=;native/". This should do two
>> things: (a) the generated .thinlto.bc index files and the .imports files
>> will be put under a "native/" subdirectory; (b) the paths in
>> thinlto.objects should also have the "native/" prefix. If you use that
>> prefix in your LTO backend clang invocations (e.g. -o native/main.o
>> instead of main-native.o), then thinlto.objects can just be passed
>> directly to the final link via "@" without any modification.
>> Note that if your thin link included any already-native files/libraries,
>> those still need to be passed to the final link explicitly, since
>> thinlto.objects only includes the files that were originally bitcode.
>> Teresa
>>> Tanoy
>>> On Tue, Jun 18, 2019 at 10:37 AM Teresa Johnson <tejohnson at google.com>
>>> wrote:
>>>> Hi Tanoy,
>>>> You can't use distributed ThinLTO with archives (thin or not), at least
>>>> not today. The reason is that we need to be able to identify specific
>>>> bitcode object files to import from in the backends, and that logic does
>>>> not know how to deal with objects within archives. We do distributed
>>>> ThinLTO in our builds but don't use .a files, rather, we use
>>>> --start-lib/--end-lib around the files that would be in the same archive
>>>> when performing the thin link. I.e. if you change your thin link to be:
>>>>   clang++ -flto=thin -o index -O3 -Wl,-plugin-opt,thinlto-index-only=thinlto.objects -Wl,-plugin-opt,thinlto-emit-imports-files main.o --start-lib lib/lib.o src/lib.o --end-lib
>>>> things should work.
>>>> Note you also need to do the ThinLTO backend compile for each of the
>>>> archive constituents anyway, e.g. something like:
>>>>   clang++ -c -x ir lib/lib.o -O3 -flto=thin -o lib/lib-native.o -fthinlto-index=lib/lib.o.thinlto.bc
>>>> etc
>>>> HTH,
>>>> Teresa
>>>> On Mon, Jun 17, 2019 at 2:46 PM Tanoy Sinha via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> I'm trying to run distributed ThinLTO without thin archives.
>>>>> When I do, I get an error in the optimizer when clang tries to open a
>>>>> nonexistent file:
>>>>>   clang++ -flto=thin -Xclang -fno-lto-unit -O3 -c main.cpp -o main.o
>>>>>   clang++ -flto=thin -Xclang -fno-lto-unit -O3 -c lib/lib.cpp -o lib/lib.o
>>>>>   clang++ -flto=thin -Xclang -fno-lto-unit -O3 -c src/lib.cpp -o src/lib.o
>>>>>   llvm-ar -format gnu qcs lib.a lib/lib.o src/lib.o
>>>>>   clang++ -flto=thin -o index -O3 -Wl,-plugin-opt,thinlto-index-only=thinlto.objects -Wl,-plugin-opt,thinlto-emit-imports-files main.o lib.a
>>>>>   clang++ -c -x ir main.o -O3 -flto=thin -o main-native.o -fthinlto-index=main.o.thinlto.bc
>>>>>   Error loading imported file 'lib.a.llvm.2596.lib.cpp': No such file or directory
>>>>> In this case, gold has registered the modules within my archive with
>>>>> ThinLTO.
>>>>> The string "lib.a.llvm.2596.lib.cpp" is generated with the archive in
>>>>> question, plus an offset indicating where in the archive the particular
>>>>> object file is.
>>>>> Unfortunately, when the optimizer tries to include the proper modules,
>>>>> it's naively looking for a bitcode file with the name of the string
>>>>> provided, but there's obviously no "lib.a.llvm.2596.lib.cpp" for it to open.
>>>>> Has anyone else tried to get clang to understand distributed ThinLTO
>>>>> when using non-thin archives?
>>>>> Is there some way to get clang to understand these out of the box?
>>>>> I'm actually a little confused about the ".cpp" in
>>>>> "lib.a.llvm.2596.lib.cpp".
>>>>> Seems like it should be a ".o"?
>>>>> It didn't seem like there was anything out of the box that supported
>>>>> this.
>>>>> I was looking at having clang actually read in the archive file and
>>>>> register the correct bitcode module.
>>>>> I wanted to run it by the list to get some second opinions before I
>>>>> started that.
>>>> --
>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |

Teresa Johnson |  Software Engineer |  tejohnson at google.com |