[PATCH] D23680: [ThinLTO] Emit files for distributed builds for all modules
Teresa Johnson via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 20 11:56:25 PDT 2016
tejohnson added inline comments.
================
Comment at: tools/gold/gold-plugin.cpp:755
@@ -754,1 +754,3 @@
+// Write empty files that may be expected by a distributed build
+// system when invoked with thinlto_index_only. This is invoked when
----------------
tejohnson wrote:
> mehdi_amini wrote:
> > pcc wrote:
> > > tejohnson wrote:
> > > > tejohnson wrote:
> > > > > mehdi_amini wrote:
> > > > > > pcc wrote:
> > > > > > > tejohnson wrote:
> > > > > > > > mehdi_amini wrote:
> > > > > > > > > tejohnson wrote:
> > > > > > > > > > pcc wrote:
> > > > > > > > > > > I am not sure if this is needed, but even if it is, can't this be done inside lib/LTO?
> > > > > > > > > > This is needed because the distributed build system (in this case Bazel) wants to check that its expected outputs were generated. It doesn't know what the linker decided to include in the final link or not. This was a regression from the earlier behavior before we moved to the new LTO API.
> > > > > > > > > >
> > > > > > > > > > My earlier version of the patch had this fix in lib/LTO, but Mehdi expressed a preference (in an offline discussion) to move this into the client.
> > > > > > > > > Right: it seems "client specific" as it is a constraint that comes from Bazel.
> > > > > > > > Although note that Bazel is independent of gold. It just happens that we build with using the gold linker and the Bazel distributed build system.
> > > > > > > When you say "It doesn't know what the linker decided to include" are you referring to which archive members the linker chose?
> > > > > > >
> > > > > > > So the regression is that the distributed build is missing non-ThinLTO objects from archives, and the fix is to create empty .thinlto.bc files to cause the build system to include the non-ThinLTO objects in the final link?
> > > > > > >
> > > > > > > Or am I misunderstanding?
> > > > > > > Although note that Bazel is independent of gold. It just happens that we build with using the gold linker and the Bazel distributed build system.
> > > > > >
> > > > > > Right, I'm not convinced it belongs here entirely :)
> > > > > >
> > > > > > My understanding is that it is very targeted: this operates under the assumptions that:
> > > > > >
> > > > > > 1) The build system has a list of all the input to the linker (including archive members)
> > > > > > 2) The build system expects the linker to produces 1-1 output for each input.
> > > > > >
> > > > > > So it is about both archive members not chosen by the linker, and input files that are not ThinLTO.
> > > > > > I don't think the latter case is handled though, it seems to be relying on every input being built with `-flto=thin` (or I missed a piece somewhere, how would an object file be handled?)
> > > > > Correct, it is to deal with the fact that with archives (--start-lib/--end-lib in our case), gold will not include some of the files in the final link. Those are not passed to the backend and no ThinLTO outputs are created for them.
> > > > >
> > > > > We do emit the list of files to include in the final link (into the thinlto_linked_objects_file, which is an optional file provided to the thinlto-index-only= option), and that ends up informing the final link done by Bazel (essentially passed in via the @ linker option).
> > > > >
> > > > > However, the issue here is with the way Bazel does sanity checking of expected outputs - all it knows when it kicks off the ThinLink action is that it is passing a set of object (IR) files to the Thin Link action, and each action needs to be told what output files it should bring back. The lower level job handling checks that these expected outputs are created and fetches them back. With the current LTO code, that contract is being broken, because not all expected outputs are being created.
> > > > > So it is about both archive members not chosen by the linker, and input files that are not ThinLTO.
> > > > > I don't think the latter case is handled though, it seems to be relying on every input being built with -flto=thin
> > > > > (or I missed a piece somewhere, how would an object file be handled?)
> > > >
> > > > We know which compile actions are ThinLTO actions and set up the expected inputs and outputs that way. In cases where there is e.g. a native object, there isn't a ThinLTO output scheduled to be returned from the ThinLink. So the part of this patch that always generates a summary when compiling -flto=thin fixes the issue where we were not treating these as ThinLTO inputs in the ThinLink.
> > > >
> > > > For the archive members not included in the final link. the issue is that when we set up the ThinLink action, we know which inputs are ThinLTO inputs, and need to specify which outputs should be expected and brought back. That has no knowledge of the linker behavior and which inputs it will ultimately select.
> > > I see. So my next question is, if Bazel knows the names of the expected output files, can it (or some other component of the system such as a wrapper script) not be taught to touch those files by itself before running the initial linker pass?
> > >
> > > I can appreciate if the answer is "not easily". I'm just trying to get a sense of the right tradeoff here.
> > So if I have a project that depends on a vendor supplied library, it is up to the person configuring the project to instruct Bazel about the ThinLTO member inside this archive?
> >
> > So if I have a project that depends on a vendor supplied library, it is up to the person configuring the project to instruct Bazel about the ThinLTO member inside this archive?
>
> There are a few limitations that would currently prevent this from working well with the build system right now. That use case would require some more work beyond this, at least to get the backend job distributed properly. We currently expect to compile all ThinLTO code from source.
>
>
> I can appreciate if the answer is "not easily". I'm just trying to get a sense of the right tradeoff here.
I don't think this is easy given the way things are architected. But it also prevents error checking that we did in fact run the thin link and get outputs.
https://reviews.llvm.org/D23680
More information about the llvm-commits
mailing list