[PATCH] D55080: [ThinLTO] Out-of-process CodeGenerator for legacy C API

Mon Dec 17 09:16:35 PST 2018

tejohnson added a comment.

In D55080#1329591 <https://reviews.llvm.org/D55080#1329591>, @dang wrote:

> In D55080#1328603 <https://reviews.llvm.org/D55080#1328603>, @steven_wu wrote:
>
> > Thanks for taking a look. This patch adds customization points to the ThinLTO legacy API, through which the code generator constructs clang invocations to do code generation. There is no dependency on any build system here; it only has a proof-of-concept codegen manager that invokes clang directly and collects the results. You can replace this codegen manager with any protocol needed to talk to a build system to run clang codegen.
> >  XPC is the way to send information between processes on Darwin, which is probably what we are going to use to talk to the build system. If there is interest, I can post a patch with an example of how to construct XPC communications, but there isn’t yet a build system you can use to listen on the other side and run the job.
> >  When I say there is no code change for the linker, I really mean there is no need to change a single line of code (maybe we need to add an API to select the codegen manager in the future). ld64 really takes a different approach with the C API, in which it tries to map the object file output back to the bitcode it gets as input. Terminating and relaunching the linker might have unexpected semantic changes for LTO. In the long run, maybe ld64 needs a new set of APIs that use the new C++ APIs, but this is out of scope for this patch.
>
>
> To reiterate @steven_wu's point, the aim of this patch is to hide the mechanics of out-of-process code generation from ld64, because we can't exit the linker early as with gold. This is because ld64 tries to remap symbol information deduced from the bitcode onto whatever it gets back after ThinLTO code generation. These customisation points are very similar to what a ThinBackendProc is in the new LTO API. In this case LocalProcessCodeGenManager is like a ThinBackendProc that emits sliced indices, constructs a clang invocation that performs the codegen, and then manages communicating the outputs to the linker via a callback.

Thanks for the clarifications. Would it be possible to use ThinBackendProc for this instead of a new CodeGenManager class? I.e., make a new derived version that writes the index files and spawns the local processes? The advantage is that it would start converging the two implementations, and I think it would also help with the refactoring suggested below to avoid duplication. Another advantage is that both LTO APIs would have access to all backend implementations (in-process, write indexes and exit, write indexes and use local processes, etc.).
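
To make the suggestion concrete, here is a rough sketch of what such a derived backend might look like. BackendProc and LocalProcessBackend are hypothetical stand-ins, not the real ThinBackendProc interface, and the output file naming is an assumption. The sketch only builds the clang command line for one module; a real backend would also write the sliced index and spawn the process.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the role ThinBackendProc plays in the new LTO
// API. The real class has a different interface.
class BackendProc {
public:
  virtual ~BackendProc() = default;
  // Handle one module and return the command line that would be executed.
  virtual std::vector<std::string> start(const std::string &ModulePath) = 0;
};

// Hypothetical derived backend: one clang invocation per module.
class LocalProcessBackend : public BackendProc {
  std::string ClangPath;

public:
  explicit LocalProcessBackend(std::string Clang)
      : ClangPath(std::move(Clang)) {}

  std::vector<std::string> start(const std::string &ModulePath) override {
    assert(!ModulePath.empty() && "module path must not be empty");
    // Assumed naming scheme for the per-module index and object files.
    std::string IndexPath = ModulePath + ".thinlto.bc";
    std::string ObjectPath = ModulePath + ".o";
    // A real backend would write the sliced index to IndexPath here and
    // then spawn the process; this sketch only returns the arguments.
    return {ClangPath, "-c", "-x", "ir", ModulePath,
            "-fthinlto-index=" + IndexPath, "-o", ObjectPath};
  }
};
```

With something along these lines, calling start("a.bc") on a LocalProcessBackend constructed with "clang" would yield the arguments for a single out-of-process backend job, and an in-process backend could implement the same interface without spawning anything.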

================
Comment at: lib/LTO/ThinLTOOutOfProcessCodeGenerator.cpp:182
+// Main entry point for the ThinLTO processing
+void ThinLTOOutOfProcessCodeGenerator::run() {
+  LLVM_DEBUG(
----------------
There's a huge amount of code duplication between this and the base ThinLTOCodeGenerator::run(). Perhaps ThinLTOCodeGenerator can be refactored to use a CodeGenManager, with an in-process threaded version of CodeGenManager, so that both can share the same base run() method and the customization points live in the CodeGenManager virtual methods. Or, even better, refactor to use ThinBackendProc (see comment above)?
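
As an illustration only, the shape of that refactoring could be roughly the following. All class names here are hypothetical and the codegen step is reduced to producing an output name; the point is just that run() is written once and the in-process and out-of-process behavior differ only in the virtual method.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical customization point: how one module gets turned into an object.
class CodeGenManagerSketch {
public:
  virtual ~CodeGenManagerSketch() = default;
  virtual std::string codegen(const std::string &Module) = 0;
};

// Stand-in for running the backend on a thread inside the linker process.
class InProcessManager : public CodeGenManagerSketch {
public:
  std::string codegen(const std::string &Module) override {
    return "inproc:" + Module + ".o";
  }
};

// Stand-in for writing an index and invoking clang in another process.
class OutOfProcessManager : public CodeGenManagerSketch {
public:
  std::string codegen(const std::string &Module) override {
    return "clang:" + Module + ".o";
  }
};

// Single shared driver: the loop lives here once, for both variants.
class GeneratorSketch {
  CodeGenManagerSketch &Manager;
  std::vector<std::string> Modules;

public:
  GeneratorSketch(CodeGenManagerSketch &M, std::vector<std::string> Mods)
      : Manager(M), Modules(std::move(Mods)) {}

  std::vector<std::string> run() {
    std::vector<std::string> Objects;
    for (const std::string &M : Modules)
      Objects.push_back(Manager.codegen(M));
    return Objects;
  }
};
```

Swapping InProcessManager for OutOfProcessManager changes what each module produces without touching run(), which is the duplication this comment is asking to remove.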

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55080/new/

https://reviews.llvm.org/D55080