<div dir="ltr"><br><br><div class="gmail_quote">On Thu, May 14, 2015 at 2:31 PM Teresa Johnson <<a href="mailto:tejohnson@google.com">tejohnson@google.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, May 14, 2015 at 2:09 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>> wrote:<br>

><br>

><br>

> On Thu, May 14, 2015 at 1:35 PM Teresa Johnson <<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>> wrote:<br>

>><br>

>> On Thu, May 14, 2015 at 1:18 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>

>> wrote:<br>

>> ><br>

>> ><br>

>> > On Thu, May 14, 2015 at 1:11 PM David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>><br>

>> > wrote:<br>

>> >><br>

>> >> On Thu, May 14, 2015 at 12:53 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>

>> >> wrote:<br>

>> >>><br>

>> >>><br>

>> >>><br>

>> >>> On Thu, May 14, 2015 at 11:34 AM Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br>

>> >>> wrote:<br>

>> >>>><br>

>> >>>> On Thu, May 14, 2015 at 11:14 AM, Eric Christopher<br>

>> >>>> <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>

>> >>>> wrote:<br>

>> >>>> > I'm not sure this is a particularly great assumption to make.<br>

>> >>>><br>

>> >>>> Which part?<br>

>> >>><br>

>> >>><br>

>> >>> The binutils part :)<br>

>> >>><br>

>> >>>><br>

>> >>>><br>

>> >>>> >  We have to<br>

>> >>>> > support a lot of different build systems and tools and<br>

>> >>>> > concentrating<br>

>> >>>> > on<br>

>> >>>> > something that just binutils uses isn't particularly friendly here.<br>

>> >>>> I think you may have misunderstood.<br>

>> >>>> His point was exactly that they want to be transparent to *all of*<br>

>> >>>> these<br>

>> >>>> tools.<br>

>> >>>> You are saying "we should be friendly to everyone". He is saying the<br>

>> >>>> same thing.<br>

>> >>>> We should be friendly to everyone. The friendly way to do this is to<br>

>> >>>> not require building plugins for all of these tools to handle bitcode.<br>

>> >>>><br>

>> >>>> Hence, elf-wrapped bitcode.<br>

>> >>><br>

>> >>><br>

>> >>> Oh, I understood. I just don't know that I agree. To do anything with<br>

>> >>> the<br>

>> >>> tools will require some knowledge of bitcode anyhow or need the<br>

>> >>> plugin. I'm<br>

>> >>> saying that as a baseline start we should look at how to do this using<br>

>> >>> the<br>

>> >>> tools we've got rather than wrapping things for no real gain.<br>

>> >><br>

>> >><br>

>> >> That doesn't seem strictly true - the ar situation (which I'm led to<br>

>> >> believe is in use in our build system & others, one would assume). With<br>

>> >> the<br>

>> >> symbol table included as proposed, ar can be used without any knowledge<br>

>> >> of<br>

>> >> the bitcode or need for a plugin.<br>

>> >><br>

>> ><br>

>> > For some bits, sure. Optimizing for ar seems a bit silly, why not 'ld<br>

>> > -r'?<br>

>><br>

>> But as mentioned, ld -r can work on native object wrapped bitcode<br>

>> without a plugin as well.<br>

>><br>

><br>

> How? It's not like any partial linking is going to go on inside the bitcode<br>

> if the linker doesn't understand bitcode.<br>

<br>

It allows us to delay the actual linking until the full link step,<br>

thereby enabling ThinLTO on those modules.<br>

<br>

As we discussed offline, the current ld -r behavior with the plugin is<br>

to compile all the way down to machine code. The alternative if we use<br>

straight bitcode is to tell the plugin to stop early after combining<br>

the bitcode and emit bitcode back out, with the thinlto function info<br>

also combined.<br>

<br></blockquote><div><br></div><div>I think this is what should happen anyhow. ld -r that doesn't do a partial link is misleading.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

> Right. I'm not entirely sure what use we're going to see in the existing<br>

> tools that we want to encompass here. There's some of it for convenience<br>

> (i.e. nm etc for developers), but they can use a tool that understands<br>

> bitcode and we can make the existing llvm tools suffice for these needs.<br>

<br>

My understanding from our discussion is that the llvm versions of<br>

those tools do not accept native object files, so that is not<br>

something that will work in the short term.<br>

<br></blockquote><div><br></div><div>We have tools that do understand native object files and it's pretty easy to use the libraries that they're built upon.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The best alternative to native wrapped bitcode seems to be relying on<br>

the plugin (and changing its behavior for ld -r). Which means that we lose the<br>

ability to use some of the tools from the native toolchain out of the<br>

box, and build systems such as ours have to be taught to use the plugin.<br>

<br></blockquote><div><br></div><div>I don't think that natively wrapped bitcode gets you as much as you think it does anyhow, unless you're duplicating a lot of information (ar, as discussed earlier, aside). I'm not too worried about the build system as far as a wrapping mechanism and I think more traditional LTO schemes with LLVM have just used bitcode/IR output as an input to the LTO link step. I think what we're talking about here is the best way to encode the data that thin lto needs/wants in order to handle summary information etc right?</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

><br>

> I think the way of looking at this is that we can:<br>

><br>

> a) go with wrapping things in native object formats, this means<br>

>  - some tools continue to work at the cost of additional I/O and space at<br>

> compile/link time<br>

>  - we still have to update some tools to work at all<br>

><br>

> b) we extend those tools/our own tools and have them be drop in replacements<br>

> to the existing tools. They'll understand the bitcode format natively,<br>

> they'll be smaller, and we'll be able to push the state of the art in<br>

> tooling/analysis a bit more in the future without having to rework thin lto.<br>

><br>

> It's basically a set of trade-offs and for llvm we've historically gone the<br>

> b direction.<br>

><br>

>> ><br>

>> > At any rate, I think this aspect of the proposal needs a bit of<br>

>> > discussion<br>

>> > and some mapping out of the pros and cons here.<br>

>><br>

>> Sure, we can continue to discuss and I will try to lay out the pros/cons.<br>

><br>

><br>

> Excellent.<br>

><br>

> -eric<br>

><br>

>><br>

>><br>

>> Teresa<br>

>><br>

>> ><br>

>> > -eric<br>

>> ><br>

>> >>><br>

>> >>> I've talked to Teresa a bit offline and we're going to talk more later<br>

>> >>> (and discuss on the list), but there are some discussions about how to<br>

>> >>> make<br>

>> >>> this work either with just bitcode/llvm tools and so not requiring<br>

>> >>> integration on all platforms. The latter is what I consider as<br>

>> >>> particularly<br>

>> >>> friendly :)<br>

>> >>><br>

>> >>> -eric<br>

>> >>><br>

>> >>>><br>

>> >>>><br>

>> >>>><br>

>> >>>> > I also<br>

>> >>>> > can't imagine how it's necessary for any of the lto aspects as<br>

>> >>>> > currently<br>

>> >>>> > written in the proposal.<br>

>> >>>> ><br>

>> >>>> > -eric<br>

>> >>>> ><br>

>> >>>> > On Thu, May 14, 2015 at 9:26 AM Xinliang David Li<br>

>> >>>> > <<a href="mailto:xinliangli@gmail.com" target="_blank">xinliangli@gmail.com</a>><br>

>> >>>> > wrote:<br>

>> >>>> >><br>

>> >>>> >> The design objective is to make thinLTO mostly transparent to<br>

>> >>>> >> binutils<br>

>> >>>> >> tools to enable easy integration with any build system in the<br>

>> >>>> >> wild.<br>

>> >>>> >> 'Pass-through' mode with 'ld -r' instead of the partial LTO mode<br>

>> >>>> >> is<br>

>> >>>> >> another<br>

>> >>>> >> reason.<br>

>> >>>> >><br>

>> >>>> >> David<br>

>> >>>> >><br>

>> >>>> >> On Thu, May 14, 2015 at 7:30 AM, Teresa Johnson<br>

>> >>>> >> <<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>><br>

>> >>>> >> wrote:<br>

>> >>>> >>><br>

>> >>>> >>> On Thu, May 14, 2015 at 7:22 AM, Eric Christopher<br>

>> >>>> >>> <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>

>> >>>> >>> wrote:<br>

>> >>>> >>> > So, what Alex is saying is that we have these tools as well and<br>

>> >>>> >>> > they<br>

>> >>>> >>> > understand bitcode just fine, as well as every object format -<br>

>> >>>> >>> > not<br>

>> >>>> >>> > just<br>

>> >>>> >>> > ELF.<br>

>> >>>> >>> > :)<br>

>> >>>> >>><br>

>> >>>> >>> Right, there are also LLVM specific versions (llvm-ar, llvm-nm)<br>

>> >>>> >>> that<br>

>> >>>> >>> handle bitcode similarly to the way the standard tool + plugin<br>

>> >>>> >>> does.<br>

>> >>>> >>> But the goal we are trying to achieve is to allow the standard<br>

>> >>>> >>> system<br>

>> >>>> >>> versions of the tools to handle these files without requiring a<br>

>> >>>> >>> plugin. I know the LLVM tool handles other object formats, but<br>

>> >>>> >>> I'm<br>

>> >>>> >>> not<br>

>> >>>> >>> sure how that helps here? We're not planning to replace those<br>

>> >>>> >>> tools,<br>

>> >>>> >>> just allow the standard system versions to handle the<br>

>> >>>> >>> intermediate<br>

>> >>>> >>> objects produced by ThinLTO.<br>

>> >>>> >>><br>

>> >>>> >>> Thanks,<br>

>> >>>> >>> Teresa<br>

>> >>>> >>><br>

>> >>>> >>> ><br>

>> >>>> >>> > -eric<br>

>> >>>> >>> ><br>

>> >>>> >>> ><br>

>> >>>> >>> > On Thu, May 14, 2015, 6:55 AM Teresa Johnson<br>

>> >>>> >>> > <<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>><br>

>> >>>> >>> > wrote:<br>

>> >>>> >>> >><br>

>> >>>> >>> >> On Wed, May 13, 2015 at 11:23 PM, Xinliang David Li<br>

>> >>>> >>> >> <<a href="mailto:xinliangli@gmail.com" target="_blank">xinliangli@gmail.com</a>> wrote:<br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> > On Wed, May 13, 2015 at 10:46 PM, Alex Rosenberg<br>

>> >>>> >>> >> > <<a href="mailto:alexr@leftfield.org" target="_blank">alexr@leftfield.org</a>><br>

>> >>>> >>> >> > wrote:<br>

>> >>>> >>> >> >><br>

>> >>>> >>> >> >> "ELF-wrapped bitcode" seems potentially controversial to<br>

>> >>>> >>> >> >> me.<br>

>> >>>> >>> >> >><br>

>> >>>> >>> >> >> What about ar, nm, and various ld implementations adds this<br>

>> >>>> >>> >> >> requirement?<br>

>> >>>> >>> >> >> What about the LLVM implementations of these tools is<br>

>> >>>> >>> >> >> lacking?<br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> > Sorry I can not parse your questions properly. Can you make<br>

>> >>>> >>> >> > it<br>

>> >>>> >>> >> > clearer?<br>

>> >>>> >>> >><br>

>> >>>> >>> >> Alex is asking what the issue is with ar, nm, ld -r and<br>

>> >>>> >>> >> regular<br>

>> >>>> >>> >> bitcode that makes using elf-wrapped bitcode easier.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> The issue is that generally you need to provide a plugin to<br>

>> >>>> >>> >> these<br>

>> >>>> >>> >> tools in order for them to understand and handle bitcode<br>

>> >>>> >>> >> files.<br>

>> >>>> >>> >> We'd<br>

>> >>>> >>> >> like standard tools to work without requiring a plugin as much<br>

>> >>>> >>> >> as<br>

>> >>>> >>> >> possible. And in some cases we want them to be handled<br>

>> >>>> >>> >> differently<br>

>> >>>> >>> >> than<br>

>> >>>> >>> >> the way bitcode files are handled with the plugin.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> nm: Without a plugin, normal bitcode files are inscrutable.<br>

>> >>>> >>> >> When<br>

>> >>>> >>> >> provided the gold plugin it can emit the symbols.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> ar: Without a plugin, it will create an archive of bitcode<br>

>> >>>> >>> >> files,<br>

>> >>>> >>> >> but<br>

>> >>>> >>> >> without an index, so it can't be handled by the linker even<br>

>> >>>> >>> >> with<br>

>> >>>> >>> >> a<br>

>> >>>> >>> >> plugin on an -flto link. When ar is provided the gold plugin<br>

>> >>>> >>> >> it<br>

>> >>>> >>> >> does<br>

>> >>>> >>> >> create an index, so the linker + gold plugin handle it<br>

>> >>>> >>> >> appropriately<br>

>> >>>> >>> >> on an -flto link.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> ld -r: Without a plugin, fails when provided bitcode inputs.<br>

>> >>>> >>> >> When<br>

>> >>>> >>> >> provided the gold plugin, it handles them but compiles them<br>

>> >>>> >>> >> all<br>

>> >>>> >>> >> the<br>

>> >>>> >>> >> way through to ELF executable instructions via a partial LTO<br>

>> >>>> >>> >> link.<br>

>> >>>> >>> >> This is where we would like to differ in behavior (while also<br>

>> >>>> >>> >> not<br>

>> >>>> >>> >> requiring a plugin) with ELF-wrapped bitcode: we would like<br>

>> >>>> >>> >> the<br>

>> >>>> >>> >> ld -r<br>

>> >>>> >>> >> output file to still contain ELF-wrapped bitcode, delaying the<br>

>> >>>> >>> >> LTO<br>

>> >>>> >>> >> until the full link step.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> Let me know if that helps address your concerns.<br>

>> >>>> >>> >><br>

>> >>>> >>> >> Thanks,<br>

>> >>>> >>> >> Teresa<br>

>> >>>> >>> >><br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> > David<br>

>> >>>> >>> >> ><br>

>> >>>> >>> >> >><br>

>> >>>> >>> >> >><br>

>> >>>> >>> >> >> Alex<br>

>> >>>> >>> >> >><br>

>> >>>> >>> >> >> > On May 13, 2015, at 7:44 PM, Teresa Johnson<br>

>> >>>> >>> >> >> > <<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>><br>

>> >>>> >>> >> >> > wrote:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > I've included below an RFC for implementing ThinLTO in<br>

>> >>>> >>> >> >> > LLVM,<br>

>> >>>> >>> >> >> > looking<br>

>> >>>> >>> >> >> > forward to feedback and questions.<br>

>> >>>> >>> >> >> > Thanks!<br>

>> >>>> >>> >> >> > Teresa<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > RFC to discuss plans for implementing ThinLTO upstream.<br>

>> >>>> >>> >> >> > Background<br>

>> >>>> >>> >> >> > can<br>

>> >>>> >>> >> >> > be found in slides from EuroLLVM 2015:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > <a href="https://drive.google.com/open?id=0B036uwnWM6RWWER1ZEl5SUNENjQ&authuser=0" target="_blank">https://drive.google.com/open?id=0B036uwnWM6RWWER1ZEl5SUNENjQ&authuser=0</a><br>

>> >>>> >>> >> >> > As described in the talk, we have a prototype<br>

>> >>>> >>> >> >> > implementation, and<br>

>> >>>> >>> >> >> > would like to start staging patches upstream. This RFC<br>

>> >>>> >>> >> >> > describes<br>

>> >>>> >>> >> >> > a<br>

>> >>>> >>> >> >> > breakdown of the major pieces. We would like to commit<br>

>> >>>> >>> >> >> > upstream<br>

>> >>>> >>> >> >> > gradually in several stages, with all functionality off<br>

>> >>>> >>> >> >> > by<br>

>> >>>> >>> >> >> > default.<br>

>> >>>> >>> >> >> > The core ThinLTO importing support and tuning will<br>

>> >>>> >>> >> >> > require<br>

>> >>>> >>> >> >> > frequent<br>

>> >>>> >>> >> >> > change and iteration during testing and tuning, and for<br>

>> >>>> >>> >> >> > that<br>

>> >>>> >>> >> >> > part<br>

>> >>>> >>> >> >> > we<br>

>> >>>> >>> >> >> > would like to commit rapidly (off by default). See the<br>

>> >>>> >>> >> >> > proposed<br>

>> >>>> >>> >> >> > staged<br>

>> >>>> >>> >> >> > implementation described in the Implementation Plan<br>

>> >>>> >>> >> >> > section.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > ThinLTO Overview<br>

>> >>>> >>> >> >> > ==============<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > See the talk slides linked above for more details. The<br>

>> >>>> >>> >> >> > following<br>

>> >>>> >>> >> >> > is a<br>

>> >>>> >>> >> >> > high-level overview of the motivation.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Cross Module Optimization (CMO) is an effective means for<br>

>> >>>> >>> >> >> > improving<br>

>> >>>> >>> >> >> > runtime performance, by extending the scope of<br>

>> >>>> >>> >> >> > optimizations<br>

>> >>>> >>> >> >> > across<br>

>> >>>> >>> >> >> > source module boundaries. Without CMO, the compiler is<br>

>> >>>> >>> >> >> > limited to<br>

>> >>>> >>> >> >> > optimizing within the scope of single source modules. Two<br>

>> >>>> >>> >> >> > solutions<br>

>> >>>> >>> >> >> > for enabling CMO are Link-Time Optimization (LTO), which<br>

>> >>>> >>> >> >> > is<br>

>> >>>> >>> >> >> > currently<br>

>> >>>> >>> >> >> > supported in LLVM and GCC, and<br>

>> >>>> >>> >> >> > Lightweight-Interprocedural<br>

>> >>>> >>> >> >> > Optimization (LIPO). However, each of these solutions has<br>

>> >>>> >>> >> >> > limitations<br>

>> >>>> >>> >> >> > that prevent it from being enabled by default. ThinLTO is<br>

>> >>>> >>> >> >> > a<br>

>> >>>> >>> >> >> > new<br>

>> >>>> >>> >> >> > approach that attempts to address these limitations, with<br>

>> >>>> >>> >> >> > a<br>

>> >>>> >>> >> >> > goal<br>

>> >>>> >>> >> >> > of<br>

>> >>>> >>> >> >> > being enabled more broadly. ThinLTO is designed with many<br>

>> >>>> >>> >> >> > of<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > same<br>

>> >>>> >>> >> >> > principles as LIPO, and therefore its advantages, without<br>

>> >>>> >>> >> >> > any of<br>

>> >>>> >>> >> >> > its<br>

>> >>>> >>> >> >> > inherent weaknesses. Unlike in LIPO where the module group<br>

>> >>>> >>> >> >> > decision<br>

>> >>>> >>> >> >> > is<br>

>> >>>> >>> >> >> > made at profile training runtime, ThinLTO makes the<br>

>> >>>> >>> >> >> > decision<br>

>> >>>> >>> >> >> > at<br>

>> >>>> >>> >> >> > compile time, but in a lazy mode that facilitates large<br>

>> >>>> >>> >> >> > scale<br>

>> >>>> >>> >> >> > parallelism. The serial linker plugin phase is designed<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > be<br>

>> >>>> >>> >> >> > razor<br>

>> >>>> >>> >> >> > thin and blazingly fast. By default this step only does<br>

>> >>>> >>> >> >> > minimal<br>

>> >>>> >>> >> >> > preparation work to enable the parallel lazy importing<br>

>> >>>> >>> >> >> > performed<br>

>> >>>> >>> >> >> > later. ThinLTO aims to be scalable like a regular O2<br>

>> >>>> >>> >> >> > build,<br>

>> >>>> >>> >> >> > enabling<br>

>> >>>> >>> >> >> > CMO on machines without large memory configurations,<br>

>> >>>> >>> >> >> > while<br>

>> >>>> >>> >> >> > also<br>

>> >>>> >>> >> >> > integrating well with distributed build systems. Results<br>

>> >>>> >>> >> >> > from<br>

>> >>>> >>> >> >> > early<br>

>> >>>> >>> >> >> > prototyping on SPEC cpu2006 C++ benchmarks are in line<br>

>> >>>> >>> >> >> > with<br>

>> >>>> >>> >> >> > expectations that ThinLTO can scale like O2 while<br>

>> >>>> >>> >> >> > enabling<br>

>> >>>> >>> >> >> > much<br>

>> >>>> >>> >> >> > of<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > CMO performed during a full LTO build.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > A ThinLTO build is divided into 3 phases, which are<br>

>> >>>> >>> >> >> > referred<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > in<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > following implementation plan:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > phase-1: IR and Function Summary Generation (-c compile)<br>

>> >>>> >>> >> >> > phase-2: Thin Linker Plugin Layer (thin archive linker<br>

>> >>>> >>> >> >> > step)<br>

>> >>>> >>> >> >> > phase-3: Parallel Backend with Demand-Driven Importing<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Implementation Plan<br>

>> >>>> >>> >> >> > ================<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > This section gives a high-level breakdown of the ThinLTO<br>

>> >>>> >>> >> >> > support<br>

>> >>>> >>> >> >> > that<br>

>> >>>> >>> >> >> > will be added, in roughly the order that the patches<br>

>> >>>> >>> >> >> > would<br>

>> >>>> >>> >> >> > be<br>

>> >>>> >>> >> >> > staged.<br>

>> >>>> >>> >> >> > The patches are divided into three stages. The first<br>

>> >>>> >>> >> >> > stage<br>

>> >>>> >>> >> >> > contains a<br>

>> >>>> >>> >> >> > minimal amount of preparation work that is not<br>

>> >>>> >>> >> >> > ThinLTO-specific.<br>

>> >>>> >>> >> >> > The<br>

>> >>>> >>> >> >> > second stage contains most of the infrastructure for<br>

>> >>>> >>> >> >> > ThinLTO,<br>

>> >>>> >>> >> >> > which<br>

>> >>>> >>> >> >> > will be off by default. The third stage includes<br>

>> >>>> >>> >> >> > enhancements/improvements/tunings that can be performed<br>

>> >>>> >>> >> >> > after the<br>

>> >>>> >>> >> >> > main<br>

>> >>>> >>> >> >> > ThinLTO infrastructure is in.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The second and third implementation stages will initially<br>

>> >>>> >>> >> >> > be<br>

>> >>>> >>> >> >> > very<br>

>> >>>> >>> >> >> > volatile, requiring a lot of iterations and tuning with<br>

>> >>>> >>> >> >> > large<br>

>> >>>> >>> >> >> > apps to<br>

>> >>>> >>> >> >> > get stabilized. Therefore it will be important to do fast<br>

>> >>>> >>> >> >> > commits<br>

>> >>>> >>> >> >> > for<br>

>> >>>> >>> >> >> > these implementation stages.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > 1. Stage 1: Preparation<br>

>> >>>> >>> >> >> > -------------------------------<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The first planned sets of patches are enablers for<br>

>> >>>> >>> >> >> > ThinLTO<br>

>> >>>> >>> >> >> > work:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > a. LTO directory structure:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Restructure the LTO directory to remove the circular<br>

>> >>>> >>> >> >> > dependence<br>

>> >>>> >>> >> >> > that arises when<br>

>> >>>> >>> >> >> > the ThinLTO pass is added. Because ThinLTO is being implemented<br>

>> >>>> >>> >> >> > as<br>

>> >>>> >>> >> >> > an SCC<br>

>> >>>> >>> >> >> > pass<br>

>> >>>> >>> >> >> > within Transforms/IPO, and leverages the LTOModule class<br>

>> >>>> >>> >> >> > for<br>

>> >>>> >>> >> >> > linking<br>

>> >>>> >>> >> >> > in functions from modules, IPO then requires the LTO<br>

>> >>>> >>> >> >> > library.<br>

>> >>>> >>> >> >> > This<br>

>> >>>> >>> >> >> > creates a circular dependence between LTO and IPO. To<br>

>> >>>> >>> >> >> > break<br>

>> >>>> >>> >> >> > that,<br>

>> >>>> >>> >> >> > we<br>

>> >>>> >>> >> >> > need to split the lib/LTO directory/library into<br>

>> >>>> >>> >> >> > lib/LTO/CodeGen<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > lib/LTO/Module, containing LTOCodeGenerator and<br>

>> >>>> >>> >> >> > LTOModule,<br>

>> >>>> >>> >> >> > respectively. Only LTOCodeGenerator has a dependence on<br>

>> >>>> >>> >> >> > IPO,<br>

>> >>>> >>> >> >> > removing<br>

>> >>>> >>> >> >> > the circular dependence.<br>
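
[Editorial illustration, not part of the original RFC] The dependence cycle described above, and how the proposed split removes it, can be sketched as a small graph check. This is only a sketch: the graph edges are read off the paragraph above, the function name find_cycle is hypothetical, and any edge between lib/LTO/CodeGen and lib/LTO/Module is left out because the RFC text does not state one.<br>

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of library names, or None."""
    visiting, done = [], set()

    def visit(lib):
        if lib in done:
            return None
        if lib in visiting:
            # We came back to a library that is still on the current path.
            return visiting[visiting.index(lib):] + [lib]
        visiting.append(lib)
        for dep in deps.get(lib, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(lib)
        return None

    for lib in deps:
        cycle = visit(lib)
        if cycle:
            return cycle
    return None


# Before the split: IPO needs LTOModule (in LTO), LTO needs IPO.
before = {"LTO": ["IPO"], "IPO": ["LTO"]}

# After the split (assumed edges): only the code generator depends on IPO.
after = {"LTO/CodeGen": ["IPO"], "IPO": ["LTO/Module"], "LTO/Module": []}

cycle_before = find_cycle(before)
cycle_after = find_cycle(after)
```

Under these assumed edges, the first graph reports a cycle through LTO and IPO and the second reports none.<br>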

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > b. ELF wrapper generation support:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Implement an ELF-wrapped bitcode writer. In order to more<br>

>> >>>> >>> >> >> > easily<br>

>> >>>> >>> >> >> > interact<br>

>> >>>> >>> >> >> > with tools such as $AR, $NM, and “$LD -r” we plan to emit<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > phase-1<br>

>> >>>> >>> >> >> > bitcode wrapped in ELF via the .llvmbc section, along<br>

>> >>>> >>> >> >> > with a<br>

>> >>>> >>> >> >> > symbol<br>

>> >>>> >>> >> >> > table. The goal is both to interact with these tools<br>

>> >>>> >>> >> >> > without<br>

>> >>>> >>> >> >> > requiring<br>

>> >>>> >>> >> >> > a plugin, and also to avoid doing partial LTO/ThinLTO<br>

>> >>>> >>> >> >> > across<br>

>> >>>> >>> >> >> > files<br>

>> >>>> >>> >> >> > linked with “$LD -r” (i.e. the resulting object file<br>

>> >>>> >>> >> >> > should<br>

>> >>>> >>> >> >> > still<br>

>> >>>> >>> >> >> > contain ELF-wrapped bitcode to enable ThinLTO at the full<br>

>> >>>> >>> >> >> > link<br>

>> >>>> >>> >> >> > step).<br>

>> >>>> >>> >> >> > I will send a separate design document for these changes,<br>

>> >>>> >>> >> >> > but the<br>

>> >>>> >>> >> >> > following is a high-level overview.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Support was added to LLVM for reading ELF-wrapped bitcode<br>

>> >>>> >>> >> >> > (<a href="http://reviews.llvm.org/rL218078" target="_blank">http://reviews.llvm.org/rL218078</a>), but there does not<br>

>> >>>> >>> >> >> > yet<br>

>> >>>> >>> >> >> > exist<br>

>> >>>> >>> >> >> > support in LLVM/Clang for emitting bitcode wrapped in<br>

>> >>>> >>> >> >> > ELF. I<br>

>> >>>> >>> >> >> > plan<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > add support for optionally generating bitcode in an ELF<br>

>> >>>> >>> >> >> > file<br>

>> >>>> >>> >> >> > containing a single .llvmbc section holding the bitcode.<br>

>> >>>> >>> >> >> > Specifically,<br>

>> >>>> >>> >> >> > the patch would add new options “emit-llvm-bc-elf”<br>

>> >>>> >>> >> >> > (object<br>

>> >>>> >>> >> >> > file)<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > corresponding “emit-llvm-elf” (textual assembly code<br>

>> >>>> >>> >> >> > equivalent).<br>

>> >>>> >>> >> >> > Eventually these would be automatically triggered under<br>

>> >>>> >>> >> >> > “-fthinlto<br>

>> >>>> >>> >> >> > -c”<br>

>> >>>> >>> >> >> > and “-fthinlto -S”, respectively.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Additionally, a symbol table will be generated in the ELF<br>

>> >>>> >>> >> >> > file,<br>

>> >>>> >>> >> >> > holding the function symbols within the bitcode. This<br>

>> >>>> >>> >> >> > facilitates<br>

>> >>>> >>> >> >> > handling archives of the ELF-wrapped bitcode created with<br>

>> >>>> >>> >> >> > $AR,<br>

>> >>>> >>> >> >> > since<br>

>> >>>> >>> >> >> > the archive will have a symbol table as well. The archive<br>

>> >>>> >>> >> >> > symbol<br>

>> >>>> >>> >> >> > table<br>

>> >>>> >>> >> >> > enables gold to extract and pass to the plugin the<br>

>> >>>> >>> >> >> > constituent<br>

>> >>>> >>> >> >> > ELF-wrapped bitcode files. To support the concatenated<br>

>> >>>> >>> >> >> > llvmbc<br>

>> >>>> >>> >> >> > section<br>

>> >>>> >>> >> >> > generated by “$LD -r”, some handling needs to be added to<br>

>> >>>> >>> >> >> > gold<br>

>> >>>> >>> >> >> > and to<br>

>> >>>> >>> >> >> > the backend driver to process each original module’s<br>

>> >>>> >>> >> >> > bitcode.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The function index/summary will later be added as a<br>

>> >>>> >>> >> >> > special<br>

>> >>>> >>> >> >> > ELF<br>

>> >>>> >>> >> >> > section alongside the .llvmbc sections.<br>
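
[Editorial illustration, not part of the original RFC] To make the wrapping idea concrete, the following sketch builds a minimal little-endian ELF64 relocatable file whose only payload is a .llvmbc section, and then reads that section back by walking the section headers. It is an assumption-laden toy, not LLVM's writer or reader: the helper names wrap_bitcode and find_section are hypothetical, the symbol table described above is omitted, the machine type is left as EM_NONE, and the section header table is not aligned the way a production writer would align it.<br>

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
SHDR = struct.Struct("<IIQQQQIIQQ")        # ELF64 section header


def wrap_bitcode(bitcode: bytes) -> bytes:
    """Build a minimal ELF64 relocatable file holding one .llvmbc section."""
    shstrtab = b"\0.llvmbc\0.shstrtab\0"    # names at offsets 1 and 9
    bc_off = EHDR.size
    str_off = bc_off + len(bitcode)
    sh_off = str_off + len(shstrtab)
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # 64-bit, LSB, version 1
    ehdr = EHDR.pack(ident, 1, 0, 1, 0, 0, sh_off, 0,
                     EHDR.size, 0, 0, SHDR.size, 3, 2)
    sections = [
        SHDR.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),                    # null section
        SHDR.pack(1, 1, 0, 0, bc_off, len(bitcode), 0, 0, 1, 0),    # .llvmbc, SHT_PROGBITS
        SHDR.pack(9, 3, 0, 0, str_off, len(shstrtab), 0, 0, 1, 0),  # .shstrtab, SHT_STRTAB
    ]
    return ehdr + bitcode + shstrtab + b"".join(sections)


def find_section(elf: bytes, name: str):
    """Return the raw contents of the named section, or None if absent."""
    fields = EHDR.unpack_from(elf, 0)
    if fields[0][:4] != b"\x7fELF":
        return None
    sh_off, sh_num, str_idx = fields[6], fields[12], fields[13]
    headers = [SHDR.unpack_from(elf, sh_off + i * SHDR.size)
               for i in range(sh_num)]
    str_hdr = headers[str_idx]
    strtab = elf[str_hdr[4]:str_hdr[4] + str_hdr[5]]
    for hdr in headers:
        end = strtab.index(b"\0", hdr[0])   # names are NUL-terminated
        if strtab[hdr[0]:end].decode() == name:
            return elf[hdr[4]:hdr[4] + hdr[5]]
    return None


payload = b"BC\xc0\xde" + b"module-body"    # bitcode magic plus dummy bytes
wrapped = wrap_bitcode(payload)
recovered = find_section(wrapped, ".llvmbc")
```

A tool that only understands ELF sees an ordinary object file here, while an LTO-aware consumer can pull the bitcode back out of the .llvmbc section unchanged.<br>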

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > 2. Stage 2: ThinLTO Infrastructure<br>

>> >>>> >>> >> >> > ----------------------------------------------<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The next set of patches adds the base implementation of<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > ThinLTO<br>

>> >>>> >>> >> >> > infrastructure, specifically those required to make<br>

>> >>>> >>> >> >> > ThinLTO<br>

>> >>>> >>> >> >> > functional<br>

>> >>>> >>> >> >> > and generate correct but not necessarily high-performing<br>

>> >>>> >>> >> >> > binaries. It<br>

>> >>>> >>> >> >> > also does not include the changes needed to make debug support under<br>

>> >>>> >>> >> >> > -g<br>

>> >>>> >>> >> >> > efficient<br>

>> >>>> >>> >> >> > with ThinLTO.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > a. Clang/LLVM/gold linker options:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > An early set of clang/llvm patches is needed to provide<br>

>> >>>> >>> >> >> > options<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > enable ThinLTO (off by default), so that the rest of the<br>

>> >>>> >>> >> >> > implementation can be disabled by default as it is added.<br>

>> >>>> >>> >> >> > Specifically, the clang option -fthinlto (used instead of<br>

>> >>>> >>> >> >> > -flto)<br>

>> >>>> >>> >> >> > will<br>

>> >>>> >>> >> >> > cause clang to invoke the phase-1 emission of LLVM<br>

>> >>>> >>> >> >> > bitcode<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > function summary/index on a compile step, and pass the<br>

>> >>>> >>> >> >> > appropriate<br>

>> >>>> >>> >> >> > option to the gold plugin on a link step. The -thinlto<br>

>> >>>> >>> >> >> > option<br>

>> >>>> >>> >> >> > will be<br>

>> >>>> >>> >> >> > added to the gold plugin and llvm-lto tool to launch the<br>

>> >>>> >>> >> >> > phase-2<br>

>> >>>> >>> >> >> > thin<br>

>> >>>> >>> >> >> > archive step. The -thinlto option will also be added to<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > ‘opt’<br>

>> >>>> >>> >> >> > tool<br>

>> >>>> >>> >> >> > to invoke it as a phase-3 parallel backend instance.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > b. Thin-archive linking support in Gold plugin and<br>

>> >>>> >>> >> >> > llvm-lto:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Under the new plugin option (see above), the plugin needs<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > perform<br>

>> >>>> >>> >> >> > the phase-2 (thin archive) link which simply emits a<br>

>> >>>> >>> >> >> > combined<br>

>> >>>> >>> >> >> > function<br>

>> >>>> >>> >> >> > map from the linked modules, without actually performing<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > normal<br>

>> >>>> >>> >> >> > link. Corresponding support should be added to the<br>

>> >>>> >>> >> >> > standalone<br>

>> >>>> >>> >> >> > llvm-lto<br>

>> >>>> >>> >> >> > tool to enable testing/debugging without involving the<br>

>> >>>> >>> >> >> > linker and<br>

>> >>>> >>> >> >> > plugin.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > c. ThinLTO backend support:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Support for running a phase-3 backend invocation<br>

>> >>>> >>> >> >> > (including<br>

>> >>>> >>> >> >> > importing) on a module should be added to the ‘opt’ tool<br>

>> >>>> >>> >> >> > under<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > new<br>

>> >>>> >>> >> >> > option. The main change under the option is to<br>

>> >>>> >>> >> >> > instantiate a<br>

>> >>>> >>> >> >> > Linker<br>

>> >>>> >>> >> >> > object used to manage the process of linking imported<br>

>> >>>> >>> >> >> > functions<br>

>> >>>> >>> >> >> > into<br>

>> >>>> >>> >> >> > the module, efficiently read the combined function map,<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > enable<br>

>> >>>> >>> >> >> > the ThinLTO import pass.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > d. Function index/summary support:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > This includes infrastructure for writing and reading the<br>

>> >>>> >>> >> >> > function<br>

>> >>>> >>> >> >> > index/summary section. As noted earlier this will be<br>

>> >>>> >>> >> >> > encoded<br>

>> >>>> >>> >> >> > in a<br>

>> >>>> >>> >> >> > special ELF section within the module, alongside the<br>

>> >>>> >>> >> >> > .llvmbc<br>

>> >>>> >>> >> >> > section<br>

>> >>>> >>> >> >> > containing the bitcode. The thin archive generated by<br>

>> >>>> >>> >> >> > phase-2 of<br>

>> >>>> >>> >> >> > ThinLTO simply contains all of the function index/summary<br>

>> >>>> >>> >> >> > sections<br>

>> >>>> >>> >> >> > across the linked modules, organized for efficient<br>

>> >>>> >>> >> >> > function<br>

>> >>>> >>> >> >> > lookup.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Each function available for importing from the module has an<br>

>> >>>> >>> >> >> > entry in the module’s function index/summary section and in the<br>

>> >>>> >>> >> >> > resulting combined function map. Each function entry contains<br>

>> >>>> >>> >> >> > that function’s offset within the bitcode file, used to<br>

>> >>>> >>> >> >> > efficiently locate and quickly import just that function. The<br>

>> >>>> >>> >> >> > entry also contains summary information (e.g. basic information<br>

>> >>>> >>> >> >> > determined during parsing, such as the number of instructions in<br>

>> >>>> >>> >> >> > the function) that will be used to help guide later import<br>

>> >>>> >>> >> >> > decisions. Because the contents of this section will change<br>

>> >>>> >>> >> >> > frequently during ThinLTO tuning, it should also be marked with<br>

>> >>>> >>> >> >> > a version id for backwards compatibility or version checking.<br>
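To make the index entry described above concrete, here is a minimal sketch of one possible shape for it. Every name in it (FunctionEntry, INDEX_VERSION, build_combined_map) is hypothetical and not taken from the proposal; it only illustrates an entry carrying a bitcode offset and an instruction count, plus a version check when per-module indexes are merged into a combined map.<br>

```python
from dataclasses import dataclass

# Hypothetical version id; the proposal only says the section should carry one.
INDEX_VERSION = 1


@dataclass(frozen=True)
class FunctionEntry:
    """One entry in a module's function index/summary section (illustrative)."""
    name: str
    module_path: str        # module that defines the function
    bitcode_offset: int     # where the function body starts in the bitcode
    instruction_count: int  # example of summary info used for import decisions


def build_combined_map(module_indexes):
    """Merge per-module indexes into a single name -> entry lookup table.

    module_indexes is an iterable of (version, entries) pairs. An index
    written with a different version id is rejected rather than misread.
    """
    combined = {}
    for version, entries in module_indexes:
        if version != INDEX_VERSION:
            raise ValueError(f"unsupported index version {version}")
        for entry in entries:
            # Assumption: the first definition seen for a name is kept.
            combined.setdefault(entry.name, entry)
    return combined
```

A lookup such as combined["foo"].bitcode_offset then gives the position to seek to when importing only that function.<br>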

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > e. ThinLTO importing support:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Support for the mechanics of importing functions from<br>

>> >>>> >>> >> >> > other<br>

>> >>>> >>> >> >> > modules,<br>

>> >>>> >>> >> >> > which can go in gradually as a set of patches since it<br>

>> >>>> >>> >> >> > will<br>

>> >>>> >>> >> >> > be<br>

>> >>>> >>> >> >> > off by<br>

>> >>>> >>> >> >> > default. Separate patches can include:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > - BitcodeReader changes to use the function index to<br>

>> >>>> >>> >> >> > import/deserialize a single function of interest (small changes;<br>

>> >>>> >>> >> >> > leverages existing lazy streamer support).<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > - Minor LTOModule changes to pass the ThinLTO function to import<br>

>> >>>> >>> >> >> > and its index into the bitcode reader.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > - Marking of imported functions (for use in<br>

>> >>>> >>> >> >> > ThinLTO-specific<br>

>> >>>> >>> >> >> > symbol<br>

>> >>>> >>> >> >> > linking and global DCE, for example). This can be<br>

>> >>>> >>> >> >> > in-memory<br>

>> >>>> >>> >> >> > initially,<br>

>> >>>> >>> >> >> > but IR support may be required in order to support<br>

>> >>>> >>> >> >> > streaming<br>

>> >>>> >>> >> >> > bitcode<br>

>> >>>> >>> >> >> > out and back in again after importing.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > - ModuleLinker changes to do ThinLTO-specific symbol<br>

>> >>>> >>> >> >> > linking<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > static promotion when necessary. The linkage type of<br>

>> >>>> >>> >> >> > imported<br>

>> >>>> >>> >> >> > functions changes to AvailableExternallyLinkage, for<br>

>> >>>> >>> >> >> > example.<br>

>> >>>> >>> >> >> > Statics<br>

>> >>>> >>> >> >> > must be promoted in certain cases, and renamed in<br>

>> >>>> >>> >> >> > consistent<br>

>> >>>> >>> >> >> > ways.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > - GlobalDCE changes to support removing imported<br>

>> >>>> >>> >> >> > functions<br>

>> >>>> >>> >> >> > that<br>

>> >>>> >>> >> >> > were<br>

>> >>>> >>> >> >> > not inlined (very small changes to existing pass logic).<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > f. ThinLTO Import Driver SCC pass:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Adds Transforms/IPO/ThinLTO.cpp with framework for doing<br>

>> >>>> >>> >> >> > ThinLTO<br>

>> >>>> >>> >> >> > via<br>

>> >>>> >>> >> >> > an SCC pass, enabled only under -fthinlto options. The<br>

>> >>>> >>> >> >> > pass<br>

>> >>>> >>> >> >> > includes<br>

>> >>>> >>> >> >> > utilizing the thin archive (global function<br>

>> >>>> >>> >> >> > index/summary),<br>

>> >>>> >>> >> >> > import<br>

>> >>>> >>> >> >> > decision heuristics, invocation of LTOModule/ModuleLinker<br>

>> >>>> >>> >> >> > routines<br>

>> >>>> >>> >> >> > that perform the import, and any necessary callgraph<br>

>> >>>> >>> >> >> > updates<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > verification.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > g. Backend Driver:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > For a single node build, the gold plugin can simply write<br>

>> >>>> >>> >> >> > a<br>

>> >>>> >>> >> >> > makefile<br>

>> >>>> >>> >> >> > and fork the parallel backend instances directly via<br>

>> >>>> >>> >> >> > parallel<br>

>> >>>> >>> >> >> > make.<br>
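As a rough illustration of the makefile approach described above, the sketch below renders one rule per backend job. The function name render_makefile and the command name backend-cmd are placeholders invented for this example; the proposal does not specify the actual backend command line.<br>

```python
def render_makefile(jobs, backend_cmd="backend-cmd"):
    """Render a makefile with one rule per parallel backend job.

    jobs is a list of (ir_path, object_path) pairs. backend_cmd is a
    placeholder for whatever command runs a single ThinLTO backend.
    """
    objects = " ".join(obj for _, obj in jobs)
    lines = [f"all: {objects}", ""]
    for ir_path, obj_path in jobs:
        # Each object depends only on its own IR file, so make is free
        # to build the rules concurrently.
        lines.append(f"{obj_path}: {ir_path}")
        lines.append(f"\t{backend_cmd} {ir_path} -o {obj_path}")
        lines.append("")
    return "\n".join(lines)
```

Running the generated file with make -j then lets make schedule the independent backend jobs in parallel on a single node.<br>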

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > 3. Stage 3: ThinLTO Tuning and Enhancements<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > ----------------------------------------------------------------<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > This refers to the patches that are not required for<br>

>> >>>> >>> >> >> > ThinLTO<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > work,<br>

>> >>>> >>> >> >> > but rather to improve compile time, memory, run-time<br>

>> >>>> >>> >> >> > performance<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > usability.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > a. Lazy Debug Metadata Linking:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The prototype implementation included lazy importing of<br>

>> >>>> >>> >> >> > module-level metadata during the ThinLTO pass finalization<br>

>> >>>> >>> >> >> > (i.e. after all function importing is complete). This actually<br>

>> >>>> >>> >> >> > applies to all module-level metadata, not just debug metadata,<br>

>> >>>> >>> >> >> > although debug metadata is the largest. This can be added as a<br>

>> >>>> >>> >> >> > separate set of patches, with changes to BitcodeReader,<br>

>> >>>> >>> >> >> > ValueMapper, and ModuleLinker.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > b. Import Tuning:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > Tuning the import strategy will be an iterative process<br>

>> >>>> >>> >> >> > that<br>

>> >>>> >>> >> >> > will<br>

>> >>>> >>> >> >> > continue to be refined over time. It involves several<br>

>> >>>> >>> >> >> > different<br>

>> >>>> >>> >> >> > types<br>

>> >>>> >>> >> >> > of changes: adding support for recording additional<br>

>> >>>> >>> >> >> > metrics<br>

>> >>>> >>> >> >> > in<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > function summary, such as profile data and optional<br>

>> >>>> >>> >> >> > heavier-weight<br>

>> >>>> >>> >> >> > IPA<br>

>> >>>> >>> >> >> > analyses, and tuning the import heuristics based on the<br>

>> >>>> >>> >> >> > summary<br>

>> >>>> >>> >> >> > and<br>

>> >>>> >>> >> >> > callsite context.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > c. Combined Function Map Pruning:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > The combined function map can be pruned of functions that are<br>

>> >>>> >>> >> >> > unlikely to benefit from being imported. For example, during the<br>

>> >>>> >>> >> >> > phase-2 thin archive plugin step we can safely omit large and<br>

>> >>>> >>> >> >> > (with profile data) cold functions, which are unlikely to<br>

>> >>>> >>> >> >> > benefit from being inlined. Additionally, all but one copy of<br>

>> >>>> >>> >> >> > comdat functions can be suppressed.<br>
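The pruning rules above could be sketched roughly as follows. The Summary fields and both thresholds are assumptions made up for illustration, and the sketch reads "large and (with profile data) cold" as two separate filters: drop large functions, and drop cold functions when a profile is available.<br>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Summary:
    """Illustrative per-function summary; all fields are assumptions."""
    name: str
    module_path: str
    instruction_count: int
    call_count: Optional[int] = None  # None when no profile data exists
    comdat: bool = False


# Hypothetical thresholds; real values would come out of import tuning.
MAX_INSTRUCTIONS = 100
MIN_CALL_COUNT = 10


def prune(summaries):
    """Return the summaries worth keeping in the combined function map."""
    kept = []
    seen_comdat = set()
    for s in summaries:
        if s.instruction_count > MAX_INSTRUCTIONS:
            continue  # large: unlikely to be inlined
        if s.call_count is not None and MIN_CALL_COUNT > s.call_count:
            continue  # cold according to profile data
        if s.comdat:
            if s.name in seen_comdat:
                continue  # keep only the first copy of a comdat function
            seen_comdat.add(s.name)
        kept.append(s)
    return kept
```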

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > d. Distributed Build System Integration:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > For a distributed build system, the gold plugin should<br>

>> >>>> >>> >> >> > write<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > parallel backend invocations into a makefile, including<br>

>> >>>> >>> >> >> > the<br>

>> >>>> >>> >> >> > mapping<br>

>> >>>> >>> >> >> > from the IR file to the real object file path, and exit.<br>

>> >>>> >>> >> >> > Additional<br>

>> >>>> >>> >> >> > work needs to be done in the distributed build system<br>

>> >>>> >>> >> >> > itself<br>

>> >>>> >>> >> >> > to<br>

>> >>>> >>> >> >> > distribute and dispatch the parallel backend jobs to the<br>

>> >>>> >>> >> >> > build<br>

>> >>>> >>> >> >> > cluster.<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > e. Dependence Tracking and Incremental Compiles:<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > In order to support build systems that stage from local<br>

>> >>>> >>> >> >> > disks or<br>

>> >>>> >>> >> >> > network storage, the plugin will optionally support<br>

>> >>>> >>> >> >> > computation<br>

>> >>>> >>> >> >> > of<br>

>> >>>> >>> >> >> > dependent sets of IR files that each module may import<br>

>> >>>> >>> >> >> > from.<br>

>> >>>> >>> >> >> > This<br>

>> >>>> >>> >> >> > can<br>

>> >>>> >>> >> >> > be computed from profile data, if it exists, or from the<br>

>> >>>> >>> >> >> > symbol<br>

>> >>>> >>> >> >> > table<br>

>> >>>> >>> >> >> > and heuristics if not. These dependence sets also enable<br>

>> >>>> >>> >> >> > support<br>

>> >>>> >>> >> >> > for<br>

>> >>>> >>> >> >> > incremental backend compiles.<br>
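One way to picture the dependence sets described above: given candidate call edges (from profile data, or from the symbol table plus heuristics) and a map from function name to defining module, collect for each module the other modules it may import from. The function and parameter names below are hypothetical.<br>

```python
def dependence_sets(call_edges, defining_module):
    """Compute, per module, the set of modules it may import from.

    call_edges is an iterable of (caller_module, callee_name) pairs.
    defining_module maps a function name to the module that defines it.
    """
    deps = {}
    for caller_module, callee in call_edges:
        module_deps = deps.setdefault(caller_module, set())
        target = defining_module.get(callee)
        # Calls within the same module, or to functions that are not in
        # the index, create no import dependence.
        if target is not None and target != caller_module:
            module_deps.add(target)
    return deps
```

A build system could stage only the files in deps[module] for that module's backend job, and rerun the job only when one of them changes.<br>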

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > --<br>

>> >>>> >>> >> >> > Teresa Johnson | Software Engineer | <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a><br>

>> >>>> >>> >> >> > |<br>

>> >>>> >>> >> >> > 408-460-2413<br>

>> >>>> >>> >> >> ><br>

>> >>>> >>> >> >> > _______________________________________________<br>

>> >>>> >>> >> >> > LLVM Developers mailing list<br>

>> >>>> >>> >> >> > <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

>> >>>> >>> >> >> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>


</blockquote></div></div>