[LLVMdev] RFC: ThinLTO Implementation Plan

Thu May 14 13:11:36 PDT 2015

On Thu, May 14, 2015 at 12:53 PM, Eric Christopher <echristo at gmail.com>
wrote:

>
>
> On Thu, May 14, 2015 at 11:34 AM Daniel Berlin <dberlin at dberlin.org>
> wrote:
>
>> On Thu, May 14, 2015 at 11:14 AM, Eric Christopher <echristo at gmail.com>
>> wrote:
>> > I'm not sure this is a particularly great assumption to make.
>>
>> Which part?
>>
>
> The binutils part :)
>
>
>>
>> >  We have to
>> > support a lot of different build systems and tools and concentrating on
>> > something that just binutils uses isn't particularly friendly here.
>> I think you may have misunderstood
>> His point was exactly that they want to be transparent to *all of* these
>> tools.
>> You are saying "we should be friendly to everyone". He is saying the same
>> thing.
>> We should be friendly to everyone. The friendly way to do this is to
>> not require all of these tools to build plugins to handle bitcode.
>>
>> Hence, elf-wrapped bitcode.
>>
>
> Oh, I understood. I just don't know that I agree. To do anything with the
> tools will require some knowledge of bitcode anyhow or need the plugin. I'm
> saying that as a baseline start we should look at how to do this using the
> tools we've got rather than wrapping things for no real gain.
>

That doesn't seem strictly true. Take the ar situation (which I'm led to
believe is in use in our build system & others, one would assume): with the
symbol table included as proposed, ar can be used without any knowledge of
the bitcode or need for a plugin.
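
To make the ar point concrete, here is a toy model (not ar's or gold's actual
implementation) of why an ordinary symbol table is enough: the archive index
is just a map from symbol name to member, and a linker can pull members on
demand from it. The member names and symbol lists are made up for
illustration.

```python
# Toy model of an archive symbol index. Member names and symbols are
# hypothetical; a real ar reads them from each member's ELF symbol table.

def build_archive_index(members):
    """Map each defined symbol to the first member that defines it."""
    index = {}
    for name, info in members.items():
        for sym in info["defined"]:
            index.setdefault(sym, name)
    return index


def members_to_extract(index, members, undefined):
    """Pull in members until no resolvable undefined symbol remains."""
    pulled = []
    seen = set()
    worklist = list(undefined)
    while worklist:
        sym = worklist.pop()
        if sym in seen:
            continue
        seen.add(sym)
        member = index.get(sym)
        if member is None or member in pulled:
            continue
        pulled.append(member)
        # A newly pulled member may need further symbols.
        worklist.extend(members[member]["undefined"])
    return pulled


members = {
    "a.o": {"defined": ["foo"], "undefined": ["bar"]},
    "b.o": {"defined": ["bar"], "undefined": []},
    "c.o": {"defined": ["baz"], "undefined": []},
}
index = build_archive_index(members)
extracted = members_to_extract(index, members, ["foo"])
print(index)
print(extracted)
```

Nothing in this model looks inside the member payload, which is the property
the ELF wrapper is meant to give the standard tools.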

It'd be helpful to have the scenarios we're trying to support with these
tools & then weigh up the alternatives.

> I've talked to Teresa a bit offline and we're going to talk more later
> (and discuss on the list), but there are some discussions about how to make
> this work with just bitcode/llvm tools, and so without requiring
> integration on all platforms. The latter is what I consider particularly
> friendly :)
>
> -eric
>
>
>>
>>
>> > I also
>> > can't imagine how it's necessary for any of the lto aspects as currently
>> > written in the proposal.
>> >
>> > -eric
>> >
>> > On Thu, May 14, 2015 at 9:26 AM Xinliang David Li <xinliangli at gmail.com
>> >
>> > wrote:
>> >>
>> >> The design objective is to make ThinLTO mostly transparent to binutils
>> >> tools to enable easy integration with any build system in the wild.
>> >> 'Pass-through' mode with 'ld -r' instead of the partial LTO mode is
>> >> another reason.
>> >>
>> >> David
>> >>
>> >> On Thu, May 14, 2015 at 7:30 AM, Teresa Johnson <tejohnson at google.com>
>> >> wrote:
>> >>>
>> >>> On Thu, May 14, 2015 at 7:22 AM, Eric Christopher <echristo at gmail.com
>> >
>> >>> wrote:
>> >>> > So, what Alex is saying is that we have these tools as well and they
>> >>> > understand bitcode just fine, as well as every object format - not
>> >>> > just ELF. :)
>> >>>
>> >>> Right, there are also LLVM specific versions (llvm-ar, llvm-nm) that
>> >>> handle bitcode similarly to the way the standard tool + plugin does.
>> >>> But the goal we are trying to achieve is to allow the standard system
>> >>> versions of the tools to handle these files without requiring a
>> >>> plugin. I know the LLVM tool handles other object formats, but I'm not
>> >>> sure how that helps here? We're not planning to replace those tools,
>> >>> just allow the standard system versions to handle the intermediate
>> >>> objects produced by ThinLTO.
>> >>>
>> >>> Thanks,
>> >>> Teresa
>> >>>
>> >>> >
>> >>> > -eric
>> >>> >
>> >>> >
>> >>> > On Thu, May 14, 2015, 6:55 AM Teresa Johnson <tejohnson at google.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> On Wed, May 13, 2015 at 11:23 PM, Xinliang David Li
>> >>> >> <xinliangli at gmail.com> wrote:
>> >>> >> >
>> >>> >> >
>> >>> >> > On Wed, May 13, 2015 at 10:46 PM, Alex Rosenberg
>> >>> >> > <alexr at leftfield.org>
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> "ELF-wrapped bitcode" seems potentially controversial to me.
>> >>> >> >>
>> >>> >> >> What about ar, nm, and various ld implementations adds this
>> >>> >> >> requirement?
>> >>> >> >> What about the LLVM implementations of these tools is lacking?
>> >>> >> >
>> >>> >> >
>> >>> >> > Sorry, I cannot parse your questions properly. Can you make them
>> >>> >> > clearer?
>> >>> >>
>> >>> >> Alex is asking what the issue is with ar, nm, ld -r and regular
>> >>> >> bitcode that makes using elf-wrapped bitcode easier.
>> >>> >>
>> >>> >> The issue is that generally you need to provide a plugin to these
>> >>> >> tools in order for them to understand and handle bitcode files.
>> >>> >> We'd like standard tools to work without requiring a plugin as much
>> >>> >> as possible. And in some cases we want them to be handled
>> >>> >> differently than the way bitcode files are handled with the plugin.
>> >>> >>
>> >>> >> nm: Without a plugin, normal bitcode files are inscrutable. When
>> >>> >> provided the gold plugin it can emit the symbols.
>> >>> >>
>> >>> >> ar: Without a plugin, it will create an archive of bitcode files,
>> >>> >> but without an index, so it can't be handled by the linker even
>> >>> >> with a plugin on an -flto link. When ar is provided the gold plugin
>> >>> >> it does create an index, so the linker + gold plugin handle it
>> >>> >> appropriately on an -flto link.
>> >>> >>
>> >>> >> ld -r: Without a plugin, fails when provided bitcode inputs. When
>> >>> >> provided the gold plugin, it handles them but compiles them all the
>> >>> >> way through to ELF executable instructions via a partial LTO link.
>> >>> >> This is where we would like to differ in behavior (while also not
>> >>> >> requiring a plugin) with ELF-wrapped bitcode: we would like the
>> >>> >> ld -r output file to still contain ELF-wrapped bitcode, delaying
>> >>> >> the LTO until the full link step.
>> >>> >>
>> >>> >> Let me know if that helps address your concerns.
>> >>> >>
>> >>> >> Thanks,
>> >>> >> Teresa
>> >>> >>
>> >>> >> >
>> >>> >> > David
>> >>> >> >
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> Alex
>> >>> >> >>
>> >>> >> >> > On May 13, 2015, at 7:44 PM, Teresa Johnson
>> >>> >> >> > <tejohnson at google.com>
>> >>> >> >> > wrote:
>> >>> >> >> >
>> >>> >> >> > I've included below an RFC for implementing ThinLTO in LLVM,
>> >>> >> >> > looking
>> >>> >> >> > forward to feedback and questions.
>> >>> >> >> > Thanks!
>> >>> >> >> > Teresa
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > RFC to discuss plans for implementing ThinLTO upstream.
>> >>> >> >> > Background can be found in slides from EuroLLVM 2015:
>> >>> >> >> >
>> >>> >> >> > https://drive.google.com/open?id=0B036uwnWM6RWWER1ZEl5SUNENjQ&authuser=0
>> >>> >> >> >
>> >>> >> >> > As described in the talk, we have a prototype implementation, and
>> >>> >> >> > would like to start staging patches upstream. This RFC describes a
>> >>> >> >> > breakdown of the major pieces. We would like to commit upstream
>> >>> >> >> > gradually in several stages, with all functionality off by default.
>> >>> >> >> > The core ThinLTO importing support and tuning will require frequent
>> >>> >> >> > change and iteration during testing and tuning, and for that part we
>> >>> >> >> > would like to commit rapidly (off by default). See the proposed
>> >>> >> >> > staged implementation described in the Implementation Plan section.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > ThinLTO Overview
>> >>> >> >> > ==============
>> >>> >> >> >
>> >>> >> >> > See the talk slides linked above for more details. The following
>> >>> >> >> > is a high-level overview of the motivation.
>> >>> >> >> >
>> >>> >> >> > Cross Module Optimization (CMO) is an effective means for improving
>> >>> >> >> > runtime performance, by extending the scope of optimizations across
>> >>> >> >> > source module boundaries. Without CMO, the compiler is limited to
>> >>> >> >> > optimizing within the scope of single source modules. Two solutions
>> >>> >> >> > for enabling CMO are Link-Time Optimization (LTO), which is
>> >>> >> >> > currently supported in LLVM and GCC, and Lightweight-Interprocedural
>> >>> >> >> > Optimization (LIPO). However, each of these solutions has
>> >>> >> >> > limitations that prevent it from being enabled by default. ThinLTO
>> >>> >> >> > is a new approach that attempts to address these limitations, with a
>> >>> >> >> > goal of being enabled more broadly. ThinLTO is designed with many of
>> >>> >> >> > the same principles as LIPO, and therefore its advantages, without
>> >>> >> >> > any of its inherent weaknesses. Unlike in LIPO where the module
>> >>> >> >> > group decision is made at profile training runtime, ThinLTO makes
>> >>> >> >> > the decision at compile time, but in a lazy mode that facilitates
>> >>> >> >> > large scale parallelism. The serial linker plugin phase is designed
>> >>> >> >> > to be razor thin and blazingly fast. By default this step only does
>> >>> >> >> > minimal preparation work to enable the parallel lazy importing
>> >>> >> >> > performed later. ThinLTO aims to be scalable like a regular O2
>> >>> >> >> > build, enabling CMO on machines without large memory configurations,
>> >>> >> >> > while also integrating well with distributed build systems. Results
>> >>> >> >> > from early prototyping on SPEC cpu2006 C++ benchmarks are in line
>> >>> >> >> > with expectations that ThinLTO can scale like O2 while enabling much
>> >>> >> >> > of the CMO performed during a full LTO build.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > A ThinLTO build is divided into 3 phases, which are referred to in
>> >>> >> >> > the following implementation plan:
>> >>> >> >> >
>> >>> >> >> > phase-1: IR and Function Summary Generation (-c compile)
>> >>> >> >> > phase-2: Thin Linker Plugin Layer (thin archive linker step)
>> >>> >> >> > phase-3: Parallel Backend with Demand-Driven Importing
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > Implementation Plan
>> >>> >> >> > ================
>> >>> >> >> >
>> >>> >> >> > This section gives a high-level breakdown of the ThinLTO support
>> >>> >> >> > that will be added, in roughly the order that the patches would be
>> >>> >> >> > staged. The patches are divided into three stages. The first stage
>> >>> >> >> > contains a minimal amount of preparation work that is not
>> >>> >> >> > ThinLTO-specific. The second stage contains most of the
>> >>> >> >> > infrastructure for ThinLTO, which will be off by default. The third
>> >>> >> >> > stage includes enhancements/improvements/tunings that can be
>> >>> >> >> > performed after the main ThinLTO infrastructure is in.
>> >>> >> >> >
>> >>> >> >> > The second and third implementation stages will initially be very
>> >>> >> >> > volatile, requiring a lot of iterations and tuning with large apps
>> >>> >> >> > to get stabilized. Therefore it will be important to do fast commits
>> >>> >> >> > for these implementation stages.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > 1. Stage 1: Preparation
>> >>> >> >> > -------------------------------
>> >>> >> >> >
>> >>> >> >> > The first planned sets of patches are enablers for ThinLTO work:
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > a. LTO directory structure:
>> >>> >> >> >
>> >>> >> >> > Restructure the LTO directory to remove the circular dependence that
>> >>> >> >> > arises when the ThinLTO pass is added. Because ThinLTO is being
>> >>> >> >> > implemented as an SCC pass within Transforms/IPO, and leverages the
>> >>> >> >> > LTOModule class for linking in functions from modules, IPO then
>> >>> >> >> > requires the LTO library. This creates a circular dependence between
>> >>> >> >> > LTO and IPO. To break that, we need to split the lib/LTO
>> >>> >> >> > directory/library into lib/LTO/CodeGen and lib/LTO/Module,
>> >>> >> >> > containing LTOCodeGenerator and LTOModule, respectively. Only
>> >>> >> >> > LTOCodeGenerator has a dependence on IPO, removing the circular
>> >>> >> >> > dependence.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > b. ELF wrapper generation support:
>> >>> >> >> >
>> >>> >> >> > Implement ELF wrapped bitcode writer. In order to more easily
>> >>> >> >> > interact with tools such as $AR, $NM, and “$LD -r” we plan to emit
>> >>> >> >> > the phase-1 bitcode wrapped in ELF via the .llvmbc section, along
>> >>> >> >> > with a symbol table. The goal is both to interact with these tools
>> >>> >> >> > without requiring a plugin, and also to avoid doing partial
>> >>> >> >> > LTO/ThinLTO across files linked with “$LD -r” (i.e. the resulting
>> >>> >> >> > object file should still contain ELF-wrapped bitcode to enable
>> >>> >> >> > ThinLTO at the full link step). I will send a separate design
>> >>> >> >> > document for these changes, but the following is a high-level
>> >>> >> >> > overview.
>> >>> >> >> >
>> >>> >> >> > Support was added to LLVM for reading ELF-wrapped bitcode
>> >>> >> >> > (http://reviews.llvm.org/rL218078), but there does not yet exist
>> >>> >> >> > support in LLVM/Clang for emitting bitcode wrapped in ELF. I plan to
>> >>> >> >> > add support for optionally generating bitcode in an ELF file
>> >>> >> >> > containing a single .llvmbc section holding the bitcode.
>> >>> >> >> > Specifically, the patch would add new options “emit-llvm-bc-elf”
>> >>> >> >> > (object file) and corresponding “emit-llvm-elf” (textual assembly
>> >>> >> >> > code equivalent). Eventually these would be automatically triggered
>> >>> >> >> > under “-fthinlto -c” and “-fthinlto -S”, respectively.
>> >>> >> >> >
>> >>> >> >> > Additionally, a symbol table will be generated in the ELF file,
>> >>> >> >> > holding the function symbols within the bitcode. This facilitates
>> >>> >> >> > handling archives of the ELF-wrapped bitcode created with $AR, since
>> >>> >> >> > the archive will have a symbol table as well. The archive symbol
>> >>> >> >> > table enables gold to extract and pass to the plugin the constituent
>> >>> >> >> > ELF-wrapped bitcode files. To support the concatenated llvmbc
>> >>> >> >> > section generated by “$LD -r”, some handling needs to be added to
>> >>> >> >> > gold and to the backend driver to process each original module’s
>> >>> >> >> > bitcode.
>> >>> >> >> >
>> >>> >> >> > The function index/summary will later be added as a special ELF
>> >>> >> >> > section alongside the .llvmbc sections.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > 2. Stage 2: ThinLTO Infrastructure
>> >>> >> >> > ----------------------------------------------
>> >>> >> >> >
>> >>> >> >> > The next set of patches adds the base implementation of the ThinLTO
>> >>> >> >> > infrastructure, specifically those required to make ThinLTO
>> >>> >> >> > functional and generate correct but not necessarily high-performing
>> >>> >> >> > binaries. It does not yet include the support needed to make debug
>> >>> >> >> > info under -g efficient with ThinLTO.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > a. Clang/LLVM/gold linker options:
>> >>> >> >> >
>> >>> >> >> > An early set of clang/llvm patches is needed to provide options to
>> >>> >> >> > enable ThinLTO (off by default), so that the rest of the
>> >>> >> >> > implementation can be disabled by default as it is added.
>> >>> >> >> > Specifically, the clang option -fthinlto (used instead of -flto)
>> >>> >> >> > will cause clang to invoke the phase-1 emission of LLVM bitcode and
>> >>> >> >> > function summary/index on a compile step, and pass the appropriate
>> >>> >> >> > option to the gold plugin on a link step. The -thinlto option will
>> >>> >> >> > be added to the gold plugin and llvm-lto tool to launch the phase-2
>> >>> >> >> > thin archive step. The -thinlto option will also be added to the
>> >>> >> >> > ‘opt’ tool to invoke it as a phase-3 parallel backend instance.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > b. Thin-archive linking support in Gold plugin and llvm-lto:
>> >>> >> >> >
>> >>> >> >> > Under the new plugin option (see above), the plugin needs to perform
>> >>> >> >> > the phase-2 (thin archive) link which simply emits a combined
>> >>> >> >> > function map from the linked modules, without actually performing
>> >>> >> >> > the normal link. Corresponding support should be added to the
>> >>> >> >> > standalone llvm-lto tool to enable testing/debugging without
>> >>> >> >> > involving the linker and plugin.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > c. ThinLTO backend support:
>> >>> >> >> >
>> >>> >> >> > Support for a phase-3 backend invocation (including importing) on a
>> >>> >> >> > module should be added to the ‘opt’ tool under the new option. The
>> >>> >> >> > main change under the option is to instantiate a Linker object used
>> >>> >> >> > to manage the process of linking imported functions into the module,
>> >>> >> >> > to read the combined function map efficiently, and to enable the
>> >>> >> >> > ThinLTO import pass.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > d. Function index/summary support:
>> >>> >> >> >
>> >>> >> >> > This includes infrastructure for writing and reading the function
>> >>> >> >> > index/summary section. As noted earlier this will be encoded in a
>> >>> >> >> > special ELF section within the module, alongside the .llvmbc section
>> >>> >> >> > containing the bitcode. The thin archive generated by phase-2 of
>> >>> >> >> > ThinLTO simply contains all of the function index/summary sections
>> >>> >> >> > across the linked modules, organized for efficient function lookup.
>> >>> >> >> >
>> >>> >> >> > Each function available for importing from the module has an entry
>> >>> >> >> > in the module’s function index/summary section and in the resulting
>> >>> >> >> > combined function map. Each function entry contains that function’s
>> >>> >> >> > offset within the bitcode file, used to efficiently locate and
>> >>> >> >> > quickly import just that function. The entry also contains summary
>> >>> >> >> > information (e.g. basic information determined during parsing such
>> >>> >> >> > as the number of instructions in the function), that will be used to
>> >>> >> >> > help guide later import decisions. Because the contents of this
>> >>> >> >> > section will change frequently during ThinLTO tuning, it should also
>> >>> >> >> > be marked with a version id for backwards compatibility or version
>> >>> >> >> > checking.
>> >>> >> >> >
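
A rough sketch of the index described above may help when weighing the
alternatives. It is illustrative only: the field names, the version constant
and the in-memory layout are my assumptions, not the proposed section format.

```python
from dataclasses import dataclass

INDEX_VERSION = 1  # hypothetical version id stamped on each index section


@dataclass(frozen=True)
class FunctionEntry:
    module: str          # module that holds the function body
    bitcode_offset: int  # offset of the function within that bitcode
    inst_count: int      # summary info, e.g. number of instructions


def combine(module_indexes, expected_version=INDEX_VERSION):
    """Merge per-module indexes into one combined function map."""
    combined = {}
    for module, (version, entries) in module_indexes.items():
        if version != expected_version:
            raise ValueError(f"{module}: index version {version} unsupported")
        for name, (offset, inst_count) in entries.items():
            # Keep the first definition seen; later duplicates are ignored.
            combined.setdefault(name, FunctionEntry(module, offset, inst_count))
    return combined


module_indexes = {
    "a.o": (1, {"foo": (128, 12)}),
    "b.o": (1, {"bar": (64, 300), "foo": (512, 12)}),
}
combined = combine(module_indexes)
print(combined["bar"])
```

A backend that wants to import `bar` only needs `combined["bar"]` to know
which module to open and where to seek.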
>> >>> >> >> >
>> >>> >> >> > e. ThinLTO importing support:
>> >>> >> >> >
>> >>> >> >> > Support for the mechanics of importing functions from other modules,
>> >>> >> >> > which can go in gradually as a set of patches since it will be off
>> >>> >> >> > by default. Separate patches can include:
>> >>> >> >> >
>> >>> >> >> > - BitcodeReader changes to use function index to import/deserialize
>> >>> >> >> > single function of interest (small changes, leverages existing lazy
>> >>> >> >> > streamer support).
>> >>> >> >> >
>> >>> >> >> > - Minor LTOModule changes to pass the ThinLTO function to import and
>> >>> >> >> > its index into bitcode reader.
>> >>> >> >> >
>> >>> >> >> > - Marking of imported functions (for use in ThinLTO-specific symbol
>> >>> >> >> > linking and global DCE, for example). This can be in-memory
>> >>> >> >> > initially, but IR support may be required in order to support
>> >>> >> >> > streaming bitcode out and back in again after importing.
>> >>> >> >> >
>> >>> >> >> > - ModuleLinker changes to do ThinLTO-specific symbol linking and
>> >>> >> >> > static promotion when necessary. The linkage type of imported
>> >>> >> >> > functions changes to AvailableExternallyLinkage, for example.
>> >>> >> >> > Statics must be promoted in certain cases, and renamed in consistent
>> >>> >> >> > ways.
>> >>> >> >> >
>> >>> >> >> > - GlobalDCE changes to support removing imported functions that were
>> >>> >> >> > not inlined (very small changes to existing pass logic).
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > f. ThinLTO Import Driver SCC pass:
>> >>> >> >> >
>> >>> >> >> > Adds Transforms/IPO/ThinLTO.cpp with framework for doing ThinLTO via
>> >>> >> >> > an SCC pass, enabled only under -fthinlto options. The pass includes
>> >>> >> >> > utilizing the thin archive (global function index/summary), import
>> >>> >> >> > decision heuristics, invocation of LTOModule/ModuleLinker routines
>> >>> >> >> > that perform the import, and any necessary callgraph updates and
>> >>> >> >> > verification.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > g. Backend Driver:
>> >>> >> >> >
>> >>> >> >> > For a single node build, the gold plugin can simply write a makefile
>> >>> >> >> > and fork the parallel backend instances directly via parallel make.
>> >>> >> >> >
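
For the single node case, the makefile could be as simple as one independent
rule per module. A minimal sketch, assuming a placeholder backend command
(`thinlto-backend` is not a real tool) and made-up file names:

```python
def write_backend_makefile(modules, backend_cmd="thinlto-backend"):
    """Return makefile text with one independent rule per module.

    `backend_cmd` is a placeholder, not a real tool name.
    """
    objects = [obj for _, obj in modules]
    lines = ["all: " + " ".join(objects), ""]
    for ir_file, obj in modules:
        lines.append(f"{obj}: {ir_file}")
        # Recipe lines in a makefile must start with a tab.
        lines.append(f"\t{backend_cmd} {ir_file} -o {obj}")
        lines.append("")
    return "\n".join(lines)


modules = [("a.bc", "a.o"), ("b.bc", "b.o")]
makefile_text = write_backend_makefile(modules)
print(makefile_text)
```

Since no rule depends on another, running `make -j` on such a file lets make
schedule the backends in parallel.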
>> >>> >> >> >
>> >>> >> >> > 3. Stage 3: ThinLTO Tuning and Enhancements
>> >>> >> >> > ----------------------------------------------------------------
>> >>> >> >> >
>> >>> >> >> > This refers to the patches that are not required for ThinLTO to
>> >>> >> >> > work, but rather to improve compile time, memory, run-time
>> >>> >> >> > performance and usability.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > a. Lazy Debug Metadata Linking:
>> >>> >> >> >
>> >>> >> >> > The prototype implementation included lazy importing of module-level
>> >>> >> >> > metadata during the ThinLTO pass finalization (i.e. after all
>> >>> >> >> > function importing is complete). This actually applies to all
>> >>> >> >> > module-level metadata, not just debug, although it is the largest.
>> >>> >> >> > This can be added as a separate set of patches, with changes to
>> >>> >> >> > BitcodeReader, ValueMapper and ModuleLinker.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > b. Import Tuning:
>> >>> >> >> >
>> >>> >> >> > Tuning the import strategy will be an iterative process that will
>> >>> >> >> > continue to be refined over time. It involves several different
>> >>> >> >> > types of changes: adding support for recording additional metrics in
>> >>> >> >> > the function summary, such as profile data and optional
>> >>> >> >> > heavier-weight IPA analyses, and tuning the import heuristics based
>> >>> >> >> > on the summary and callsite context.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > c. Combined Function Map Pruning:
>> >>> >> >> >
>> >>> >> >> > The combined function map can be pruned of functions that are
>> >>> >> >> > unlikely to benefit from being imported. For example, during the
>> >>> >> >> > phase-2 thin archive plugin step we can safely omit large and (with
>> >>> >> >> > profile data) cold functions, which are unlikely to benefit from
>> >>> >> >> > being inlined. Additionally, all but one copy of comdat functions
>> >>> >> >> > can be suppressed.
>> >>> >> >> >
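
One way to read the pruning rule, as a sketch: drop a function when it is
large and, if profile data is present, also cold, and keep only the first copy
of each comdat. That reading, the thresholds and the dictionary keys are all
my assumptions.

```python
def prune(entries, max_insts=100, cold_threshold=10):
    """Drop large (and, with profile data, cold) functions and extra comdats.

    `entries` is a list of dicts; all keys and thresholds are illustrative.
    """
    kept = []
    seen_comdats = set()
    for e in entries:
        count = e.get("profile_count")  # None when no profile data exists
        is_large = e["inst_count"] > max_insts
        is_cold_or_unknown = count is None or count < cold_threshold
        if is_large and is_cold_or_unknown:
            continue
        comdat = e.get("comdat")
        if comdat is not None:
            if comdat in seen_comdats:
                continue  # another copy of this comdat is already kept
            seen_comdats.add(comdat)
        kept.append(e)
    return kept


entries = [
    {"name": "big_cold", "inst_count": 500, "profile_count": 1},
    {"name": "big_hot", "inst_count": 500, "profile_count": 1000},
    {"name": "small", "inst_count": 5, "profile_count": 0},
    {"name": "inl", "inst_count": 5, "comdat": "inl", "module": "a.o"},
    {"name": "inl", "inst_count": 5, "comdat": "inl", "module": "b.o"},
]
kept = prune(entries)
print([e["name"] for e in kept])
```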
>> >>> >> >> >
>> >>> >> >> > d. Distributed Build System Integration:
>> >>> >> >> >
>> >>> >> >> > For a distributed build system, the gold plugin should write the
>> >>> >> >> > parallel backend invocations into a makefile, including the mapping
>> >>> >> >> > from the IR file to the real object file path, and exit. Additional
>> >>> >> >> > work needs to be done in the distributed build system itself to
>> >>> >> >> > distribute and dispatch the parallel backend jobs to the build
>> >>> >> >> > cluster.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > e. Dependence Tracking and Incremental Compiles:
>> >>> >> >> >
>> >>> >> >> > In order to support build systems that stage from local disks or
>> >>> >> >> > network storage, the plugin will optionally support computation of
>> >>> >> >> > dependent sets of IR files that each module may import from. This
>> >>> >> >> > can be computed from profile data, if it exists, or from the symbol
>> >>> >> >> > table and heuristics if not. These dependence sets also enable
>> >>> >> >> > support for incremental backend compiles.
>> >>> >> >> >
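
To illustrate how the dependence sets could drive incremental compiles, a
small sketch (the module names and the shape of `import_sets` are
hypothetical): a backend needs to rerun if its own IR changed or if any IR
file it may import from changed.

```python
def backends_to_rerun(import_sets, changed):
    """Return the modules whose backend compile is invalidated.

    `import_sets` maps each module to the IR files it may import from.
    """
    changed = set(changed)
    return sorted(
        module
        for module, deps in import_sets.items()
        if module in changed or changed & set(deps)
    )


import_sets = {
    "a.o": ["b.o"],
    "b.o": [],
    "c.o": ["a.o", "b.o"],
}
rerun = backends_to_rerun(import_sets, ["a.o"])
print(rerun)
```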
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > --
>> >>> >> >> > Teresa Johnson | Software Engineer | tejohnson at google.com |
>> >>> >> >> > 408-460-2413
>> >>> >> >> >
>> >>> >> >> > _______________________________________________
>> >>> >> >> > LLVM Developers mailing list
>> >>> >> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >>> >> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >>> >> >>
>> >>> >> >
>> >>> >> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> >>> >> 408-460-2413
>> >>> >>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> 408-460-2413
>> >>
>> >>
>> >
>> >
>>
>
>
>