[LLVMdev] RFC: ThinLTO Implementation Plan
Teresa Johnson
tejohnson at google.com
Fri May 15 13:06:50 PDT 2015
On Fri, May 15, 2015 at 12:36 PM, Xinliang David Li
<xinliangli at gmail.com> wrote:
>
>
> On Fri, May 15, 2015 at 12:02 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>>
>>
>> > On 2015-May-15, at 07:30, Teresa Johnson <tejohnson at google.com> wrote:
>> >
>> >>> a. Lazy Debug Metadata Linking:
>> >>>
>> >>> The prototype implementation included lazy importing of module-level
>> >>> metadata during the ThinLTO pass finalization (i.e. after all function
>> >>> importing is complete). This actually applies to all module-level
>> >>> metadata, not just debug, although it is the largest. This can be
>> >>> added as a separate set of patches. Changes to BitcodeReader,
>> >>> ValueMapper, ModuleLinker
>> >>
>> >> It sounds like this would work well with the "full" LTO implemented
>> >> by tools/gold-plugin right now. What exactly did you do to improve
>> >> this?
>> >
>> > I don't think it will help with full LTO. The parsing of the metadata
>> > is only delayed until the ThinLTO pass finalization, and the delayed
>> > metadata import is necessary to avoid reading and linking in the
>> > metadata multiple times (for each function imported from that module).
>> > Coming out of the ThinLTO pass you still have all the metadata
>> > necessary for each function that was imported. For a full LTO that
>> > would end up being all of the metadata in the module.
>> >
>> > The high level summary is that during the initial import it leaves the
>> > temporary metadata on the instructions that were imported, but saves
>> > the index used by the bitcode reader used to correlate with the
>> > metadata when it is ready (i.e. the MDValuePtrs index), and skips the
>> > metadata parsing. During finalization we parse just the metadata, and
>> > suture it up during metadata value mapping using the saved index.
>>
>> AFAICT, the gold-plugin currently does similar work. (Rafael knows
>> this code better (I've only introduced bugs there), but IIRC he's on
>> vacation until next week.) Even in "full" LTO.
>>
>> Have a look at `getModuleForFile()` and its calling loop inside
>> `allSymbolsReadHook()`:
>>
>> 1. Load a single module, lazily.
>> 2. Delete the bodies of unwanted functions (without ever loading them)
>> and fiddle with linkage as necessary.
>>
>> 3. Link in the module.
>> 4. Delete the module.
>>
>> How is what you're proposing different? Does it need to be different?
>
>
> The difference is that for ThinLTO, the lazy import happens in backend
> compilation instead of running at linker plugin time. It is truly lazy and
> imports only the bare minimum, driven solely by references from imported
> functions.
>
> Say module B has 1000 functions. Module A imports only two functions, foo1
> and foo2, from Module B transitively. The lazy reading will import only the
> module-level data from B needed to satisfy foo1 and foo2.

Right, so there are a couple of differences. As David mentions, we do
this later and also in an iterative fashion. I.e. it may import
function foo1 from module B, then in function foo1 see hot calls to
functions bar1 in module C and bar2 in module D, importing each of
those in turn, then later decide to import function foo2 from module
B. So a BitcodeReader may be created for, and open, the same module
multiple times. Each time we import we create a BitcodeReader, read in
what we need, create a Module (which contains exactly 1 function),
invoke linkInModule, and delete the module.
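
The iterative order described above can be sketched with a toy model. This is plain Python, not LLVM code: the `MODULES` table, the `hot` flags, and the `import_iteratively` helper are all hypothetical names made up for illustration, and the "reader" is reduced to a counter of how many times each module gets opened.

```python
from collections import Counter

# Hypothetical call data: module -> function -> [(callee module, callee, is_hot)].
MODULES = {
    "B": {"foo1": [("C", "bar1", True), ("D", "bar2", True)], "foo2": []},
    "C": {"bar1": []},
    "D": {"bar2": []},
}

def import_iteratively(roots):
    """Import each root, following hot calls depth-first.

    Models "one reader per import": every imported function opens its
    source module again, even if that module was opened earlier.
    """
    imported = []        # (module, function) pairs in import order
    opens = Counter()    # how many times each module was opened

    def import_one(mod, fn):
        if (mod, fn) in imported:
            return
        opens[mod] += 1               # stand-in for creating a fresh reader
        callees = MODULES[mod][fn]    # stand-in for reading just this function
        imported.append((mod, fn))    # stand-in for link-in, then delete
        for callee_mod, callee_fn, hot in callees:
            if hot:
                import_one(callee_mod, callee_fn)

    for mod, fn in roots:
        import_one(mod, fn)
    return imported, opens

imported, opens = import_iteratively([("B", "foo1"), ("B", "foo2")])
```

In this toy run, module B is opened twice (once for foo1, once for foo2), which is the repeated-open cost that motivates deferring the metadata parse.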

This is in contrast to gold/LTO with the lazy bitcode reader, which
instantiates the BitcodeReader once per module, and so can defer
parsing the metadata until the first function is materialized (at
which point it parses and materializes all of the metadata). We don't
want to parse/materialize the metadata until we have completed all
imports from the module.

The alternative, if we wanted to leverage the existing lazy metadata
parsing support, would be to keep the BitcodeReader instantiated for
each module and not close it until ThinLTO importing finalization. So
we would have a number of BitcodeReaders active at once (one per
module we import from). I need to think more about whether this is
feasible and investigate if this adds a lot of overhead.
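
A rough sketch of that alternative, again as a toy Python model rather than LLVM code. `ToyReader` and `ReaderCache` are hypothetical names; the sketch only assumes that a reader can be kept alive across several imports and asked to parse metadata once at the end.

```python
class ToyReader:
    """Hypothetical stand-in for a reader that stays open for one module."""
    def __init__(self, module_name):
        self.module_name = module_name
        self.materialized = []
        self.metadata_parsed = False

    def materialize(self, fn):
        self.materialized.append(fn)

    def parse_metadata(self):
        self.metadata_parsed = True

class ReaderCache:
    """Keeps one reader per source module until finalization."""
    def __init__(self):
        self.readers = {}
        self.created = 0

    def get(self, module_name):
        # Reuse the open reader if this module was imported from before.
        if module_name not in self.readers:
            self.readers[module_name] = ToyReader(module_name)
            self.created += 1
        return self.readers[module_name]

    def finalize(self):
        # Parse metadata exactly once per module, then close every reader.
        parsed = []
        for name, reader in self.readers.items():
            reader.parse_metadata()
            parsed.append(name)
        self.readers.clear()
        return parsed

cache = ReaderCache()
for mod, fn in [("B", "foo1"), ("C", "bar1"), ("D", "bar2"), ("B", "foo2")]:
    cache.get(mod).materialize(fn)
open_at_peak = len(cache.readers)
parsed = cache.finalize()
```

The trade-off shows up in `open_at_peak`: the number of simultaneously live readers grows with the number of distinct modules imported from, which is the overhead still to be investigated.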

The other thing that David alluded to above, that I should call out
more explicitly in my RFC, is that we only link in the metadata
transitively reached by the imported functions. This is handled during
metadata value mapping.
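
To make "transitively reached" concrete, here is a minimal sketch of the reachability computation on a toy metadata graph. It is plain Python with made-up node names (`sp_foo1`, `cu`, and so on) and a hypothetical `reachable_metadata` helper; it is not how the value mapper is implemented, only the set it would be expected to keep.

```python
# Hypothetical metadata graph: node -> operand nodes.
METADATA = {
    "cu": ["file"],
    "file": [],
    "sp_foo1": ["cu", "ty_int"],
    "sp_foo2": ["cu", "ty_struct"],
    "sp_unused": ["cu", "ty_float"],
    "ty_int": [],
    "ty_struct": ["ty_int"],
    "ty_float": [],
}

# Hypothetical map from function to the metadata nodes it references directly.
FUNCTION_REFS = {
    "foo1": ["sp_foo1"],
    "foo2": ["sp_foo2"],
    "unused": ["sp_unused"],
}

def reachable_metadata(imported_functions):
    """Return every metadata node reachable from the imported functions."""
    seen = set()
    stack = [md for fn in imported_functions for md in FUNCTION_REFS[fn]]
    while stack:
        node = stack.pop()
        if node in seen:      # also guards against cycles in the graph
            continue
        seen.add(node)
        stack.extend(METADATA[node])
    return seen

needed = reachable_metadata(["foo1", "foo2"])
```

With foo1 and foo2 imported, `needed` contains the shared `cu` and `file` nodes once, plus the types those two functions use, and leaves out `sp_unused` and `ty_float`.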

I think I will update the RFC based on some of the discussion so far
and will send out an updated version early next week.

Thanks,
Teresa
>
> David
>
--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413