[llvm] r304516 - [ThinLTO] Efficiency improvement when writing module path string table

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 2 10:51:55 PDT 2017


On Fri, Jun 2, 2017 at 10:49 AM Teresa Johnson <tejohnson at google.com> wrote:

> Cool, thanks! Just for my own edification - I assume all but the template
> Functor change are independent of this patch (i.e. they would have applied
> before)?
>

Yeah, look like. This just caught my attention since you'd found this piece
of code to be hot (obviously not so hot now - so my changes here are likely
dwarfed by your fix anyway).


> Is it worth doing the same template change to forAllSummaries?
>

Where is that function?


> Teresa
>
> On Fri, Jun 2, 2017 at 10:25 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> Added a few cleanups/hopefully improvements in r304566 (use of
>> SmallVector::assign to potentially streamline byte copies, use of a
>> template to avoid std::function overhead, StringRefizing & some other
>> little bits of tidyup). Please take a look & let me know if you've any
>> questions/criticisms/ideas :)
>>
>> - Dave
>>
>>
>> On Thu, Jun 1, 2017 at 6:56 PM Teresa Johnson via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: tejohnson
>>> Date: Thu Jun  1 20:56:02 2017
>>> New Revision: 304516
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=304516&view=rev
>>> Log:
>>> [ThinLTO] Efficiency improvement when writing module path string table
>>>
>>> Summary:
>>> When writing the combined index, we are walking the entire module
>>> path StringMap in the full index, and checking whether each one should be
>>> included in the index being written. For distributed backends, where we
>>> write an individual combined index for each file, each with only a few
>>> module paths, this is incredibly inefficient. Add a method that takes
>>> a callback and hides the details of whether we are writing the full
>>> combined index, or just a slice, and in the latter case it walks the set
>>> of modules to include instead of the entire index.
>>>
>>> For a huge application with around 23K files (i.e. where we were
>>> iterating
>>> through the 23K-entry modulePath StringMap 23K times), this change
>>> improved
>>> the thin link time by a whopping 48%.
>>>
>>> Reviewers: pcc
>>>
>>> Subscribers: Prazek, inglorion, llvm-commits
>>>
>>> Differential Revision: https://reviews.llvm.org/D33813
>>>
>>> Modified:
>>>     llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
>>>
>>> Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=304516&r1=304515&r2=304516&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)
>>> +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Thu Jun  1 20:56:02
>>> 2017
>>> @@ -363,6 +363,32 @@ public:
>>>      }
>>>    }
>>>
>>> +  /// Calls the callback for each entry in the modulePaths StringMap
>>> that
>>> +  /// should be written to the module path string table. This hides the
>>> details
>>> +  /// of whether they are being pulled from the entire index or just
>>> those in a
>>> +  /// provided ModuleToSummariesForIndex map.
>>> +  void
>>> +  forEachModule(std::function<
>>> +                void(const StringMapEntry<std::pair<uint64_t,
>>> ModuleHash>> &)>
>>> +                    Callback) {
>>
>> +    if (ModuleToSummariesForIndex) {
>>> +      for (const auto &M : *ModuleToSummariesForIndex) {
>>> +        const auto &MPI = Index.modulePaths().find(M.first);
>>> +        if (MPI == Index.modulePaths().end()) {
>>> +          // This should only happen if the bitcode file was empty, in
>>> which
>>> +          // case we shouldn't be importing (the
>>> ModuleToSummariesForIndex
>>> +          // would only include the module we are writing and index
>>> for).
>>> +          assert(ModuleToSummariesForIndex->size() == 1);
>>> +          continue;
>>> +        }
>>> +        Callback(*MPI);
>>> +      }
>>> +    } else {
>>> +      for (const auto &MPSE : Index.modulePaths())
>>> +        Callback(MPSE);
>>> +    }
>>> +  }
>>> +
>>>    /// Main entry point for writing a combined index to bitcode.
>>>    void write();
>>>
>>> @@ -370,14 +396,6 @@ private:
>>>    void writeModStrings();
>>>    void writeCombinedGlobalValueSummary();
>>>
>>> -  /// Indicates whether the provided \p ModulePath should be written
>>> into
>>> -  /// the module string table, e.g. if full index written or if it is in
>>> -  /// the provided subset.
>>> -  bool doIncludeModule(StringRef ModulePath) {
>>> -    return !ModuleToSummariesForIndex ||
>>> -           ModuleToSummariesForIndex->count(ModulePath);
>>> -  }
>>> -
>>>    Optional<unsigned> getValueId(GlobalValue::GUID ValGUID) {
>>>      auto VMI = GUIDToValueIdMap.find(ValGUID);
>>>      if (VMI == GUIDToValueIdMap.end())
>>> @@ -3149,41 +3167,41 @@ void IndexBitcodeWriter::writeModStrings
>>>    unsigned AbbrevHash = Stream.EmitAbbrev(std::move(Abbv));
>>>
>>>    SmallVector<unsigned, 64> Vals;
>>> -  for (const auto &MPSE : Index.modulePaths()) {
>>> -    if (!doIncludeModule(MPSE.getKey()))
>>> -      continue;
>>> -    StringEncoding Bits =
>>> -        getStringEncoding(MPSE.getKey().data(), MPSE.getKey().size());
>>> -    unsigned AbbrevToUse = Abbrev8Bit;
>>> -    if (Bits == SE_Char6)
>>> -      AbbrevToUse = Abbrev6Bit;
>>> -    else if (Bits == SE_Fixed7)
>>> -      AbbrevToUse = Abbrev7Bit;
>>> -
>>> -    Vals.push_back(MPSE.getValue().first);
>>> -
>>> -    for (const auto P : MPSE.getKey())
>>> -      Vals.push_back((unsigned char)P);
>>> -
>>> -    // Emit the finished record.
>>> -    Stream.EmitRecord(bitc::MST_CODE_ENTRY, Vals, AbbrevToUse);
>>> -
>>> -    Vals.clear();
>>> -    // Emit an optional hash for the module now
>>> -    auto &Hash = MPSE.getValue().second;
>>> -    bool AllZero = true; // Detect if the hash is empty, and do not
>>> generate it
>>> -    for (auto Val : Hash) {
>>> -      if (Val)
>>> -        AllZero = false;
>>> -      Vals.push_back(Val);
>>> -    }
>>> -    if (!AllZero) {
>>> -      // Emit the hash record.
>>> -      Stream.EmitRecord(bitc::MST_CODE_HASH, Vals, AbbrevHash);
>>> -    }
>>> +  forEachModule(
>>> +      [&](const StringMapEntry<std::pair<uint64_t, ModuleHash>> &MPSE) {
>>> +        StringEncoding Bits =
>>> +            getStringEncoding(MPSE.getKey().data(),
>>> MPSE.getKey().size());
>>> +        unsigned AbbrevToUse = Abbrev8Bit;
>>> +        if (Bits == SE_Char6)
>>> +          AbbrevToUse = Abbrev6Bit;
>>> +        else if (Bits == SE_Fixed7)
>>> +          AbbrevToUse = Abbrev7Bit;
>>>
>>
>>
>>
>>> +
>>> +        Vals.push_back(MPSE.getValue().first);
>>> +
>>> +        for (const auto P : MPSE.getKey())
>>> +          Vals.push_back((unsigned char)P);
>>> +
>>> +        // Emit the finished record.
>>> +        Stream.EmitRecord(bitc::MST_CODE_ENTRY, Vals, AbbrevToUse);
>>> +
>>> +        Vals.clear();
>>> +        // Emit an optional hash for the module now
>>> +        auto &Hash = MPSE.getValue().second;
>>> +        bool AllZero =
>>> +            true; // Detect if the hash is empty, and do not generate it
>>> +        for (auto Val : Hash) {
>>> +          if (Val)
>>> +            AllZero = false;
>>> +          Vals.push_back(Val);
>>> +        }
>>> +        if (!AllZero) {
>>> +          // Emit the hash record.
>>> +          Stream.EmitRecord(bitc::MST_CODE_HASH, Vals, AbbrevHash);
>>> +        }
>>>
>>> -    Vals.clear();
>>> -  }
>>> +        Vals.clear();
>>> +      });
>>>    Stream.ExitBlock();
>>>  }
>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170602/1a5c759d/attachment.html>


More information about the llvm-commits mailing list