[lld] r270340 - Define SectionPiece and use it instead of std::pair<uint_t, uint_t>.

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Tue May 24 22:19:41 PDT 2016


On Tue, May 24, 2016 at 10:03 PM, Rui Ueyama <ruiu at google.com> wrote:

> Thank you for sharing it to me. It's quite interesting. Let me do it
> myself too. But in any way this patch is for correctness. Without this
> patch you can't create >4Gb executables on Windows (even on Win64.)
>

I was mostly referring to using `bool Live;` instead of sentinel values,
which increases the size of the data structure by 50% (which for a binary
search will increase the number of cache misses by 50%).

FWIW, the profiling tool I was using was the default one built into VS2015
which turned out to be pretty good for looking at this (I wasn't actually
specifically looking at LLD non-LTO link time, I was mostly just testing
the VS2015 profiler). The profile looked pretty easy to optimize; about
evenly split into 3 parts concentrated at a relatively small number of
lines of code. I reckon it shouldn't be too difficult to make non-LTO link
times at least twice as fast. Focusing on anything that gives less than a
5-10% speedup is probably pointless at this stage.

-- Sean Silva


>
> On Tue, May 24, 2016 at 9:56 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>> This is the screenshot that I remember taking:
>> http://reviews.llvm.org/F1983334
>>
>> Percentages are out of the overall runtime.
>>
>> -- Sean Silva
>>
>>
>> On Tue, May 24, 2016 at 9:42 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> It is I think actually spent in StringTableBuilder. More specifically,
>>> StringTableBuilder spends time on inserting given strings (all pieces of
>>> splittable sections) into a hash table in order to uniquify them.
>>>
>>> On Tue, May 24, 2016 at 9:39 PM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>> Be very careful with this code (e.g. changing data layouts). We spend
>>>> over 20% of non-LTO time in it last time I looked (mostly the call to
>>>> lower_bound IIRC).
>>>>
>>>> -- Sean Silva
>>>>
>>>> On Sat, May 21, 2016 at 5:13 PM, Rui Ueyama via llvm-commits <
>>>> llvm-commits at lists.llvm.org> wrote:
>>>>
>>>>> Author: ruiu
>>>>> Date: Sat May 21 19:13:04 2016
>>>>> New Revision: 270340
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=270340&view=rev
>>>>> Log:
>>>>> Define SectionPiece and use it instead of std::pair<uint_t, uint_t>.
>>>>>
>>>>> We were using std::pair to represents pieces of splittable section
>>>>> contents. It hurt readability because "first" and "second" are not
>>>>> meaningful. This patch give them names.
>>>>>
>>>>> One more thing is that piecewise liveness information is stored to
>>>>> the second element of the pair as a special value of output section
>>>>> offset. It was confusing, so I defiend a new bit, "Live", in the
>>>>> new struct.
>>>>>
>>>>> Modified:
>>>>>     lld/trunk/ELF/InputSection.cpp
>>>>>     lld/trunk/ELF/InputSection.h
>>>>>     lld/trunk/ELF/MarkLive.cpp
>>>>>     lld/trunk/ELF/OutputSections.cpp
>>>>>     lld/trunk/ELF/Writer.cpp
>>>>>
>>>>> Modified: lld/trunk/ELF/InputSection.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputSection.cpp?rev=270340&r1=270339&r2=270340&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/ELF/InputSection.cpp (original)
>>>>> +++ lld/trunk/ELF/InputSection.cpp Sat May 21 19:13:04 2016
>>>>> @@ -414,13 +414,12 @@ typename ELFT::uint EHInputSection<ELFT>
>>>>>    // identify the start of the output .eh_frame. Handle this special
>>>>> case.
>>>>>    if (this->getSectionHdr()->sh_size == 0)
>>>>>      return Offset;
>>>>> -  std::pair<uintX_t, uintX_t> *I =
>>>>> this->getRangeAndSize(Offset).first;
>>>>> -  uintX_t Base = I->second;
>>>>> -  if (Base == uintX_t(-1))
>>>>> +  SectionPiece *Piece = this->getRangeAndSize(Offset).first;
>>>>> +  if (Piece->OutputOff == size_t(-1))
>>>>>      return -1; // Not in the output
>>>>>
>>>>> -  uintX_t Addend = Offset - I->first;
>>>>> -  return Base + Addend;
>>>>> +  uintX_t Addend = Offset - Piece->InputOff;
>>>>> +  return Piece->OutputOff + Addend;
>>>>>  }
>>>>>
>>>>>  static size_t findNull(StringRef S, size_t EntSize) {
>>>>> @@ -443,17 +442,14 @@ MergeInputSection<ELFT>::MergeInputSecti
>>>>>    uintX_t EntSize = Header->sh_entsize;
>>>>>    ArrayRef<uint8_t> D = this->getSectionData();
>>>>>    StringRef Data((const char *)D.data(), D.size());
>>>>> -  std::vector<std::pair<uintX_t, uintX_t>> &Offsets = this->Offsets;
>>>>>
>>>>> -  uintX_t V = Config->GcSections ? MergeInputSection<ELFT>::PieceDead
>>>>> -                                 : MergeInputSection<ELFT>::PieceLive;
>>>>>    if (Header->sh_flags & SHF_STRINGS) {
>>>>>      uintX_t Offset = 0;
>>>>>      while (!Data.empty()) {
>>>>>        size_t End = findNull(Data, EntSize);
>>>>>        if (End == StringRef::npos)
>>>>>          fatal("string is not null terminated");
>>>>> -      Offsets.push_back(std::make_pair(Offset, V));
>>>>> +      this->Pieces.emplace_back(Offset);
>>>>>        uintX_t Size = End + EntSize;
>>>>>        Data = Data.substr(Size);
>>>>>        Offset += Size;
>>>>> @@ -465,7 +461,7 @@ MergeInputSection<ELFT>::MergeInputSecti
>>>>>    size_t Size = Data.size();
>>>>>    assert((Size % EntSize) == 0);
>>>>>    for (unsigned I = 0, N = Size; I != N; I += EntSize)
>>>>> -    Offsets.push_back(std::make_pair(I, V));
>>>>> +    this->Pieces.emplace_back(I);
>>>>>  }
>>>>>
>>>>>  template <class ELFT>
>>>>> @@ -474,8 +470,7 @@ bool MergeInputSection<ELFT>::classof(co
>>>>>  }
>>>>>
>>>>>  template <class ELFT>
>>>>> -std::pair<std::pair<typename ELFT::uint, typename ELFT::uint> *,
>>>>> -          typename ELFT::uint>
>>>>> +std::pair<SectionPiece *, typename ELFT::uint>
>>>>>  SplitInputSection<ELFT>::getRangeAndSize(uintX_t Offset) {
>>>>>    ArrayRef<uint8_t> D = this->getSectionData();
>>>>>    StringRef Data((const char *)D.data(), D.size());
>>>>> @@ -485,37 +480,32 @@ SplitInputSection<ELFT>::getRangeAndSize
>>>>>
>>>>>    // Find the element this offset points to.
>>>>>    auto I = std::upper_bound(
>>>>> -      Offsets.begin(), Offsets.end(), Offset,
>>>>> -      [](const uintX_t &A, const std::pair<uintX_t, uintX_t> &B) {
>>>>> -        return A < B.first;
>>>>> -      });
>>>>> -  uintX_t End = I == Offsets.end() ? Data.size() : I->first;
>>>>> +      Pieces.begin(), Pieces.end(), Offset,
>>>>> +      [](const uintX_t &A, const SectionPiece &B) { return A <
>>>>> B.InputOff; });
>>>>> +  uintX_t End = (I == Pieces.end()) ? Data.size() : I->InputOff;
>>>>>    --I;
>>>>> -  return std::make_pair(&*I, End);
>>>>> +  return {&*I, End};
>>>>>  }
>>>>>
>>>>>  template <class ELFT>
>>>>>  typename ELFT::uint MergeInputSection<ELFT>::getOffset(uintX_t
>>>>> Offset) {
>>>>> -  std::pair<std::pair<uintX_t, uintX_t> *, uintX_t> T =
>>>>> -      this->getRangeAndSize(Offset);
>>>>> -  std::pair<uintX_t, uintX_t> *I = T.first;
>>>>> +  std::pair<SectionPiece *, uintX_t> T =
>>>>> this->getRangeAndSize(Offset);
>>>>> +  SectionPiece &Piece = *T.first;
>>>>>    uintX_t End = T.second;
>>>>> -  uintX_t Start = I->first;
>>>>> +  assert(Piece.Live);
>>>>>
>>>>>    // Compute the Addend and if the Base is cached, return.
>>>>> -  uintX_t Addend = Offset - Start;
>>>>> -  uintX_t &Base = I->second;
>>>>> -  assert(Base != MergeInputSection<ELFT>::PieceDead);
>>>>> -  if (Base != MergeInputSection<ELFT>::PieceLive)
>>>>> -    return Base + Addend;
>>>>> +  uintX_t Addend = Offset - Piece.InputOff;
>>>>> +  if (Piece.OutputOff != size_t(-1))
>>>>> +    return Piece.OutputOff + Addend;
>>>>>
>>>>>    // Map the base to the offset in the output section and cache it.
>>>>>    ArrayRef<uint8_t> D = this->getSectionData();
>>>>>    StringRef Data((const char *)D.data(), D.size());
>>>>> -  StringRef Entry = Data.substr(Start, End - Start);
>>>>> +  StringRef Entry = Data.substr(Piece.InputOff, End - Piece.InputOff);
>>>>>    auto *MOS = static_cast<MergeOutputSection<ELFT> *>(this->OutSec);
>>>>> -  Base = MOS->getOffset(Entry);
>>>>> -  return Base + Addend;
>>>>> +  Piece.OutputOff = MOS->getOffset(Entry);
>>>>> +  return Piece.OutputOff + Addend;
>>>>>  }
>>>>>
>>>>>  template <class ELFT>
>>>>>
>>>>> Modified: lld/trunk/ELF/InputSection.h
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputSection.h?rev=270340&r1=270339&r2=270340&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/ELF/InputSection.h (original)
>>>>> +++ lld/trunk/ELF/InputSection.h Sat May 21 19:13:04 2016
>>>>> @@ -130,6 +130,14 @@ public:
>>>>>
>>>>>  template <class ELFT> InputSectionBase<ELFT>
>>>>> InputSectionBase<ELFT>::Discarded;
>>>>>
>>>>> +// SectionPiece represents a piece of splittable section contents.
>>>>> +struct SectionPiece {
>>>>> +  SectionPiece(size_t I) : InputOff(I), Live(!Config->GcSections) {}
>>>>> +  size_t InputOff;
>>>>> +  size_t OutputOff = -1;
>>>>> +  bool Live;
>>>>> +};
>>>>> +
>>>>>  // Usually sections are copied to the output as atomic chunks of data,
>>>>>  // but some special types of sections are split into small pieces of
>>>>> data
>>>>>  // and each piece is copied to a different place in the output.
>>>>> @@ -142,24 +150,11 @@ public:
>>>>>    SplitInputSection(ObjectFile<ELFT> *File, const Elf_Shdr *Header,
>>>>>                      typename InputSectionBase<ELFT>::Kind
>>>>> SectionKind);
>>>>>
>>>>> -  // For each piece of data, we maintain the offsets in the input
>>>>> section and
>>>>> -  // in the output section.
>>>>> -  std::vector<std::pair<uintX_t, uintX_t>> Offsets;
>>>>> -
>>>>> -  // Merge input sections may use the following special values as the
>>>>> output
>>>>> -  // section offset:
>>>>> -  enum {
>>>>> -    // The piece is dead.
>>>>> -    PieceDead = uintX_t(-1),
>>>>> -    // The piece is live, but an offset has not yet been assigned.
>>>>> After offsets
>>>>> -    // have been assigned, if the output section uses tail merging,
>>>>> the field
>>>>> -    // will still have this value and the output section offset is
>>>>> computed
>>>>> -    // lazilly.
>>>>> -    PieceLive = uintX_t(-2),
>>>>> -  };
>>>>> +  // Splittable sections are handled as a sequence of data
>>>>> +  // rather than a single large blob of data.
>>>>> +  std::vector<SectionPiece> Pieces;
>>>>>
>>>>> -  std::pair<std::pair<uintX_t, uintX_t> *, uintX_t>
>>>>> -  getRangeAndSize(uintX_t Offset);
>>>>> +  std::pair<SectionPiece *, uintX_t> getRangeAndSize(uintX_t Offset);
>>>>>  };
>>>>>
>>>>>  // This corresponds to a SHF_MERGE section of an input file.
>>>>>
>>>>> Modified: lld/trunk/ELF/MarkLive.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/MarkLive.cpp?rev=270340&r1=270339&r2=270340&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/ELF/MarkLive.cpp (original)
>>>>> +++ lld/trunk/ELF/MarkLive.cpp Sat May 21 19:13:04 2016
>>>>> @@ -143,9 +143,8 @@ template <class ELFT> void elf::markLive
>>>>>      if (!R.Sec)
>>>>>        return;
>>>>>      if (auto *MS = dyn_cast<MergeInputSection<ELFT>>(R.Sec)) {
>>>>> -      std::pair<std::pair<uintX_t, uintX_t> *, uintX_t> T =
>>>>> -          MS->getRangeAndSize(R.Offset);
>>>>> -      T.first->second = MergeInputSection<ELFT>::PieceLive;
>>>>> +      std::pair<SectionPiece *, uintX_t> T =
>>>>> MS->getRangeAndSize(R.Offset);
>>>>> +      T.first->Live = true;
>>>>>      }
>>>>>      if (R.Sec->Live)
>>>>>        return;
>>>>>
>>>>> Modified: lld/trunk/ELF/OutputSections.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/OutputSections.cpp?rev=270340&r1=270339&r2=270340&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/ELF/OutputSections.cpp (original)
>>>>> +++ lld/trunk/ELF/OutputSections.cpp Sat May 21 19:13:04 2016
>>>>> @@ -978,10 +978,9 @@ EHRegion<ELFT>::EHRegion(EHInputSection<
>>>>>
>>>>>  template <class ELFT> StringRef EHRegion<ELFT>::data() const {
>>>>>    ArrayRef<uint8_t> SecData = S->getSectionData();
>>>>> -  ArrayRef<std::pair<uintX_t, uintX_t>> Offsets = S->Offsets;
>>>>> -  size_t Start = Offsets[Index].first;
>>>>> -  size_t End =
>>>>> -      Index == Offsets.size() - 1 ? SecData.size() : Offsets[Index +
>>>>> 1].first;
>>>>> +  size_t Start = S->Pieces[Index].InputOff;
>>>>> +  size_t End = (Index == S->Pieces.size() - 1) ? SecData.size()
>>>>> +                                               : S->Pieces[Index +
>>>>> 1].InputOff;
>>>>>    return StringRef((const char *)SecData.data() + Start, End - Start);
>>>>>  }
>>>>>
>>>>> @@ -1142,8 +1141,8 @@ void EHOutputSection<ELFT>::addSectionAu
>>>>>
>>>>>    DenseMap<uintX_t, uintX_t> OffsetToIndex;
>>>>>    while (!D.empty()) {
>>>>> -    unsigned Index = S->Offsets.size();
>>>>> -    S->Offsets.push_back(std::make_pair(Offset, -1));
>>>>> +    unsigned Index = S->Pieces.size();
>>>>> +    S->Pieces.emplace_back(Offset);
>>>>>
>>>>>      uintX_t Length = readEntryLength<ELFT>(D);
>>>>>      // If CIE/FDE data length is zero then Length is 4, this
>>>>> @@ -1227,11 +1226,11 @@ template <class ELFT> void EHOutputSecti
>>>>>
>>>>>    size_t Offset = 0;
>>>>>    for (const Cie<ELFT> &C : Cies) {
>>>>> -    C.S->Offsets[C.Index].second = Offset;
>>>>> +    C.S->Pieces[C.Index].OutputOff = Offset;
>>>>>      Offset += alignTo(C.data().size(), sizeof(uintX_t));
>>>>>
>>>>>      for (const EHRegion<ELFT> &F : C.Fdes) {
>>>>> -      F.S->Offsets[F.Index].second = Offset;
>>>>> +      F.S->Pieces[F.Index].OutputOff = Offset;
>>>>>        Offset += alignTo(F.data().size(), sizeof(uintX_t));
>>>>>      }
>>>>>    }
>>>>> @@ -1240,11 +1239,11 @@ template <class ELFT> void EHOutputSecti
>>>>>  template <class ELFT> void EHOutputSection<ELFT>::writeTo(uint8_t
>>>>> *Buf) {
>>>>>    const endianness E = ELFT::TargetEndianness;
>>>>>    for (const Cie<ELFT> &C : Cies) {
>>>>> -    size_t CieOffset = C.S->Offsets[C.Index].second;
>>>>> +    size_t CieOffset = C.S->Pieces[C.Index].OutputOff;
>>>>>      writeCieFde<ELFT>(Buf + CieOffset, C.data());
>>>>>
>>>>>      for (const EHRegion<ELFT> &F : C.Fdes) {
>>>>> -      size_t Offset = F.S->Offsets[F.Index].second;
>>>>> +      size_t Offset = F.S->Pieces[F.Index].OutputOff;
>>>>>        writeCieFde<ELFT>(Buf + Offset, F.data());
>>>>>        write32<E>(Buf + Offset + 4, Offset + 4 - CieOffset); // Pointer
>>>>>
>>>>> @@ -1284,32 +1283,30 @@ void MergeOutputSection<ELFT>::addSectio
>>>>>    StringRef Data((const char *)D.data(), D.size());
>>>>>    uintX_t EntSize = S->getSectionHdr()->sh_entsize;
>>>>>    this->Header.sh_entsize = EntSize;
>>>>> -  MutableArrayRef<std::pair<uintX_t, uintX_t>> Offsets = S->Offsets;
>>>>>
>>>>>    // If this is of type string, the contents are null-terminated
>>>>> strings.
>>>>>    if (this->Header.sh_flags & SHF_STRINGS) {
>>>>> -    for (unsigned I = 0, N = Offsets.size(); I != N; ++I) {
>>>>> -      auto &P = Offsets[I];
>>>>> -      if (P.second == MergeInputSection<ELFT>::PieceDead)
>>>>> +    for (unsigned I = 0, N = S->Pieces.size(); I != N; ++I) {
>>>>> +      SectionPiece &Piece = S->Pieces[I];
>>>>> +      if (!Piece.Live)
>>>>>          continue;
>>>>>
>>>>> -      uintX_t Start = P.first;
>>>>> -      uintX_t End = (I == N - 1) ? Data.size() : Offsets[I + 1].first;
>>>>> +      uintX_t Start = Piece.InputOff;
>>>>> +      uintX_t End = (I == N - 1) ? Data.size() : S->Pieces[I +
>>>>> 1].InputOff;
>>>>>        StringRef Entry = Data.substr(Start, End - Start);
>>>>>        uintX_t OutputOffset = Builder.add(Entry);
>>>>> -      if (shouldTailMerge())
>>>>> -        OutputOffset = MergeInputSection<ELFT>::PieceLive;
>>>>> -      P.second = OutputOffset;
>>>>> +      if (!shouldTailMerge())
>>>>> +        Piece.OutputOff = OutputOffset;
>>>>>      }
>>>>>      return;
>>>>>    }
>>>>>
>>>>>    // If this is not of type string, every entry has the same size.
>>>>> -  for (auto &P : Offsets) {
>>>>> -    if (P.second == (uintX_t)-1)
>>>>> +  for (SectionPiece &Piece : S->Pieces) {
>>>>> +    if (!Piece.Live)
>>>>>        continue;
>>>>> -    StringRef Entry = Data.substr(P.first, EntSize);
>>>>> -    P.second = Builder.add(Entry);
>>>>> +    StringRef Entry = Data.substr(Piece.InputOff, EntSize);
>>>>> +    Piece.OutputOff = Builder.add(Entry);
>>>>>    }
>>>>>  }
>>>>>
>>>>>
>>>>> Modified: lld/trunk/ELF/Writer.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Writer.cpp?rev=270340&r1=270339&r2=270340&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/ELF/Writer.cpp (original)
>>>>> +++ lld/trunk/ELF/Writer.cpp Sat May 21 19:13:04 2016
>>>>> @@ -855,8 +855,7 @@ template <class ELFT> static bool includ
>>>>>      if (!D->Section->Live)
>>>>>        return false;
>>>>>      if (auto *S = dyn_cast<MergeInputSection<ELFT>>(D->Section))
>>>>> -      if (S->getRangeAndSize(D->Value).first->second ==
>>>>> -          MergeInputSection<ELFT>::PieceDead)
>>>>> +      if (!S->getRangeAndSize(D->Value).first->Live)
>>>>>          return false;
>>>>>    }
>>>>>    return true;
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160524/a66147f9/attachment.html>


More information about the llvm-commits mailing list