[lld] r246869 - COFF: Use section content checksum for ICF.

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 16 16:27:04 PDT 2015


On Wed, Sep 16, 2015 at 4:08 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Wed, Sep 16, 2015 at 3:46 PM, Rui Ueyama via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> This optimization seems very effective. I picked a program,
>> mksnapshot.exe from Chromium, and measured the time LLD spends on ICF. The
>> program contains 27645 COMDAT sections, of which 4501 sections are
>> redundant. The output size after ICF is 9MB.
>>
>> Without this patch, LLD takes 93 ms to scan and group all COMDAT
>> sections. The number of hash collisions that had to be resolved by comparing
>> section contents was 1335211.
>>
>> With this patch, LLD takes only 33 ms to do the same thing. There were
>> *no* false hash collisions (but we still had to compare contents in the
>> positive cases because the section hash value is just a CRC32 and not a
>> cryptographic one).
>>
>> So this is working as we hoped, and that's probably important because ICF
>> is on by default.
>>
>
> \o/
>
> Got any comparative numbers with link.exe? (specifically I was curious
> about your recent fixes to ICF which mention smaller binary size than
> link.exe & I was wondering if you had any ideas what the difference was
> there - but also just tracking the exact comdat functions and deduplication
> & see if we get exactly the same answers as link.exe?)
>

I ran link.exe with the same command line options; here's the table. Times
are in seconds. This program is probably too small to be a useful benchmark,
so I'll test with other programs.

         MSVC  LLD
ICF on   1.43  0.98
ICF off  1.41  0.93

As to the small difference in size between LLD and link.exe, I have no good
explanation yet. It could even be a sign of a hidden bug in which LLD folds
sections too aggressively. Or it could just be due to a difference in
section layout.
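
For anyone following the thread, here is a minimal sketch of the grouping
step discussed above: bucket sections by the checksum stored in the object
file, then confirm with a byte-wise comparison, since a CRC32 match alone
does not prove the contents are equal. The Section struct and foldIdentical
function are hypothetical names for illustration, not LLD's SectionChunk or
its ICF code, and the sketch ignores relocations and section flags that a
real implementation would also have to compare.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a COMDAT section (not LLD's SectionChunk).
struct Section {
  uint32_t Checksum;             // checksum as recorded in the object file
  std::vector<uint8_t> Contents; // raw section bytes
  Section *Repl;                 // canonical section this one is folded into
};

// Folds sections with identical contents and returns how many were folded.
// Pointers into Sections stay valid because the vector is never resized here.
inline std::size_t foldIdentical(std::vector<Section> &Sections) {
  // The checksum is only a bucket key; it narrows down the candidates.
  std::unordered_map<uint32_t, std::vector<Section *>> Buckets;
  std::size_t Folded = 0;

  for (Section &S : Sections) {
    S.Repl = &S; // every section starts as its own representative
    std::vector<Section *> &Bucket = Buckets[S.Checksum];

    bool Found = false;
    for (Section *Leader : Bucket) {
      // Equal checksums are necessary but not sufficient, so compare bytes.
      if (Leader->Contents == S.Contents) {
        assert(Leader->Repl == Leader && "bucket holds only canonical sections");
        S.Repl = Leader;
        ++Folded;
        Found = true;
        break;
      }
    }
    if (!Found)
      Bucket.push_back(&S); // first section seen with these contents
  }
  return Folded;
}
```

With this shape, sections whose checksums differ are never compared at all,
which is where the time saving would come from; a checksum match still costs
one content comparison per candidate in the bucket.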


>
>> On Fri, Sep 4, 2015 at 1:45 PM, Rui Ueyama via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: ruiu
>>> Date: Fri Sep  4 15:45:50 2015
>>> New Revision: 246869
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=246869&view=rev
>>> Log:
>>> COFF: Use section content checksum for ICF.
>>>
>>> Previously, we calculated our own hash values for section contents.
>>> Of course that's slow because we had to access all bytes in sections.
>>> Fortunately, COFF objects usually contain hash values for COMDAT
>>> sections. We can use that to speed up Identical COMDAT Folding.
>>>
>>> Modified:
>>>     lld/trunk/COFF/Chunks.h
>>>     lld/trunk/COFF/ICF.cpp
>>>     lld/trunk/COFF/InputFiles.cpp
>>>
>>> Modified: lld/trunk/COFF/Chunks.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Chunks.h?rev=246869&r1=246868&r2=246869&view=diff
>>>
>>> ==============================================================================
>>> --- lld/trunk/COFF/Chunks.h (original)
>>> +++ lld/trunk/COFF/Chunks.h Fri Sep  4 15:45:50 2015
>>> @@ -183,6 +183,10 @@ public:
>>>    // and this chunk is considrered as dead.
>>>    SectionChunk *Ptr;
>>>
>>> +  // The CRC of the contents as described in the COFF spec 4.5.5.
>>> +  // Auxiliary Format 5: Section Definitions. Used for ICF.
>>> +  uint32_t Checksum = 0;
>>> +
>>>  private:
>>>    ArrayRef<uint8_t> getContents() const;
>>>
>>>
>>> Modified: lld/trunk/COFF/ICF.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=246869&r1=246868&r2=246869&view=diff
>>>
>>> ==============================================================================
>>> --- lld/trunk/COFF/ICF.cpp (original)
>>> +++ lld/trunk/COFF/ICF.cpp Fri Sep  4 15:45:50 2015
>>> @@ -44,7 +44,7 @@ uint64_t SectionChunk::getHash() const {
>>>                        NumRelocs,
>>>                        uint32_t(Header->SizeOfRawData),
>>>                        std::distance(Relocs.end(), Relocs.begin()),
>>> -                      hash_combine_range(A.data(), A.data() + A.size()));
>>> +                      Checksum);
>>>  }
>>>
>>>  // Returns true if this and a given chunk are identical COMDAT sections.
>>> @@ -58,6 +58,8 @@ bool SectionChunk::equals(const SectionC
>>>      return false;
>>>    if (NumRelocs != X->NumRelocs)
>>>      return false;
>>> +  if (Checksum != X->Checksum)
>>> +    return false;
>>>
>>>    // Compare data
>>>    if (getContents() != X->getContents())
>>>
>>> Modified: lld/trunk/COFF/InputFiles.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/InputFiles.cpp?rev=246869&r1=246868&r2=246869&view=diff
>>>
>>> ==============================================================================
>>> --- lld/trunk/COFF/InputFiles.cpp (original)
>>> +++ lld/trunk/COFF/InputFiles.cpp Fri Sep  4 15:45:50 2015
>>> @@ -225,13 +225,14 @@ Defined *ObjectFile::createDefined(COFFS
>>>    if (!SC)
>>>      return nullptr;
>>>
>>> -  // Handle associative sections
>>> +  // Handle section definitions
>>>    if (IsFirst && AuxP) {
>>>      auto *Aux = reinterpret_cast<const coff_aux_section_definition *>(AuxP);
>>>      if (Aux->Selection == IMAGE_COMDAT_SELECT_ASSOCIATIVE)
>>>        if (auto *ParentSC = cast_or_null<SectionChunk>(
>>>                SparseChunks[Aux->getNumber(Sym.isBigObj())]))
>>>          ParentSC->addAssociative(SC);
>>> +    SC->Checksum = Aux->CheckSum;
>>>    }
>>>
>>>    auto *B = new (Alloc) DefinedRegular(this, Sym, SC);
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>