[PATCH] D55585: RFC: [LLD][COFF] Parallel GHASH generation at link-time

Alexandre Ganea via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 11 14:45:59 PST 2019


aganea updated this revision to Diff 181365.
aganea added subscribers: mstorsjo, thakis.
aganea added a comment.
Herald added subscribers: dexonsmith, mgorny.

Rebased on r350764.

I've played around a bit with different hashing algorithms. I've added a flag `/hasher:(sha1|md5|cityhash)` and `/summary` along the way.
Figures were measured on a 3.5 GHz, 6-core Intel Xeon (Haswell).

For the first input, the resulting PDB is 950 MB:

| Configuration                                    | Global hashing (% of total) | Type merging (% of total) | Total link time |
| ------------------------------------------------ | --------------------------- | ------------------------- | --------------- |
| (at r350764) lld-link                            |                             | 20816 ms (59.4%)          | 35065 ms        |
| (this patch) lld-link ... /hasher:sha1 (default) | **9635** ms (34.3%)         | 4814 ms (17.1%)           | 28128 ms        |
| (this patch) lld-link ... /hasher:md5            | **5658** ms (23.4%)         | 4813 ms (19.9%)           | 24137 ms        |
| (this patch) lld-link ... /hasher:cityhash       | **3640** ms (16.5%)         | 4822 ms (21.8%)           | 22120 ms        |



                                      Summary
  --------------------------------------------------------------------------------
              156 Input OBJ files (expanded from all cmd-line inputs)
                0 Dependent PDB files
                1 Dependent PCH OBJ files
         81556098 Input type records (across all OBJ and dependencies)
       5108516032 Input type records bytes (across all OBJ and dependencies)
          4588516 Output merged type records
         10067321 Output merged symbol records
            23157 Output PDB strings



---

And with a slightly larger input the difference becomes more apparent. The resulting PDB is 2 GB:

| Configuration                                    | Global hashing (% of total) | Type merging (% of total) | Total link time |
| ------------------------------------------------ | --------------------------- | ------------------------- | --------------- |
| (at r350764) lld-link                            |                             | 38413 ms (56.8%)          | 67609 ms        |
| (this patch) lld-link ... /hasher:sha1 (default) | **15807** ms (29.7%)        | 9906 ms (18.6%)           | 53135 ms        |
| (this patch) lld-link ... /hasher:md5            | **8354** ms (17.9%)         | 9977 ms (21.3%)           | 46745 ms        |
| (this patch) lld-link ... /hasher:cityhash       | **6077** ms (13.7%)         | 9957 ms (22.5%)           | 43291 ms        |



                                      Summary
  --------------------------------------------------------------------------------
             4768 Input OBJ files (expanded from all cmd-line inputs)
               70 Dependent PDB files
               27 Dependent PCH OBJ files
        142150698 Input type records (across all OBJ and dependencies)
       8623310584 Input type records bytes (across all OBJ and dependencies)
          9699343 Output merged type records
         33727100 Output merged symbol records
            48382 Output PDB strings
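
For reference, the end-to-end gain implied by the two tables can be worked out from the total link times. This is only a sketch of the arithmetic; the helper names `speedup` and `percentSaved` are hypothetical and are not part of the patch.

```cpp
#include <cassert>

// Ratio of baseline link time to patched link time.
// A value above 1.0 means the patched linker is faster.
double speedup(double baselineMs, double patchedMs) {
  assert(patchedMs > 0 && "patched time must be positive");
  return baselineMs / patchedMs;
}

// Share of the baseline link time that the patch removes, in percent.
double percentSaved(double baselineMs, double patchedMs) {
  assert(baselineMs > 0 && "baseline time must be positive");
  return 100.0 * (baselineMs - patchedMs) / baselineMs;
}
```

With the total link times above, `speedup(35065, 22120)` is roughly 1.59 for the 950 MB PDB and `speedup(67609, 43291)` is roughly 1.56 for the 2 GB PDB, both with `/hasher:cityhash`.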

In light of all this, does it still make sense to compute and emit GHASH streams in Clang? I'm pretty sure that the cost of serialization and I/O for those streams would be much higher than just computing the hashes on the fly in LLD. The only marginal benefit would be for incremental linking, though even that is debatable.
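
To make "computing on the fly" concrete, here is a minimal sketch of hashing type records in parallel, one input file at a time. It is not the code from this patch. The names `fnv1a`, `Records`, `Hashes` and `hashAllFiles` are hypothetical, 64-bit FNV-1a stands in for SHA1/MD5/CityHash, and records are treated as opaque byte strings, so the resolution of type-index references that real global hashes need is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

// Stand-in hasher (64-bit FNV-1a); an assumption for illustration only.
uint64_t fnv1a(const std::string &bytes) {
  uint64_t h = 14695981039346656037ull; // FNV offset basis
  for (unsigned char c : bytes) {
    h ^= c;
    h *= 1099511628211ull; // FNV prime
  }
  return h;
}

using Records = std::vector<std::string>; // hypothetical: one file's type records
using Hashes = std::vector<uint64_t>;     // one hash per record

// Hash every record of every input file using numThreads workers.
// Each worker owns a disjoint set of files, so no locking is needed.
std::vector<Hashes> hashAllFiles(const std::vector<Records> &files,
                                 unsigned numThreads) {
  assert(numThreads > 0 && "need at least one worker");
  std::vector<Hashes> out(files.size());
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < numThreads; ++t) {
    workers.emplace_back([&files, &out, t, numThreads] {
      // Static striding: worker t handles files t, t + numThreads, ...
      for (std::size_t i = t; i < files.size(); i += numThreads) {
        out[i].reserve(files[i].size());
        for (const std::string &rec : files[i])
          out[i].push_back(fnv1a(rec));
      }
    });
  }
  for (std::thread &w : workers)
    w.join();
  return out;
}
```

Splitting the work per file keeps each file's records in order, so identical records in different files still produce identical hashes that a later merging step can deduplicate.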

If you have no major concerns over all this, I'll start sending smaller patches.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55585/new/

https://reviews.llvm.org/D55585

Files:
  lld/trunk/COFF/Config.h
  lld/trunk/COFF/Driver.cpp
  lld/trunk/COFF/InputFiles.h
  lld/trunk/COFF/Options.td
  lld/trunk/COFF/PDB.cpp
  lld/trunk/Common/CMakeLists.txt
  lld/trunk/Common/Summary.cpp
  lld/trunk/include/lld/Common/Summary.h
  lld/trunk/include/lld/Common/Threads.h
  llvm/trunk/include/llvm/ADT/Any.h
  llvm/trunk/include/llvm/ADT/Hashing.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/CVRecord.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/GlobalTypeDenseMap.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/GlobalTypeTableBuilder.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/RecordSerialization.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/TypeHashing.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/TypeIndexDiscovery.h
  llvm/trunk/include/llvm/Support/BinaryStreamArray.h
  llvm/trunk/include/llvm/Support/CityHash.h
  llvm/trunk/include/llvm/Support/FormatProviders.h
  llvm/trunk/include/llvm/Support/MD5.h
  llvm/trunk/include/llvm/Support/Memory.h
  llvm/trunk/lib/DebugInfo/CodeView/GlobalTypeTableBuilder.cpp
  llvm/trunk/lib/DebugInfo/CodeView/TypeHashing.cpp
  llvm/trunk/lib/DebugInfo/CodeView/TypeIndexDiscovery.cpp
  llvm/trunk/lib/DebugInfo/CodeView/TypeStreamMerger.cpp
  llvm/trunk/lib/Support/Windows/Memory.inc
  llvm/trunk/tools/llvm-pdbutil/DumpOutputStyle.cpp
  llvm/trunk/unittests/DebugInfo/CodeView/TypeIndexDiscoveryTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D55585.181365.patch
Type: text/x-patch
Size: 97195 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190111/65dd8df0/attachment.bin>

