[PATCH] D55585: RFC: [LLD][COFF] Parallel GHASH generation at link-time

Alexandre Ganea via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Dec 29 17:57:20 PST 2018


aganea marked 4 inline comments as done.
aganea added a comment.

In D55585#1339859 <https://reviews.llvm.org/D55585#1339859>, @rnk wrote:

> One more thing I just realized that was probably obvious to you: Computing ghash in parallel completely eliminates the need for the /debug:ghash flag, and we can eliminate the non-ghash codepath. That's great.


Yes, and even further it might eliminate the need for GHASHes streams at all, if porting this step works out well on GPU (currently still WIP).



================
Comment at: lld/trunk/COFF/PDB.cpp:1599
+           ComputeH);
+  GlobalHashTimer.stop();
+
----------------
rnk wrote:
> All work up to here (except for dependency loading) can be kicked off from SymbolTable::addFile, which is the earliest point that we know we are including an object file in the link. The way it is written here, it's easy to understand that we are computing or fetching hashes in parallel. That clarify is worth something. Do you think it's worth restructuring things so that addFile is responsible for starting ghash computation and loading PDB and PCH dependencies? I think it might help reduce link time further by overlapping independent work, but it might make the unnecessarily complicated. Maybe it's best to keep things this way for now.
I was thinking about that. Maybe as a subsequent change, once all this lands?


================
Comment at: lld/trunk/COFF/PDB.cpp:1604-1618
+  // Merge Types in PDB/PCH files
+  for (InputFile *File : Depends) {
+    CVIndexMap TIStorage;
+    auto R = mergeTypes(File, TIStorage);
+    if (!R)
+      return R.takeError();
+
----------------
rnk wrote:
> We need to find a way to parallelize type merging... but that is future work.
I already have a good idea of what to do to make `GlobalTypeDenseMap` thread-safe and lock-free. It's more a matter of finding the time to do it :-)


================
Comment at: lld/trunk/COFF/PDB.cpp:1658
     // Sort the public symbols and add them to the stream.
-    std::sort(Publics.begin(), Publics.end(),
+    sort(parallel::par, Publics.begin(), Publics.end(),
+         [](const PublicSym32 &L, const PublicSym32 &R) {
----------------
rnk wrote:
> Wow, I'm surprised that matters. Feel free to send it separately, it seems like a small win.
Ok - I'll send another patch. Some of our large EXEs can take up to 3.5sec in this "Globals" step.


================
Comment at: llvm/trunk/include/llvm/DebugInfo/CodeView/GlobalTypeDenseMap.h:77
+          typename BucketT = GloballyHashedInfo>
+class GlobalTypeDenseMap : public DebugEpochBase {
+  template <typename T>
----------------
rnk wrote:
> I haven't read this implementation yet, it's quite long, but broadly I'm in favor of having a custom hash table here. This is the most performance critical thing LLD does.
> 
> This seems like a good separable change, since this map isn't actually used in parallel.
It's mostly a copy of `DenseMap`, but without tombstones, and with a different `DenseHashInfo` API.
I was wondering if a insert-only `DenseMap` would be useful in other parts of LLVM/Clang?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55585/new/

https://reviews.llvm.org/D55585





More information about the llvm-commits mailing list