[PATCH] D27155: Merge strings using concurrent hash map (3rd try!)

Sean Silva via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 29 22:19:13 PST 2016


silvas added a comment.

I don't think that it makes sense to put this in a file with a generic name like "Concurrent", as this is a quite specialized data structure that depends on specific LLD types. Maybe just call it ConcurrentStringTableBuilder.h?
I'm a bit worried about the layering too, does this class introduce any circular dependencies?



================
Comment at: ELF/Concurrent.cpp:24
+    : Alignment(Alignment),
+      NumBuckets(std::max<size_t>(PowerOf2Ceil(EstimatedNumEntries * 2), 1024)),
+      Counter(0) {
----------------
Why double it?

Also, you have a similar std::max calculation in MergeOutputSection<ELFT>::finalize. Do you need it in both places?


================
Comment at: ELF/Concurrent.cpp:26
+      Counter(0) {
+  Buckets = (EntryTy *)calloc(NumBuckets, sizeof(EntryTy));
+}
----------------
Can you use a std::unique_ptr<EntryTy[]> to manage this?


================
Comment at: ELF/Concurrent.cpp:73
+  }
+  IsTableFull = true;
+}
----------------
I think this technically needs to be atomic to avoid races.


================
Comment at: ELF/OutputSections.cpp:564
+      break;
+    Estimated *= 2;
+  }
----------------
I'm very concerned that there may be user scenarios where we end up almost always needing multiple trips through this loop (e.g., maybe "most" non-debug builds end up with only 2 duplicates on average?). How will we determine if this is the case? Otherwise, this "optimization" may backfire and users in the wild will get slower links; we need to have some feedback loop to correct this if it is the case, or be really confident that this optimization will always speed things up.

I would propose that we do a run of Poudriere or Debian or Gentoo with this patch applied and treat the resize case as a fatal error (and ideally the error message would also give the number of resizes). That way we can get an idea of how common this is.



https://reviews.llvm.org/D27155





More information about the llvm-commits mailing list