[PATCH] D27152: Merge strings using sharded hash tables.
Sean Silva via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 6 21:48:01 PST 2016
silvas added a comment.
When running with just a single core, can we still do a sharded visitation, but instead of doing NumShards passes through all the pieces (skipping ones not in the current shard), only do a single pass? We should be able to get identical layout in that case.
I.e., instead of
if ((Hash % NumShards) != Idx)
continue;
...
auto P = OffsetMap[Idx].insert({{S, Hash}, Off});
instead do:
auto P = OffsetMap[Hash % NumShards].insert({{S, Hash}, Off});
Maybe that can recover some of the single-threaded performance? If not, then we definitely need comments explaining what is causing the single-thread slowdown (which is equivalent to a throughput reduction; improving latency is good, but the throughput reduction seems to be about 20%+ which might be a problem (for example, for a buildbot))
================
Comment at: ELF/OutputSections.cpp:530
-template <class ELFT> void MergeOutputSection<ELFT>::finalizeNoTailMerge() {
- // Add all string pieces to the string table builder to create section
- // contents. Because we are not tail-optimizing, offsets of strings are
- // fixed when they are added to the builder (string table builder contains
- // a hash table from strings to offsets).
- for (MergeInputSection<ELFT> *Sec : Sections)
- for (size_t I = 0, E = Sec->Pieces.size(); I != E; ++I)
- if (Sec->Pieces[I].Live)
- Sec->Pieces[I].OutputOff = Builder.add(Sec->getData(I));
-
- Builder.finalizeInOrder();
- this->Size = Builder.getSize();
+static size_t align2(size_t Val, size_t Alignment) {
+ return (Val + Alignment - 1) & ~(Alignment - 1);
----------------
Do we already have a helper for this in libsupport?
================
Comment at: ELF/OutputSections.cpp:562
+ const int NumShards = 8;
+ DenseMap<CachedHashStringRef, size_t> OffsetMap[NumShards];
+ size_t ShardSize[NumShards];
----------------
Please pad this so that there isn't false sharing. DenseMap is smaller than a cacheline IIRC and so currently different threads will have false sharing.
https://reviews.llvm.org/D27152
More information about the llvm-commits
mailing list