[PATCH] D87805: [PDB] Merge types in parallel when using ghashing
Reid Kleckner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 12 09:31:51 PST 2021
rnk added inline comments.
================
Comment at: lld/COFF/DebugTypes.cpp:621
+ size_t newSize = alignTo(ty.length(), 4);
+ merged.recs.resize(offset + newSize);
+ auto newRec = makeMutableArrayRef(&merged.recs[offset], newSize);
----------------
aganea wrote:
> @rnk: Just to follow up on https://reviews.llvm.org/D94267#2491643, the `.resize()` here takes 3.5 sec out of 74 sec (cumulated thread time on 72 hyper-threads).
> {F15003941}
>
> I've modified the code to do two passes instead, followed by a `.reserve()`, and that saves about 0.6 sec of median walltime. That said, I think it is better to wait on prefetching the mmap'ed memory pages first.
> ```
> Benchmark #1: before\lld-link.exe @link.rsp /threads:12
> Time (mean ± σ): 17.939 s ± 1.215 s [User: 2.7 ms, System: 3.5 ms]
> Range (min … max): 15.537 s … 18.597 s 10 runs
>
> Benchmark #2: after\lld-link.exe @link.rsp /threads:12
> Time (mean ± σ): 17.298 s ± 1.511 s [User: 1.4 ms, System: 8.9 ms]
> Range (min … max): 15.512 s … 18.513 s 10 runs
> ```
> As you can see, there is also quite a bit of variability in execution time, mostly because of the contention issues I mentioned in D94267.
I see, makes sense. I think it's worth doing something about this to avoid the memcpy overhead of resizing. We don't really need a flat buffer of type records here; I just thought flattening them would be more efficient than having a separate allocation for each type record, since most of them are small.
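For concreteness, one possible alternative that avoids both the reallocation memcpy and a per-record allocation is a chunked append buffer. This is a minimal standalone sketch in plain C++, not the lld code; `ChunkedRecordBuffer`, `ChunkSize`, and `append` are made-up names, and it assumes no single record exceeds the chunk size (CodeView records are at most 64 KiB).
```
// Sketch: append serialized records into fixed-size chunks. No existing data
// is ever moved, so there is no reallocation memcpy, and records do not get
// one heap allocation each.
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

class ChunkedRecordBuffer {
  static constexpr size_t ChunkSize = 1 << 20; // 1 MiB per chunk (tunable).
  std::vector<std::unique_ptr<uint8_t[]>> chunks;
  size_t usedInLastChunk = ChunkSize; // Force allocation on first append.

public:
  // Copies one serialized record into the buffer, 4-byte aligned, and returns
  // a pointer that stays valid as more records are appended.
  uint8_t *append(const uint8_t *data, size_t size) {
    size_t aligned = (size + 3) & ~size_t(3);
    if (usedInLastChunk + aligned > ChunkSize) {
      chunks.push_back(std::make_unique<uint8_t[]>(ChunkSize));
      usedInLastChunk = 0;
    }
    uint8_t *dst = chunks.back().get() + usedInLastChunk;
    std::memcpy(dst, data, size);
    std::memset(dst + size, 0, aligned - size); // Zero the alignment padding.
    usedInLastChunk += aligned;
    return dst;
  }
};
```
The trade-off is that the merged records are no longer contiguous, so anything that currently expects a single flat buffer would have to walk the chunks instead.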
================
Comment at: lld/COFF/DebugTypes.cpp:652
+ assert(mergedIpi.recs.empty());
+ forEachTypeChecked(typeRecords, [&](const CVType &ty) {
+ if (nextUniqueIndex != uniqueTypes.end() &&
----------------
Doing an up-front size calculation requires iterating all the types in the TpiSource twice, but the second pass will probably be hot in the cache, so precalculating the size with two iterations is likely faster than switching to a different, dynamic buffering strategy.
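As a rough standalone sketch of that two-iteration scheme (plain C++ with illustrative `Record`/`mergeInto` stand-ins, not the actual `TpiSource` code):
```
// Sketch: pass 1 sums the aligned sizes of the records that will be kept,
// then reserve(); pass 2 copies them. The second walk over the same records
// should mostly hit in cache, and the destination vector never reallocates
// while records are appended.
#include <cstdint>
#include <cstring>
#include <vector>

struct Record {      // Illustrative stand-in for a serialized type record.
  const uint8_t *data;
  size_t size;
  bool keep;         // Whether this record survives deduplication.
};

static size_t alignedSize(size_t n) { return (n + 3) & ~size_t(3); }

void mergeInto(const std::vector<Record> &records, std::vector<uint8_t> &out) {
  // Pass 1: compute exactly how many bytes the kept records will occupy.
  size_t total = 0;
  for (const Record &r : records)
    if (r.keep)
      total += alignedSize(r.size);
  out.reserve(out.size() + total);

  // Pass 2: append the kept records; no reallocation can happen now.
  for (const Record &r : records) {
    if (!r.keep)
      continue;
    size_t offset = out.size();
    out.resize(offset + alignedSize(r.size)); // Zero-fills the padding bytes.
    std::memcpy(out.data() + offset, r.data, r.size);
  }
}
```
With the reservation in place, the per-record `resize()` only bumps the size within existing capacity, so the reallocation memcpy that aganea measured should mostly disappear.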
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D87805/new/
https://reviews.llvm.org/D87805