[PATCH] D79467: [PDB] Optimize public symbol processing

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 8 10:42:38 PDT 2020


rnk added a comment.

This is the (poorly annotated in paint) profile that I'm looking at, BTW:
F11885386: lld-profile.png <https://reviews.llvm.org/F11885386>

The big remaining costs are input reading and the per-object-file PDB processing. I think both operations really depend on having a good, generic primitive for building a map in parallel.

To build up the symbol table, we maintain a map from symbol name to symbol definition. To process debug info, we need to build a map from type record (ghash, really) to destination type index; once that map exists, symbols can be processed in parallel. If you have any pending patches in this area, please remind me; I've forgotten the state of things.
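
For concreteness, here is a minimal sketch of the two maps in question. The names and types below are illustrative placeholders, not the actual lld data structures:

  #include <cstdint>
  #include <string>
  #include <unordered_map>

  struct Symbol;          // resolved symbol definition (placeholder)
  using GHash = uint64_t; // stand-in for the global type record hash
  using TypeIndex = uint32_t;

  // 1) Symbol table: today this is one sequential hash insertion per
  //    external symbol across all input object files.
  std::unordered_map<std::string, Symbol *> Symtab;

  // 2) Type merging: ghash of a source type record -> index of the
  //    canonical record in the destination type stream. Once this map
  //    is built, remapping the symbol records of each object file only
  //    needs read-only lookups, so it parallelizes trivially.
  std::unordered_map<GHash, TypeIndex> GHashToDstTypeIndex;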

Unfortunately, there isn't an obviously best algorithm for building a map in parallel. One option is to do what I did for publics: put all external symbols in a vector, sort it in parallel, and remove the duplicates (sketched below). But that is more total work in the single-threaded case (O(n log n), presumably with a high constant factor from moving large records around) than our current simple sequential hash table insertion, which is O(n) expected.
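
A sketch of that sort-based approach, using the standard C++17 parallel algorithms (something like llvm::parallelSort from llvm/Support/Parallel.h would be the in-tree equivalent); the Sym type is a placeholder:

  #include <algorithm>
  #include <execution>
  #include <string>
  #include <vector>

  struct Sym { std::string Name; /* definition payload */ };

  // Sort-based "map": O(n log n) total work, but the sort parallelizes.
  void buildBySorting(std::vector<Sym> &Syms) {
    std::sort(std::execution::par, Syms.begin(), Syms.end(),
              [](const Sym &A, const Sym &B) { return A.Name < B.Name; });
    // Duplicates are adjacent after the sort, so keeping the first
    // definition of each name is a single linear pass.
    auto Last = std::unique(Syms.begin(), Syms.end(),
                            [](const Sym &A, const Sym &B) {
                              return A.Name == B.Name;
                            });
    Syms.erase(Last, Syms.end());
    // Lookups are then binary searches (std::lower_bound) rather than
    // hash probes.
  }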

We could do a map-reduce-style hash & bucket computation: hash the keys in parallel, divide the key space equally among workers, have each worker build a map over its slice, and then merge the per-worker maps (see the sketch below). We could *try* to share a single table, since we could arrange for the workers to insert into distinct buckets, but that requires careful handling of re-hashing, or estimating the table size up front.
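
A sketch of that scheme, with lookups routed by the same hash so the final merge can be skipped entirely; everything here is illustrative:

  #include <cstdint>
  #include <functional>
  #include <string>
  #include <thread>
  #include <unordered_map>
  #include <vector>

  struct Sym { std::string Name; };

  std::vector<std::unordered_map<std::string, const Sym *>>
  buildSharded(const std::vector<Sym> &Syms, unsigned NumWorkers) {
    // Phase 1: hash every key (embarrassingly parallel; shown
    // sequentially for brevity).
    std::vector<uint64_t> Hashes(Syms.size());
    for (size_t I = 0; I < Syms.size(); ++I)
      Hashes[I] = std::hash<std::string>{}(Syms[I].Name);

    // Phase 2: worker W owns the keys whose hash falls in its slice of
    // the hash space, so no two workers ever touch the same key and no
    // locking is needed. (Each worker scans all keys here; a real
    // implementation would bucket first to avoid the O(n * workers)
    // total scan.)
    std::vector<std::unordered_map<std::string, const Sym *>>
        Shards(NumWorkers);
    std::vector<std::thread> Workers;
    for (unsigned W = 0; W < NumWorkers; ++W)
      Workers.emplace_back([&, W] {
        for (size_t I = 0; I < Syms.size(); ++I)
          if (Hashes[I] % NumWorkers == W)
            Shards[W].try_emplace(Syms[I].Name, &Syms[I]);
      });
    for (auto &T : Workers)
      T.join();
    // Lookup: Shards[hash(Key) % NumWorkers][Key].
    return Shards;
  }

If each worker also reserve()s roughly Syms.size() / NumWorkers slots up front, the re-hashing concern largely goes away too.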

This seems like it would be a well-studied problem, but my initial searches haven't turned up anything I like yet. Apologies if I'm retreading something you've already researched; I vaguely recall a super-parallel ghash implementation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79467/new/

https://reviews.llvm.org/D79467




