[PATCH] D79672: [COFF] Move type merging to TpiSource::mergeDebugT virtual method

Thu May 14 10:16:52 PDT 2020

aganea added inline comments.

================
Comment at: lld/COFF/PDB.cpp:950
+
+  if (source->kind == TpiSource::PDB)
+    return; // No symbols in TypeServer PDBs
----------------
rnk wrote:
> aganea wrote:
> > santagada wrote:
> > > rnk wrote:
> > > > aganea wrote:
> > > > > I was wondering if we could pre-link some libraries to speed up linking of rarely modified projects. We're building lots of external/third-party libraries compiled from source, because we want to enforce the same settings as the rest of the gamecode (for options like /MT or -flto=thin). However, they are cached and very rarely compiled, but we'd still pay the cost to merge symbols and types at link time.
> > > > > Maybe each library (or each group of libraries) could come with a /Zi-style .PDB which LLD would use instead of the library's .OBJs. If that ever happens, we'd need to merge symbols here as well.
> > > > True.
> > > What I would like is to generate static libraries that are more than just a tar of obj, but actually have a merged type and possibly symbol tables. That would make linking much faster as you can distribute those libs and reuse most of them. Ar files have a place for these, maybe we could put an empty .obj with a specific name in a .lib and read from there when available?
> > @santagada : Yeah that's what `cl /Zi` does, it stamps a .PDB redirection into the .OBJ type stream:
> > ```
> > 0x1000 : Length = 58, Leaf = 0x1515 LF_TYPESERVER2
> >                 GUID={2760BCA6-C486-455B-833A-13B67CA4FA9A}, age = 0x00000001, PDB name = 'F:\llvm-project\__test\vc140.pdb
> > ```
> > Except that the .PDB only contains the type stream, not the symbol stream, which remains in the .OBJ. The type stream is self-standing, whereas symbols need references to other sections in the .OBJ. MSVC does this with an external application (`mspdbsrv.exe`), so it would be a bit complicated to merge the symbols on the fly, multithreaded.
> > 
> > In our case, this would be a deferred step, not a 'live' step like MSVC. We could call lld-link in a special pre-link mode. At least that's the idea.
> Hm, a special pre-link step sounds a lot like the Unix `ld -r` mode to produce a relocatable object. So, the build system's responsibility is to partition the objects into objects-under-development and the rest, and then `ld -r` the rest into a semi-static pre-linked blob. Interesting.
Yes, something along those lines. We could further partition into the 'rarely changed' .LIBs and the 'unmodified locally' .LIBs. The 'rarely changed' ones would live in the network cache and would not be built most of the time. The 'unmodified locally' ones would end up as a large blob, linked in every time the user modifies a file in a given project. On one hand you would have the pre-linked blob, and on the other hand only the .LIB that you're currently modifying. This would keep things deterministic, i.e. the same link order. Most users rarely change files across projects, and if they do, it's only a few projects.
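
To make the partitioning concrete, here is a rough sketch of how a build system might bucket its .LIBs before calling the linker. Everything in it (`LibKind`, `LinkInput`, `LinkPlan`, `partitionInputs`) is a hypothetical name made up for illustration; nothing like this exists in LLD today, and how each .LIB gets classified is left to the build system.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical categories, named after the partitioning described above.
enum class LibKind { RarelyChanged, UnmodifiedLocally, BeingModified };

struct LinkInput {
  std::string path;
  LibKind kind;
};

// Hypothetical result: three buckets, each keeping the original input order.
struct LinkPlan {
  std::vector<std::string> cached; // fetched from the network cache
  std::vector<std::string> blob;   // folded into the large pre-linked blob
  std::vector<std::string> live;   // linked normally, from fresh .OBJs
};

// Inputs are visited in command-line order, so the relative order inside
// each bucket is stable from one link to the next.
LinkPlan partitionInputs(const std::vector<LinkInput> &inputs) {
  LinkPlan plan;
  for (const LinkInput &in : inputs) {
    switch (in.kind) {
    case LibKind::RarelyChanged:
      plan.cached.push_back(in.path);
      break;
    case LibKind::UnmodifiedLocally:
      plan.blob.push_back(in.path);
      break;
    case LibKind::BeingModified:
      plan.live.push_back(in.path);
      break;
    }
  }
  // Every input lands in exactly one bucket.
  assert(plan.cached.size() + plan.blob.size() + plan.live.size() ==
         inputs.size());
  return plan;
}
```

The only property the sketch tries to show is that each bucket preserves the input order, which is what would keep the final link order deterministic.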

Another alternative would be an incremental link mode, where we would save all internal structures raw to disk. But that's a bit more complicated, as we would need to keep extra information in the runtime structures, for example to track which .OBJ inserted which record. This would allow removing records for "stale" .OBJs and inserting records from the "new" .OBJs. But I think there's a higher ROI in entirely parallelizing the COFF linker first.
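
As a rough illustration of the bookkeeping this would need, the sketch below tags each record with the .OBJ that inserted it, so records from stale .OBJs can be dropped. `MergedRecord` and `RecordTable` are hypothetical names, not LLD structures. The sketch also assumes one owner per record, which is a simplification: merged type records are deduplicated and can be shared by several .OBJs, so a real implementation would need something like reference counts and index remapping.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical record: remembers which .OBJ inserted it.
struct MergedRecord {
  uint32_t ownerObj;   // index of the .OBJ that inserted this record
  std::string payload; // placeholder for the raw record bytes
};

class RecordTable {
public:
  void insert(uint32_t ownerObj, std::string payload) {
    records.push_back({ownerObj, std::move(payload)});
  }

  // Drops every record inserted by a stale .OBJ and returns how many
  // records were removed. Surviving records keep their relative order.
  std::size_t removeStale(const std::unordered_set<uint32_t> &staleObjs) {
    std::size_t before = records.size();
    records.erase(std::remove_if(records.begin(), records.end(),
                                 [&](const MergedRecord &r) {
                                   return staleObjs.count(r.ownerObj) != 0;
                                 }),
                  records.end());
    return before - records.size();
  }

  std::size_t size() const { return records.size(); }

private:
  std::vector<MergedRecord> records;
};
```

After `removeStale`, the records from the rebuilt .OBJs would be inserted again through `insert`, with the same owner index.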

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79672/new/

https://reviews.llvm.org/D79672