[PATCH] D96035: [WIP][dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 24 05:04:03 PST 2021


avl added a comment.

> Comparing your proposal - avoiding nondeterminism by sorting the types by name in the new type-holding CU, we could do something similar, but instead sorting the CUs a type appears in to pick which of those existing locations to be the canonical home. (or, indeed, could sort by the original linear visitation order)



> eg: multithreaded walk over each CU and find each type (this applies to both your proposal and what I'm suggesting here, I think) - then, rather than sorting by name and putting them in a new CU, sort by the "index" of the CU the type appeared in (where the index is the order the CUs would've been visited in in the current linear algorithm) then place/preserve the canonical type in the CU with the lowest index where the type appears?



> Then the second pass goes and emits the CUs, consulting this list of type homes to determine whether this type should be emitted in the CU, or reference a copy emitted elsewhere.

My understanding is that this approach requires all CUs to be loaded into memory at the same time, and it needs an extra pass, i.e.:

1. The first pass enumerates, in a multithreaded manner, all object files and all compile units, and creates an indexed map (the list of type homes). (As a result, all CUs from all object files are loaded into memory at the same time; the indexed map is also in memory.)

2. The second pass enumerates, in a multithreaded manner, all object files and all compile units, and emits the CU bodies (consulting the list of type homes to determine whether a type should be emitted in the CU or reference a copy emitted elsewhere).

3. Sizes/offsets/references are patched after the individual CU bodies are glued into the resulting file.
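To make the two-pass scheme above concrete, here is a minimal, single-threaded sketch. The struct names and helpers (`CU`, `buildTypeHomes`, `emittedTypes`) are hypothetical illustrations, not dsymutil's actual API, and a CU is reduced to a bare list of type names:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model: a CU is just the set of type names it
// defines.
struct CU { std::vector<std::string> Types; };

// Pass 1: build the "list of type homes" (type name -> index of the CU that
// becomes the type's canonical home). The lowest CU index wins, matching the
// order the CUs would have been visited in the current linear algorithm.
std::map<std::string, size_t> buildTypeHomes(const std::vector<CU> &CUs) {
  std::map<std::string, size_t> Homes;
  for (size_t I = 0; I < CUs.size(); ++I)
    for (const std::string &T : CUs[I].Types)
      Homes.try_emplace(T, I); // first (lowest-index) CU is kept.
  return Homes;
}

// Pass 2: for CU `Idx`, return the types it emits in full; every other type
// would instead become a reference into its home CU.
std::vector<std::string>
emittedTypes(const std::vector<CU> &CUs, size_t Idx,
             const std::map<std::string, size_t> &Homes) {
  std::vector<std::string> Out;
  for (const std::string &T : CUs[Idx].Types)
    if (Homes.at(T) == Idx)
      Out.push_back(T);
  return Out;
}
```

Note that pass 2 consults `Homes`, which is only complete after pass 1 has seen every CU; this is why the scheme keeps all CUs (or at least revisits them) across two passes.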

The scheme implemented in this patch, which might be extended with an additional compile unit keeping the types, visits each CU only once, after which it can be unloaded from memory:

1. The first pass enumerates, in a multithreaded manner, all object files and all compile units. Each CU is loaded, analyzed for types (a list of attributes referencing types is created), and emitted; its types are moved into the artificial CU, and the CU is then unloaded.

2. The artificial CU is emitted. (After the first pass is finished, all CUs except the artificial one are unloaded from memory.)

3. Sizes/offsets/references are patched after the individual CU bodies are glued into the resulting file. (At this stage, type references from the already emitted CU bodies are patched to the proper offsets inside the artificial CU.)

This scheme does not need two passes over the CUs, and it does not need to load all CUs into memory at the same time.
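The single-pass scheme can be sketched in the same simplified model. Again, `CU`, `ArtificialCU`, and `processCU` are hypothetical names for illustration; in the sketch, "interning" a type into the artificial CU stands in for moving its DIE there, and the returned indices stand in for the placeholder references that get patched to real offsets in step 3:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model: a CU is just the set of type names it
// defines.
struct CU { std::vector<std::string> Types; };

// The artificial CU accumulates deduplicated types from every input CU.
struct ArtificialCU {
  std::vector<std::string> Types;      // merged, deduplicated types.
  std::map<std::string, size_t> Index; // type name -> slot in Types.

  // Add the type if it is new; either way, return its stable slot, which
  // serves as a placeholder reference until real offsets are known.
  size_t intern(const std::string &T) {
    auto [It, IsNew] = Index.try_emplace(T, Types.size());
    if (IsNew)
      Types.push_back(T);
    return It->second;
  }
};

// Process one CU, after which it can be unloaded: its emitted body only
// keeps placeholder references into the artificial CU, not full type DIEs.
std::vector<size_t> processCU(const CU &C, ArtificialCU &TypesCU) {
  std::vector<size_t> Refs;
  for (const std::string &T : C.Types)
    Refs.push_back(TypesCU.intern(T));
  return Refs; // patched to real offsets once the artificial CU is emitted.
}
```

Because each CU only needs the artificial CU's (growing) index, no completed map over all CUs is required up front, which is what lets each CU be processed once and dropped.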

> (as for type merging - that might also be possible with the scheme I'm proposing - if we're rewriting DIEs anyway, seems plausible we could add new child DIEs to canonical type type DIEs)

That would be one more reason to keep CUs in memory: when we rewrite a canonical DIE in some CU, we would need to gather its children from all other CUs.

The artificial CU (from the scheme with a separate type-keeping CU) would already contain the merged types, so it does not require keeping the source CUs in memory.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96035/new/

https://reviews.llvm.org/D96035


