[PATCH] D96035: [dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.

Tue Feb 1 09:26:44 PST 2022

dblaikie added a comment.

In D96035#3287393 <https://reviews.llvm.org/D96035#3287393>, @avl wrote:

> In D96035#3285620 <https://reviews.llvm.org/D96035#3285620>, @clayborg wrote:
>
>> I do believe that splitting types up into a compile unit that matches the DW_AT_decl_file would make this patch really hard to resist as it then makes the DWARF the best it can be. The nice thing is that if this is done it makes it very easy to tell where a type should be defined. So if the type's DW_AT_decl_file matches the current CU or if this is an anonymous namespace, then the type stays where it is. If it doesn't match, then it gets moved to a new compile unit. I don't know exactly how complex this would be, but it seems like it shouldn't be too hard. The huge type unit has the ability to greatly impact debugger performance as the code stands now because as soon as the debugger needs any type, it will have to parse all of the DIEs in the type compile unit. LLDB parses DWARF lazily and only pulls in what it needs, but with these binaries we would need to parse some 60MB of type DIEs as soon as anyone needs a type.
>
> There are some disadvantages with creating additional compilation units for each source compile unit:
>
> 1. Fragmentation and duplication. It would be necessary to duplicate: unit header, unit die, namespace dies, base types, line table headers, line table files, abbreviation table. clang has approx 1600 compilation units. So we need to duplicate all the above information for each of them. In the end, we might lose some of the DWARF size savings.

Oh, yeah, too many units sounds unfortunate for sure. Currently we'd have one CU per .cpp file, and with "putting types in a synthetic unit that matches their `decl_file`" we'd have one additional CU per .h file, which, admittedly, is probably a bit more than doubling the number of units. (Template instantiations, for instance, might all be grouped together in the same unit, since they're defined in the same header.)

I don't know that this makes the DWARF 'the best it can be' - it's not clear to me what expressive power is provided by the types being in units that match source file names (indeed the information is provided in the DWARF either way - a consumer can treat types defined in a certain file in a certain way even if they're all in one big CU).

> 2. Closely coupled references. If all types were placed in separate compilation units matching the original unit of declaration, then types would reference each other. As a result, it would be hard to process such units in a parallel manner (independently). This limits the acceleration that can be achieved by parallelization. This patch tries to avoid cross-CU references. Only one kind is allowed: non-type-CU -> type-CU.

Yeah.

> What about the following solution: the current type table unit (let's say 60MB) would be divided into several buckets (let's say 16) of independent types. Each bucket is placed in a separate artificial compilation unit. That way there would be no references between units and not a lot of duplicated information. The size of each separate type unit would be around 4MB (which would help lldb avoid parsing too much). Can this be a good solution? It looks like it keeps the benefits (small final size of the overall DWARF file, simple references, small size of each compile unit). It would also probably help to speed up multi-thread execution of DWARFLinker (if all type units were generated in parallel), but I am afraid it would slow down single-thread execution.

Finding independent buckets of types sounds difficult/algorithmically complicated. But maybe that's feasible? I'm not sure. I was thinking more along the lines of "emit all the types the same way you do currently, except into multiple unit 'chunks'". That is, the code already handles type-to-type references within the single type-CU, so I don't understand (maybe I'm missing something) why it would be difficult to treat that "type-CU" as actually being "multiple type CUs" with arbitrary cross-referencing within that collection of type CUs. The chunks/buckets would then be chosen arbitrarily. Admittedly that means a longer encoding (sec_offset references are longer than the unit-relative references) than if you can group the types together into isolated/only-self-referencing groups, so maybe the extra space savings are worth the work to create those isolated groups? Naively, I would not have expected it to be worth that much.
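
To make the "arbitrary chunks" idea concrete, here is a rough sketch and not code from the patch. `TypeEntry`, `assignChunks`, `countRefs` and the byte budget are all hypothetical names invented for illustration; nothing here is DWARFLinker API. It walks the types in their existing emission order, starts a new chunk whenever a size budget would be exceeded, and then counts how many type-to-type references stay inside a chunk (and could keep a unit-relative form) versus how many cross chunks (and would need a section-relative form such as DW_FORM_ref_addr).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one type's DIE subtree in the type table.
struct TypeEntry {
  std::uint64_t Size;            // encoded size in bytes (illustrative only)
  std::vector<std::size_t> Refs; // indices of the types this type references
};

// Assign each type to a chunk, keeping the existing emission order.
// A new chunk is started when adding the next type would exceed Budget.
// A single type larger than Budget still gets a chunk of its own.
std::vector<std::size_t> assignChunks(const std::vector<TypeEntry> &Types,
                                      std::uint64_t Budget) {
  assert(Budget > 0 && "chunk budget must be positive");
  std::vector<std::size_t> ChunkOf(Types.size());
  std::size_t Chunk = 0;
  std::uint64_t Used = 0;
  for (std::size_t I = 0; I < Types.size(); ++I) {
    if (Used > 0 && Used + Types[I].Size > Budget) {
      ++Chunk;
      Used = 0;
    }
    ChunkOf[I] = Chunk;
    Used += Types[I].Size;
  }
  return ChunkOf;
}

struct RefCounts {
  std::size_t Local = 0;     // source and target are in the same chunk
  std::size_t CrossUnit = 0; // source and target are in different chunks
};

// Classify every type-to-type reference by whether it crosses a chunk
// boundary under the given assignment.
RefCounts countRefs(const std::vector<TypeEntry> &Types,
                    const std::vector<std::size_t> &ChunkOf) {
  assert(ChunkOf.size() == Types.size() && "one chunk index per type");
  RefCounts Counts;
  for (std::size_t I = 0; I < Types.size(); ++I) {
    for (std::size_t Target : Types[I].Refs) {
      assert(Target < Types.size() && "reference to unknown type");
      if (ChunkOf[I] == ChunkOf[Target])
        ++Counts.Local;
      else
        ++Counts.CrossUnit;
    }
  }
  return Counts;
}
```

Running `countRefs` on a real type table with a few different budgets would give a rough idea of how many references end up crossing units, which is the number that matters when deciding whether building isolated groups is worth the extra work.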

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D96035