[PATCH] D96035: [WIP][dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.

Fri Aug 6 15:56:34 PDT 2021

dblaikie added a comment.

In D96035#2932140 <https://reviews.llvm.org/D96035#2932140>, @avl wrote:

>> (I'm not especially invested in DWARFLinker, since it's mostly for MachO dsymutil linking (though generalizing it to work for dwp could be interesting, might be an alternative solution (rather than going to more like gold-dwp) to addressing some overhead/scalability issues in llvm-dwp))
>
>> The type merging approach sounds OK to me, broadly speaking.
>> The overhead at low parallelism seems unfortunate - can the algorithm be made more amenable to single threaded performance as well?
>
> Yes, it can. I think performance for the 1-thread case could be improved to at least 1.5x of upstream dsymutil, maybe better.
>
>> (is there some design tradeoff that could happen at low parallelism that'd keep mostly the same approach? Or if we want low-parallelism performance are we going to end up maintaining the two different approaches entirely?)
>
> Type merging plus deterministic output requires doing more work than the current dsymutil solution. That is why the 1-2 thread version runs slower. But there is a set of things that might improve performance.

Fair enough - maybe there's room for the overhead to be low enough that it's worth the extra benefits of smaller output, such that it's not worth keeping the old implementation (i.e., its time/space tradeoff isn't interesting enough to be worth keeping). That'll be up to the Apple folks, pretty much.

> If deterministic output is the requirement then we would not be able to support two different approaches since they would be binary incompatible.

So long as the different behaviors aren't chosen organically (based on number of cores available, etc), but instead by a flag - that would be OK. The goal is to be deterministic based on inputs/command line - but different command lines can produce different behaviors.
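
As a rough sketch of that distinction (all names here are hypothetical, not dsymutil's actual options or API): the choice of algorithm depends only on the command line, while the host's core count may influence only how many worker threads run.

```cpp
// Hypothetical option model; not the real dsymutil/DWARFLinker interface.
enum class LinkerKind { Classic, Parallel };

struct LinkOptions {
  LinkerKind Kind = LinkerKind::Classic; // set only by an explicit flag
  unsigned NumThreads = 0;               // 0 means "use all host cores"
};

// The algorithm (and therefore the output bytes) is a function of the
// command line alone. The host core count is deliberately not an input.
LinkerKind chooseLinker(const LinkOptions &Opts) { return Opts.Kind; }

// The host may influence the number of workers, which should affect
// speed only, never which algorithm runs.
unsigned chooseThreadCount(const LinkOptions &Opts, unsigned HostCores) {
  if (Opts.NumThreads != 0)
    return Opts.NumThreads;
  return HostCores != 0 ? HostCores : 1;
}
```

With this split, the same inputs and the same command line select the same algorithm on a 2-core laptop and a 64-core build machine.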

>> What's the explanation for the difference/improvement in total output size with type uniquing enabled? I guess existing dsymutil is only structural, not odr-based uniquing or something?
>
> The upstream dsymutil solution cannot deduplicate partially defined types. Class declarations for the same type might omit different member functions in different compile units. Upstream dsymutil does not change them and leaves them in place. The type merging solution is able to remove all such copies and keep only one class declaration containing all member functions. That saves space.

Yeah, figured something like that - thanks for explaining!
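
To make the partial-declaration case above concrete, here is a toy model (hypothetical types, not DWARFLinker's data structures, and assuming a type is identified by its name): each compile unit carries its own copy of a class declaration with a subset of the member functions, and merging keeps one declaration per name holding the union.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a class declaration DIE as seen in one compile unit.
struct ClassDecl {
  std::string Name;              // assumed to identify the type across CUs
  std::set<std::string> Methods; // member functions declared in this CU
};

// Merge declarations across compile units: one entry per type name,
// containing every member function seen in any CU. std::map and std::set
// iterate in sorted order, so the result does not depend on CU order.
std::map<std::string, std::set<std::string>>
mergeTypes(const std::vector<std::vector<ClassDecl>> &CompileUnits) {
  std::map<std::string, std::set<std::string>> Merged;
  for (const auto &CU : CompileUnits)
    for (const ClassDecl &D : CU)
      Merged[D.Name].insert(D.Methods.begin(), D.Methods.end());
  return Merged;
}
```

A purely structural comparison would treat `Foo {a, b}` and `Foo {b, c}` as different and keep both copies; merging by name yields a single `Foo {a, b, c}`.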

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96035/new/

https://reviews.llvm.org/D96035