[PATCH] D96035: [WIP][dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.

Greg Clayton via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 7 14:44:31 PDT 2021


clayborg added a comment.

In D96035#2985299 <https://reviews.llvm.org/D96035#2985299>, @avl wrote:

> @JDevlieghere @aprantl  @clayborg @friss @echristo
>
> Folks, What is your opinion on this patch? Would it be useful to integrate?

It is very compelling! I do think it would be useful to integrate if there are perf and size wins as you are finding.

> The patch implements two features:
>
> 1. Type deduplication by merging. It merges all partial type definitions/declarations into a single description to avoid type duplication. This approach produces a .debug_info table that is 50% smaller.

Very nice! I will apply this patch and do some testing. This can help eliminate issues we have in LLDB when there are many copies of a type and any compiler-generated methods (constructors, copy constructors, destructors, etc.) are only defined in some of the copies. Were you able to find all such methods and merge them in? I am guessing you didn't end up using the DWARFGenerator for this?
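To make the question concrete, here is a minimal sketch of what merging partial copies of a type could look like. It is not taken from the patch: the function name `merge_type_copies`, the use of a qualified type name as the merge key, and the representation of members as plain strings are all assumptions made for illustration.

```python
def merge_type_copies(copies):
    """Merge partial copies of a type into one description per type name.

    `copies` is an iterable of (qualified_name, members) pairs, where
    `members` lists the member/method names present in that copy.
    Hypothetical model only; real DWARF linking works on DIEs, not strings.
    """
    merged = {}
    for name, members in copies:
        # One shared member list per qualified type name.
        slot = merged.setdefault(name, [])
        for member in members:
            # Keep first-seen order and skip members already merged in.
            if member not in slot:
                slot.append(member)
    return merged


# Two copies of "Foo", each carrying a different compiler-generated method.
copies = [
    ("Foo", ["x", "Foo()"]),
    ("Foo", ["x", "~Foo()"]),
    ("Bar", ["y"]),
]
print(merge_type_copies(copies))
```

In this model the merged "Foo" carries both the constructor and the destructor, which is the property LLDB would benefit from.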

> 2. Parallel execution. It speeds up execution if the hardware has available threads.

We have this in normal dsymutil right now, right? Have you just made it faster, or made it work with the new approach you have in this patch?
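As a rough illustration of per-compile-unit parallelism (not the patch's actual design), the sketch below hands each unit to a thread pool and collects the results in input order. The names `process_unit` and `process_units_parallel` and the dictionary layout of a unit are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_unit(unit):
    # Hypothetical stand-in for the per-unit work (cloning DIEs, etc.).
    return unit["name"], len(unit["dies"])


def process_units_parallel(units, max_workers=4):
    """Process compile units independently on a thread pool.

    Executor.map returns results in the same order as the inputs,
    so the output order stays deterministic regardless of scheduling.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_unit, units))


units = [
    {"name": "a.o", "dies": [1, 2]},
    {"name": "b.o", "dies": []},
]
print(process_units_parallel(units))
```

The sketch only works this simply because the units share no mutable state; anything shared, such as a string table or a merged type table, would need thread-safe access.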

> Current size/performance results are (compared with LLVM upstream dsymutil):
>
> - the .debug_info table is 40% smaller.

That is great! Anything we can do to reduce the dups in DWARF even more than what we already had in dsymutil is great. I know there were a bunch of things that could cause types to be emitted multiple times in the current dsymutil that were not allowing us to reduce the DWARF size as much as we would have liked to.

> - single-threaded mode is 1.7x slower.
> - multi-threaded mode is up to 2x faster.

Nice.

> There are a number of things that might improve the result:
>
> 1. Deduplicate types defined in DW_TAG_subprogram DIEs with a DW_AT_specification attribute.
>
> 2. (Probably) deduplicate abstract inline instances.
>
> 3. Use another memory pool for DIEs. Instead of BumpPtrAllocator, use a pre-allocated pool. It would reduce data fragmentation.
>
> 4. The current version of the patch uses two different memory pools (because of interface dependencies) with duplicated data: MTSafeStringTable and NonRelocatableStringpool. Using only one kind of pool would help reduce memory usage and improve the performance numbers.
>
> 5. Improve DIE navigation - https://reviews.llvm.org/D102634#inline-971772.
>
> 6. Generate index tables in parallel.
>
> 7. Avoid gluing generated tables into a single file. Use the tables generated for compilation units as the output. That would avoid data copying. Consumers could write the compilation unit tables directly, so we would not need to create an intermediate output file.
>
> So, I think (after the above is done) we might get single-threaded performance 1.3-1.5x slower than the current dsymutil, and multi-threaded performance up to 3x faster.

You say 3x here and 2x above. If it is faster, then all is good. By default dsymutil runs with multiple threads, so the single-threaded performance doesn't matter to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96035/new/

https://reviews.llvm.org/D96035


