[PATCH] D96035: [WIP][dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 8 00:01:47 PDT 2021


avl added a comment.

>>   The patch implements two features:
>>   
>>       Type deduplication by merging. It merges all partial type definitions/declarations into a single description to avoid type duplication. This approach allows generating a .debug_info table that is 50% smaller.
>
> Very nice! I will apply this patch and do some testing. This can help eliminate issues we have in LLDB when you have many copies of a type, and any compiler generated methods (constructors, copy constructors, destructors, etc.) are only defined in some of the copies of the type. Were you able to find all such methods and merge them in?

Yep, all such methods should be merged into the single type description, as long as the method was properly identified: it should have either an explicit name (DW_AT_name, DW_AT_linkage_name) or implicit unique identifiers (DW_AT_decl_file, DW_AT_decl_line).
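
To illustrate the identification rule above, here is a minimal sketch. It is not code from the patch: `MethodDesc`, `methodKey`, and `mergeMethods` are hypothetical names, and the lookup order (linkage name, then name, then declaration coordinates) is an assumption made for the example.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a method DIE; not an LLVM type.
struct MethodDesc {
  std::optional<std::string> Name;        // DW_AT_name
  std::optional<std::string> LinkageName; // DW_AT_linkage_name
  std::optional<uint64_t> DeclFile;       // DW_AT_decl_file
  std::optional<uint64_t> DeclLine;       // DW_AT_decl_line
};

// Returns a key identifying the method, or std::nullopt if the method has
// neither an explicit name nor implicit identifiers.
// Assumption: the order of the checks is chosen for illustration only.
std::optional<std::string> methodKey(const MethodDesc &M) {
  if (M.LinkageName)
    return "L:" + *M.LinkageName;
  if (M.Name)
    return "N:" + *M.Name;
  // DW_AT_decl_file is an index into a per-unit file table, so a real
  // implementation would have to resolve it to a file before comparing
  // across units. This sketch uses the raw number.
  if (M.DeclFile && M.DeclLine)
    return "D:" + std::to_string(*M.DeclFile) + ":" +
           std::to_string(*M.DeclLine);
  return std::nullopt;
}

// Merges the method lists of several copies of one type into a single
// description. The first occurrence of each key is kept; methods that
// cannot be identified are skipped.
std::map<std::string, MethodDesc>
mergeMethods(const std::vector<std::vector<MethodDesc>> &Copies) {
  std::map<std::string, MethodDesc> Merged;
  for (const std::vector<MethodDesc> &Copy : Copies)
    for (const MethodDesc &M : Copy)
      if (std::optional<std::string> Key = methodKey(M))
        Merged.emplace(*Key, M);
  return Merged;
}
```

With this scheme, a constructor that appears in only one copy of the type still ends up in the merged description, while a method without any identifying attribute is left out.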

> I am guessing you didn't end up using the DWARFGenerator for this?

I did not use DWARFGenerator because it depends heavily on AsmPrinter and the MC/* classes. It might be reused, though, if the "DIE tree building" part were separated from the "writing output" part.
The implementation of the "DIE tree building" part in this patch uses the same idea as DWARFGenerator.

Another thing is that DWARFGenerator is not ready to work in multi-threaded mode (i.e., to build the DIE tree in parallel).

DWARFGenerator might be refactored to resolve the above problems. In that case, this patch would be able to use DWARFGenerator.

>>   Parallel execution. It speeds up execution if the hardware has available threads.
>
> We have this in normal dsymutil right now right? You have just made it faster? Or made it work with the new approach you have in this patch?

This patch uses the new approach. Current upstream dsymutil can utilize only two threads. It processes compile units sequentially, but it can analyze one compile unit while cloning the previous compile unit in parallel.

This patch can utilize all available threads: it processes each compile unit separately and in parallel.
Thus, it might perform better when more threads are available, because more threads are used.
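
The "each compile unit separately and in parallel" scheme can be sketched roughly as follows. This is only an illustration with plain std::thread, not the patch's code: `CompileUnit`, `processUnit`, and `processAllUnits` are hypothetical names, and the per-unit work is a placeholder.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a compile unit; not an LLVM type.
struct CompileUnit {
  bool Processed = false;
};

// Placeholder for the per-unit work (analysis and cloning).
void processUnit(CompileUnit &CU) { CU.Processed = true; }

// Processes every unit independently, using up to NumThreads workers.
void processAllUnits(std::vector<CompileUnit> &Units, unsigned NumThreads) {
  NumThreads = std::max(1u, NumThreads);
  std::atomic<std::size_t> Next{0};

  // Each worker repeatedly claims the next unclaimed unit until none are
  // left. Every unit is touched by exactly one thread.
  auto Worker = [&Units, &Next] {
    for (std::size_t I = Next.fetch_add(1); I < Units.size();
         I = Next.fetch_add(1))
      processUnit(Units[I]);
  };

  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < NumThreads; ++T)
    Workers.emplace_back(Worker);
  for (std::thread &W : Workers)
    W.join();
}
```

In contrast to the two-thread analyze/clone pipeline, nothing here orders one unit after another, so the number of useful workers is limited only by the number of units and hardware threads (e.g. `std::thread::hardware_concurrency()`).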

> You say 3x here and 2x above. If it is faster, then all is good. By default dsymutil runs with multiple threads, so the single threaded performance doesn't matter to me.

2x is the performance result for this patch with 16 threads.
3x is my estimate for this patch plus the set of improvements (mentioned above) with 16 threads.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96035/new/

https://reviews.llvm.org/D96035


