[PATCH] D153268: [DWARFLinkerParallel] Add limited functionality to DWARFLinkerParallel.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 21 07:08:55 PDT 2023


avl added a comment.

In D153268#4436350 <https://reviews.llvm.org/D153268#4436350>, @clayborg wrote:

> In D153268#4435984 <https://reviews.llvm.org/D153268#4435984>, @avl wrote:
>
>> In D153268#4435883 <https://reviews.llvm.org/D153268#4435883>, @clayborg wrote:
>>
>>>> a) No common abbreviation table. Each compile unit has its own abbreviation table. Generating a common abbreviation table slows down parallel execution (it is a resource that is accessed many times from many threads). The abbreviation table does not take a lot of space, so it looks cheap to have separate abbreviation tables. Later, it might be optimized a bit (by removing equal abbreviation tables).
>>>
>>> Can't we just let each compile unit generate its own abbrev table and then unique them into a single abbrev table when we write out the .debug_info data? That way compile units can stay fast in parallel execution, and since we need to emit the compile units serially when generating the .debug_info, we can just have a map that maps CU-specific abbrev codes to the global and final abbrev list? Then we end up with just one .debug_abbrev table in the end.
>>
>> The problem with this solution seems to be that when a local abbreviation number is replaced with a global number (both are in ULEB128 format), the DIE offsets will change (because the size of the global number may differ from the size of the local number). It will break all sizes and references.
>
> We make up DIEs to emit, but _as_ we emit the DIEs from a CU, can't we globalize the abbrev from the local CU as they are being emitted?

We can, but after the abbrev numbers are globalized we need to update all size and reference fields and rewrite the emitted DWARF.
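
To make the size issue concrete, here is a minimal sketch (not code from this patch; `ulebSize` and `offsetShift` are hypothetical helper names) of how the encoded size of an abbreviation code depends on its value. A local code that fits in one byte can become two or more bytes once it is mapped to a larger global code, and every following DIE in the unit then shifts by the difference:

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes needed to encode Value as ULEB128
// (each byte carries 7 payload bits).
std::size_t ulebSize(std::uint64_t Value) {
  std::size_t Size = 1;
  while (Value >= 0x80) {
    Value >>= 7;
    ++Size;
  }
  return Size;
}

// How far all following data would move if one DIE's abbreviation
// code changed from LocalCode to GlobalCode.
std::ptrdiff_t offsetShift(std::uint64_t LocalCode, std::uint64_t GlobalCode) {
  return static_cast<std::ptrdiff_t>(ulebSize(GlobalCode)) -
         static_cast<std::ptrdiff_t>(ulebSize(LocalCode));
}
```

For example, `offsetShift(5, 300)` is 1: the DIE grows by one byte, so the offsets of all later DIEs, the unit length, and any reference pointing past that DIE would need to be recomputed.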

> Each abbrev could cache its global value if it has already been emitted from each CU. I am thinking that DIEs would have references to other DIEs using pointers when they are stored in intermediate form, but this might not be true, come to think of it. If references would need to be updated then this wouldn't work, as you mentioned.

This patch uses the CodeGen DIE. It does not use pointers for references/sizes/offsets. These are absolute values:

  DIE::setOffset()
  DIE::setSize()
  DIE::addValue(DIEAlloc, dwarf::Attribute(AttrSpec.Attr), dwarf::DW_FORM_ref_addr, DIEInteger(0x0))

>> Also, currently a CU is emitted immediately (when we write out the .debug_info data). The global abbreviation table can only be completed at the very end, after all CUs are emitted. That means we need to parse the already emitted CUs and update the abbreviation numbers. It would probably slow down execution.
>
> Yeah, if DIE refs are done with hard-coded offsets up front then this wouldn't work. A DWARF generator I created in Python has references pointing to a DIE object so that it can ask that object for its offset when a reference is needed, and if it isn't available yet, it gets added to a fixup list for forward refs, and then they get fixed up in a post-CU output phase. This is nice because I never have to worry about reference offsets as they "just work". I admit I am not as familiar with how the LLVM DWARFLinker does things.

Yes, that is the case: DIEs are emitted before the global abbreviations are known. This patch does similar fixup handling for references (writing the proper values later, when they are known). But that fixup occurs for fixed-size fields. For patching variable-size fields (like LEB128) it is necessary to rewrite the already generated DWARF (as we need to shift data because of the changed field lengths).
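
The difference between the two kinds of patching can be sketched as follows. This is an illustration only, not the patch's implementation; `Buffer`, `patchFixed32`, `encodeULEB` and `patchULEB` are hypothetical names, and a 4-byte little-endian field is assumed for the fixed-size case:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Fixed-size fixup: overwrite a 4-byte little-endian placeholder in place.
// The buffer length and all later offsets stay the same.
void patchFixed32(Buffer &Out, std::size_t Pos, std::uint32_t Value) {
  assert(Pos + 4 <= Out.size() && "placeholder out of range");
  for (std::size_t I = 0; I < 4; ++I)
    Out[Pos + I] = static_cast<std::uint8_t>(Value >> (8 * I));
}

// Encode Value as minimal-length ULEB128.
Buffer encodeULEB(std::uint64_t Value) {
  Buffer Bytes;
  do {
    std::uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Bytes.push_back(Byte);
  } while (Value != 0);
  return Bytes;
}

// Variable-size fixup: replace a ULEB128 field that occupied OldSize bytes.
// Returns how far the data after the field moved (0 if the size is unchanged).
std::ptrdiff_t patchULEB(Buffer &Out, std::size_t Pos, std::size_t OldSize,
                         std::uint64_t Value) {
  assert(Pos + OldSize <= Out.size() && "field out of range");
  Buffer New = encodeULEB(Value);
  auto First = Out.begin() + static_cast<std::ptrdiff_t>(Pos);
  First = Out.erase(First, First + static_cast<std::ptrdiff_t>(OldSize));
  Out.insert(First, New.begin(), New.end());
  return static_cast<std::ptrdiff_t>(New.size()) -
         static_cast<std::ptrdiff_t>(OldSize);
}
```

`patchFixed32` never moves data, so it can be applied at any time after emission. `patchULEB` may return a non-zero shift, and in that case every offset, size and reference that covers data after the patched field becomes stale.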

>> For a clang binary with 1.7G of .debug_info, the size of the abbreviation table is 12.5K for DWARFLinker and 7.2M for DWARFLinkerParallel, i.e. the new abbreviation table is approx 0.5% of the overall .debug_info size. It looks relatively cheap. Though, any solution making .debug_abbrev shorter would certainly be good. Probably we might experiment with fixed-length abbreviation numbers...
>
> No real worry if the size isn't too big.
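
As a rough illustration of the fixed-length abbreviation number idea mentioned above (an assumption about how such an experiment could look, not something this patch does): a ULEB128 value can be padded with continuation bytes to a chosen width, so that replacing a local code with a global one never changes the DIE size. `encodePaddedULEB` and `decodeULEB` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode Value as ULEB128 padded to exactly Width bytes. All bytes except
// the last carry the continuation bit, so the result is still a valid
// (non-minimal) ULEB128 encoding of Value.
std::vector<std::uint8_t> encodePaddedULEB(std::uint64_t Value,
                                           std::size_t Width) {
  std::vector<std::uint8_t> Bytes;
  for (std::size_t I = 0; I < Width; ++I) {
    std::uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (I + 1 < Width)
      Byte |= 0x80; // continuation bit on every byte but the last
    Bytes.push_back(Byte);
  }
  assert(Value == 0 && "Value does not fit in Width bytes");
  return Bytes;
}

// Plain ULEB128 decoder, used here only to check the padded form round-trips.
std::uint64_t decodeULEB(const std::vector<std::uint8_t> &Bytes) {
  std::uint64_t Result = 0;
  unsigned Shift = 0;
  for (std::uint8_t Byte : Bytes) {
    Result |= static_cast<std::uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break;
  }
  return Result;
}
```

With a width of 3, both code 5 and code 300 occupy three bytes, so a later in-place overwrite would not move any data. The trade-off is that every DIE pays the full width for its abbreviation code, which grows .debug_info; whether that is acceptable would need measuring.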




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153268/new/

https://reviews.llvm.org/D153268


