[llvm-dev] [Proposal][Debuginfo] dsymutil-like tool for ELF.

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 23 09:43:08 PDT 2020


> Ah, yeah - that seems like a missed opportunity - duplicating the whole
> type DIE. LTO does this by making monolithic types - merging all the
> members from different definitions of the same type into one, but that's
> maybe too expensive for dsymutil (might still be interesting to know how
> much more expensive, etc). But I think the other way to go would be to
> produce a declaration of the type, with the relevant members - and let the
> DWARF consumer identify this declaration as matching up with the earlier
> definition. That's the sort of DWARF you get from the non-MachO default
> -fno-standalone-debug anyway, so it's already pretty well tested/supported
> (support in lldb's a bit younger/more work-in-progress, admittedly). I
> wonder how much dsym size there is that could be reduced by such an
> implementation.
>
> I see. Yes, that could be done, and I think it would result in a noticeable
> size reduction (I do not know exact numbers at the moment).
>
> I am working on a multi-threaded DWARFLinker now, and its first version will
> do exactly the same type processing as the current dsymutil.
>
Yeah, best to keep the behavior the same through that.

> The above scheme could be implemented as a next step, and it would result
> in a better size reduction (better than the current state).
>
> But I think a better scheme could also be implemented, one that would
> result in an even bigger size reduction and in faster execution. This scheme
> is similar to what you've described above: "LTO does - making
> monolithic types - merging all the members from different definitions of
> the same type into one".
>
I believe the reason that's probably not been done is that it can't be
streamed - it'd lead to buffering more of the output (if two of these
expandable types were in one CU - the start of the second type couldn't be
known until the end because it might keep getting pushed later due to
expansion of the first type) and/or having to revisit all the type
references (the offset to the second type wouldn't be known until the end -
so writing the offsets to refer to the type would have to be deferred until
then).
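To make the streaming concern concrete, here is a minimal sketch, assuming types are laid out back to back in the output. The type names and byte sizes are made up for illustration and do not come from any real DWARF.

```python
def layout(type_sizes):
    """Return the start offset of each type when types are emitted back to back."""
    offsets = {}
    pos = 0
    for name, size in type_sizes.items():
        offsets[name] = pos
        pos += size
    return offsets


# Hypothetical sizes (in bytes) known after processing the first CU.
sizes = {"A": 16, "B": 24}
early = layout(sizes)
ref_to_b = early["B"]  # a streaming writer would have to emit this offset now

# A later definition contributes 8 more bytes of members to type A.
sizes["A"] += 8
final = layout(sizes)

# The offset written earlier no longer points at the start of B.
stale = ref_to_b != final["B"]
```

Once "A" grows, every offset that was already written for "B" is wrong, so the writer either buffers the types until the end or goes back and fixes the references.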

> DWARFLinker could create an additional artificial compile unit and put all
> merged types there, then later patch all type references to point into this
> additional compilation unit. No bits would be duplicated in that case.
> The performance improvement would come from copying less DWARF and from
> the fact that type references could be updated while the DWARF is cloned
> (no additional pass needed for that).
>
"later patch all type references to point into this additional compilation
unit" - that's the additional pass that people are probably
talking/concerned about. Rewalking all the DWARF. The current dsymutil
approach, as far as I know, is single pass - it knows the final, absolute
offset to the type from the moment it emits that type/needs to refer to it.
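As a rough illustration of the deferred approach (not how dsymutil or DWARFLinker is actually written), the sketch below emits a placeholder for each type reference, records a fixup, and patches the placeholders once the final type offsets are known. The `Output` class, the 4-byte little-endian reference encoding, and the offsets are assumptions made for the example.

```python
import struct


class Output:
    """Toy output buffer that defers type references (hypothetical helper)."""

    def __init__(self):
        self.buf = bytearray()
        self.fixups = []  # (position in buf, referenced type name)

    def emit_bytes(self, data):
        self.buf += data

    def emit_type_ref(self, type_name):
        # The final offset is unknown here: remember where the reference
        # lives and write a 4-byte placeholder instead.
        self.fixups.append((len(self.buf), type_name))
        self.buf += b"\x00\x00\x00\x00"

    def patch(self, type_offsets):
        # The extra step: revisit every recorded reference and fill it in.
        for pos, name in self.fixups:
            struct.pack_into("<I", self.buf, pos, type_offsets[name])


out = Output()
out.emit_bytes(b"\x01")
out.emit_type_ref("x")
out.emit_bytes(b"\x02")
out.emit_type_ref("int")

# Offsets become known only after the merged types have been laid out.
out.patch({"x": 0x2A, "int": 0x5E})
```

A fixup list avoids rewalking the DWARF itself, but the output still has to stay writable (or buffered) until the patch step runs, which is the cost being weighed against the current single-pass behavior.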

> Anyway, that might be the next step once the multi-threaded DWARFLinker is
> ready.
>
Yep, be interesting to see how it all goes!

>
>
>>
>> Do you suggest that 0x0000011b should be transformed into something like
>> that:
>>
>> 0x000000fc: DW_TAG_compile_unit
>>               DW_AT_language    (DW_LANG_C_plus_plus)
>>               DW_AT_name        ("templ.cpp")
>>               DW_AT_stmt_list   (0x00000090)
>>               DW_AT_low_pc      (0x0000000100000fa0)
>>               DW_AT_high_pc     (0x0000000100000fab)
>>
>> 0x0000011b:   DW_TAG_structure_type
>>                 DW_AT_specification (0x0000002a "x")
>>
>> 0x00000124:     DW_TAG_subprogram
>>                   DW_AT_linkage_name    ("_ZN1x2f3IiEEiv")
>>                   DW_AT_name    ("f3<int>")
>>                   DW_AT_type    (0x000000000000005e "int")
>>                   DW_AT_declaration     (true)
>>                   DW_AT_external        (true)
>>                   DW_AT_APPLE_optimized (true)
>> 0x00000138:       NULL
>> 0x00000139:     NULL
>>
>> 0x00000140:   DW_TAG_subprogram
>>                 DW_AT_low_pc    (0x0000000100000fa0)
>>                 DW_AT_high_pc   (0x0000000100000fab)
>>                 DW_AT_specification     (0x0000000000000124 "_ZN1x2f3IiEEiv")
>> 0x00000155:     NULL
>>
>> Did I correctly get the idea?
>>
>
> Yep, more or less. It'd be "safer" if 11b didn't use DW_AT_specification
> to refer to 2a, but instead was only a completely independent declaration
> of "x" - that path is already well supported/tested. It's the
> work-in-progress stuff for lldb to support -fno-standalone-debug, but gdb's
> been consuming DWARF like this for years, and Clang and GCC have both
> produced DWARF like this for years: if the type is "homed" in another file,
> Clang/GCC emit a declaration with just the members needed to define any
> member functions defined/inlined/referenced in this CU.
>
> But using DW_AT_specification, or maybe some other extension attribute,
> might make the consumer's task a bit easier in finding that they're all the
> same type/pieces of the same type (could do both - use an extension
> attribute to tie them up, leave DW_AT_declaration/DW_AT_name here for
> consumers that don't understand the extension attribute).
>
>
> Yes, I will try this solution.
>
> Thank you, Alexey.
>
>
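The declaration-matching idea discussed above can be sketched as a toy model of what a DWARF consumer might do: index full definitions by name, then fold the members of each matching declaration into one view of the type. This is only an illustration, not how lldb or gdb is implemented. The `TypeDIE` class and the members "f1" and "f2" are hypothetical; "x" and "f3<int>" are taken from the dump quoted in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class TypeDIE:
    """Simplified stand-in for a DW_TAG_structure_type DIE (hypothetical)."""
    name: str
    is_declaration: bool
    members: list = field(default_factory=list)


def resolve(dies):
    """Merge the members of each declaration into its same-named definition."""
    definitions = {d.name: d for d in dies if not d.is_declaration}
    resolved = {}
    for d in dies:
        if d.is_declaration and d.name in definitions:
            full = definitions[d.name]
            extra = [m for m in d.members if m not in full.members]
            resolved[d.name] = full.members + extra
    return resolved


dies = [
    TypeDIE("x", False, ["f1", "f2"]),  # definition from an earlier CU
    TypeDIE("x", True, ["f3<int>"]),    # declaration carrying one extra member
]
view = resolve(dies)
```

Matching purely by name is the fallback that works for consumers unaware of any extension attribute; an explicit link between the DIEs would let a consumer skip the lookup.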