[llvm-dev] [Proposal][Debuginfo] dsymutil-like tool for ELF.

Alexey Lapshin via llvm-dev llvm-dev at lists.llvm.org
Sun Oct 25 09:31:23 PDT 2020


On 23.10.2020 19:43, David Blaikie wrote:
>
>>
>>
>>
>>     Ah, yeah - that seems like a missed opportunity - duplicating the
>>     whole type DIE. LTO does this by making monolithic types -
>>     merging all the members from different definitions of the same
>>     type into one, but that's maybe too expensive for dsymutil (might
>>     still be interesting to know how much more expensive, etc). But I
>>     think the other way to go would be to produce a declaration of
>>     the type, with the relevant members - and let the DWARF consumer
>>     identify this declaration as matching up with the earlier
>>     definition. That's the sort of DWARF you get from the non-MachO
>>     default -fno-standalone-debug anyway, so it's already pretty well
>>     tested/supported (support in lldb's a bit younger/more
>>     work-in-progress, admittedly). I wonder how much dsym size there
>>     is that could be reduced by such an implementation.
>
>     I see. Yes, that could be done and I think it would result in
>     noticeable size reduction (I do not know exact numbers at the moment).
>
>     I am working on a multi-threaded DWARFLinker now, and its first version
>     will do exactly the same type processing as current dsymutil.
>
> Yeah, best to keep the behavior the same through that
>
>     The above scheme could be implemented as a next step and it would
>     result in better size reduction (better than the current state).
>
>     But I think an even better scheme is possible, and it would
>     result in even bigger size reduction and in faster execution. This
>     scheme is similar to what you've described above: "LTO
>     does - making monolithic types - merging all the members from
>     different definitions of the same type into one".
>
> I believe the reason that's probably not been done is that it can't be 
> streamed - it'd lead to buffering more of the output

Yes. The fact that DWARF has to be streamed into AsmPrinter complicates
parallel DWARF generation. In my prototype, I generate several output
files (one per source compilation unit) and then sequentially glue them
into the final output file.
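
To illustrate the "generate per-CU outputs in parallel, then glue them
sequentially" idea, here is a minimal sketch. It is not the prototype's code:
the names emit_unit and link_units and the length-prefixed record layout are
hypothetical, chosen only to make the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def emit_unit(name: str) -> bytes:
    """Stand-in for cloning one compile unit into its own output buffer."""
    body = name.encode("ascii")
    # Hypothetical layout: 4-byte little-endian length, then the payload.
    return len(body).to_bytes(4, "little") + body


def link_units(names):
    """Emit units in parallel, then glue them sequentially in input order."""
    with ThreadPoolExecutor() as pool:
        chunks = list(pool.map(emit_unit, names))  # map() keeps input order
    offsets, out = [], bytearray()
    for chunk in chunks:
        offsets.append(len(out))  # a unit's final offset is only known here
        out += chunk
    return bytes(out), offsets


blob, unit_offsets = link_units(["a.cpp", "bb.cpp", "ccc.cpp"])
print(unit_offsets)
```

The point of the sketch is that the final offset of each unit becomes known
only in the sequential glue step, which is why references between units
cannot be written while the units are being generated.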


> (if two of these expandable types were in one CU - the start of the 
> second type couldn't be known until the end because it might keep 
> getting pushed later due to expansion of the first type) and/or having 
> to revisit all the type references (the offset to the second type 
> wouldn't be known until the end - so writing the offsets to refer to 
> the type would have to be deferred until then).

That is the second problem: offsets are not known until the end of the file.
dsymutil already has this situation for inter-CU references, so it has an
extra pass to fix up offsets. With a multi-threaded implementation this
situation would arise more often for type references, so more offsets
would have to be fixed up during the additional pass.
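
A fix-up pass of this kind can be sketched as follows. This is only an
illustration under simplified assumptions, not dsymutil's implementation:
emit_with_fixups is a hypothetical helper, each "DIE" is just a few payload
bytes plus an optional 4-byte reference, and all names are made up.

```python
import struct


def emit_with_fixups(dies):
    """dies: list of (die_id, payload bytes, referenced die_id or None)."""
    out = bytearray()
    offset_of = {}  # die_id -> final offset, filled in as DIEs are emitted
    pending = []    # (position of placeholder, target die_id)
    for die_id, payload, ref in dies:
        offset_of[die_id] = len(out)
        out += payload
        if ref is None:
            continue
        if ref in offset_of:
            # Backward reference: the target offset is already known.
            out += struct.pack("<I", offset_of[ref])
        else:
            # Forward reference: write a placeholder and remember it.
            pending.append((len(out), ref))
            out += b"\0\0\0\0"
    # Extra pass: patch every placeholder now that all offsets are known.
    for pos, ref in pending:
        struct.pack_into("<I", out, pos, offset_of[ref])
    return bytes(out), offset_of, len(pending)


dies = [("var", b"V", "type"), ("type", b"TT", None), ("ptr", b"P", "type")]
blob, offsets, fixups = emit_with_fixups(dies)
print(offsets, fixups)
```

Only the forward reference (from "var") needs patching here. The more
references point to places whose offset is not yet known, the longer the
pending list grows.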

>     DWARFLinker could create an additional artificial compile unit and
>     put all merged types there, then later patch all type references to
>     point into this additional compilation unit. No bits would be
>     duplicated in that case. The performance improvement could be
>     achieved because less DWARF is copied and because type references
>     could be updated when the DWARF is cloned (no need for an
>     additional pass for that).
>
> "later patch all type references to point into this additional 
> compilation unit" - that's the additional pass that people are 
> probably talking/concerned about. Rewalking all the DWARF. The current 
> dsymutil approach, as far as I know, is single pass - it knows the 
> final, absolute offset to the type from the moment it emits that 
> type/needs to refer to it.

Right. The current dsymutil approach is single pass. From that point of
view, the solution you described (producing a declaration of the type with
the relevant members) allows keeping the single-pass implementation.

But the current dsymutil approach has a restriction: to process inter-CU
references it needs to load all DWARF into memory (while it analyzes which
parts of the DWARF are live, it needs to have all CUs loaded). That leads
to huge memory usage. This is less important when the source is a set of
object files (as in the dsymutil case), but it becomes a real problem for
the llvm-dwarfutil utility, where the source is a single file (with the
current implementation it needs 30G of memory to process the clang binary).

Without loading all CUs into memory, a two-pass solution would be
required: a first pass to analyze which parts of the DWARF relate to live
code, and a second pass to generate the result. If we had a two-pass
solution, we could create a compilation unit with all types during the
first pass and generate the result with correct offsets during the second
pass (no need to fix them up, as dsymutil currently has to for forward
inter-CU references). The open question at the moment is how expensive
this two-pass approach is.
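
A rough sketch of the two-pass idea, under heavily simplified assumptions:
pass_one and pass_two are hypothetical names, a unit is a list of
(entry name, type name) pairs, a type's body is modeled as just its name, and
the artificial type unit is placed at the start of the output.

```python
import struct


def pass_one(units, live):
    """Visit units one at a time; give each type used by a live entry an
    offset inside the (hypothetical) artificial type unit."""
    type_offset, cursor = {}, 0
    for unit in units:
        for name, type_name in unit:
            if name in live and type_name not in type_offset:
                type_offset[type_name] = cursor
                cursor += len(type_name)  # type body modeled as its name
    return type_offset


def pass_two(units, live, type_offset):
    """Emit the type unit first, then the live entries. Every type offset
    is already final, so no placeholder or fix-up is needed."""
    out = bytearray()
    for type_name in type_offset:  # dicts preserve insertion order
        out += type_name.encode("ascii")
    for unit in units:
        for name, type_name in unit:
            if name in live:
                out += name.encode("ascii")
                out += struct.pack("<I", type_offset[type_name])
    return bytes(out)


units = [[("f", "int"), ("g", "x")], [("h", "x"), ("dead", "float")]]
live = {"f", "g", "h"}
type_offset = pass_one(units, live)
blob = pass_two(units, live, type_offset)
print(type_offset, len(blob))
```

Each pass touches one unit at a time, so nothing requires all units to be
resident at once. The cost is that every unit is visited twice.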

Thank you, Alexey.

>     Anyway, that might be the next step after multi-thread DWARFLinker
>     would be ready.
>
> Yep, be interesting to see how it all goes!
>
>>
>>         Do you suggest that 0x0000011b should be transformed into
>>         something like that:
>>
>>         0x000000fc: DW_TAG_compile_unit
>>                       DW_AT_language (DW_LANG_C_plus_plus)
>>                       DW_AT_name        ("templ.cpp")
>>                       DW_AT_stmt_list   (0x00000090)
>>                       DW_AT_low_pc (0x0000000100000fa0)
>>                       DW_AT_high_pc (0x0000000100000fab)
>>
>>         0x0000011b:   DW_TAG_structure_type
>>                         DW_AT_specification (0x0000002a "x")
>>
>>         0x00000124:     DW_TAG_subprogram
>>                           DW_AT_linkage_name ("_ZN1x2f3IiEEiv")
>>                           DW_AT_name ("f3<int>")
>>                           DW_AT_type (0x000000000000005e "int")
>>                           DW_AT_declaration     (true)
>>                           DW_AT_external        (true)
>>                           DW_AT_APPLE_optimized (true)
>>         0x00000138:       NULL
>>         0x00000139:     NULL
>>
>>         0x00000140:   DW_TAG_subprogram
>>                         DW_AT_low_pc (0x0000000100000fa0)
>>                         DW_AT_high_pc (0x0000000100000fab)
>>                         DW_AT_specification (0x0000000000000124
>>         "_ZN1x2f3IiEEiv")
>>         0x00000155:     NULL
>>
>>         Did I correctly get the idea?
>>
>>
>>     Yep, more or less. It'd be "safer" if 11b didn't use
>>     DW_AT_specification to refer to 2a, but instead was only a
>>     completely independent declaration of "x" - that path is already
>>     well supported/tested (well, it's the work-in-progress stuff for
>>     lldb to support -fno-standalone-debug, but gdb's been consuming
>>     DWARF like this for years, Clang and GCC both produce DWARF like
>>     this (if the type is "homed" in another file, then Clang/GCC
>>     produce DWARF that emits a declaration with just the members
>>     needed to define any member functions defined/inlined/referenced
>>     in this CU)) for years.
>>
>>     But using DW_AT_specification, or maybe some other extension
>>     attribute might make the consumer's task a bit easier (could do
>>     both - use an extension attribute to tie them up, leave
>>     DW_AT_declaration/DW_AT_name here for consumers that don't
>>     understand the extension attribute) in finding that they're all
>>     the same type/pieces of the same type.
>
>     Yes, I will try this solution.
>
>     Thank you, Alexey.
>
>