[llvm-dev] Remove obsolete debug info while garbage collecting

Fri Sep 20 13:41:20 PDT 2019

19.09.2019 4:24, David Blaikie wrote:
>
>
> On Wed, Sep 18, 2019 at 7:25 AM Alexey Lapshin <a.v.lapshin at mail.ru 
> <mailto:a.v.lapshin at mail.ru>> wrote:
>
>
>>
>     Generally speaking, dsymutil does a very similar thing. It parses
>     DWARF DIEs, analyzes relocations, scans through references and
>     throws out unused DIEs. But its current interface does not allow
>     it to be used at link stage.
>      I think it would be perfect to have a single implementation.
>      Though I did not analyze how easy, or whether it is even possible,
>     to reuse its code at the link stage, it looked like it needs a
>     significant rework.
>
>      The implementation from this proposal removes obsolete debug
>     info at link stage.
>      It therefore benefits from already loaded object files and
>     already created liveness information,
>      and generates an optimized binary from scratch.
>
>
>     If dsymutil could be refactored in such a manner that it could be
>     used at the link stage, then its implementation could be reused. I
>     would research the possibility of such a refactoring.
>
> Yeah, if this is going to be implemented, I think that would be 
> strongly preferred - though I realize it may be substantial work to 
> refactor. The alternative - duplicating all this work - doesn't seem 
> like something that would be good for the LLVM project.

I see. So I will research whether it is possible to refactor it 
accordingly.

>>         1. Minimize or entirely avoid references from subprograms
>>         into other parts of the .debug_info section. That would
>>         simplify splitting out and removing subprograms, in the sense
>>         that it would minimize the number of references that have to
>>         be parsed and followed. (DW_FORM_ref_subroutine instead of
>>         DW_FORM_ref_*, ?)
>>
>>
>>     Not sure I follow - by "other parts of the .debug_info section"
>>     do you mean in the same CU, or cross CU references? Any
>>     particular references you have in mind? Or encountered in practice?
>     I mean here all kinds of references into .debug_info section.
>
>
> Ah, not only references from other places /into/ .debug_info (which 
> don't really exist, so far as I know) but any references to locations 
> within debug_info.
>
> Reducing these isn't super-viable - types being the most common 
> examples. Though now I understand what you're getting at partly around 
> the debug_type_table idea - adding a level of indirection to type 
> references. So it'd be easy to find only one place to fix when 
> removing chunks of debug_info (updating only the type table without 
> having to find all the places inside debug_info to touch). That 
> indirection would come at a size cost, of course - and an overhead for 
> DWARF parsers having to follow that indirection. Doesn't make it 
> impossible - just tradeoffs to be aware of.
>
> Though that's not the only DIE references - without removing them all 
> there'd still be a fair bit of overhead for finding any remaining ones 
> and applying them. If an indirection table is to be added, maybe a 
> generalized one (for any DIE reference) rather than one only for types 
> would be good.
>
Yes, some general indirection table would probably be useful.
But types would still require specialized handling.
Types have a "type hash" and need some specific logic around that.
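
To make the generalized indirection idea concrete, here is a minimal 
sketch, not an existing DWARF or LLVM facility. The names 
DieIndirectionTable, addTarget, removeRange, resolve and kRemovedDie 
are all hypothetical. DIE attributes would store a slot index, and only 
the table would hold real .debug_info offsets, so removing a chunk of 
.debug_info touches the table and nothing else.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical marker for a slot whose target DIE was removed.
const std::uint64_t kRemovedDie = UINT64_MAX;

// Hypothetical indirection table: referencing attributes hold a slot
// index; only the table stores actual .debug_info offsets.
class DieIndirectionTable {
public:
  // Registers a target DIE offset and returns the slot index that a
  // referencing attribute would store instead of the offset itself.
  std::uint32_t addTarget(std::uint64_t offset) {
    slots_.push_back(offset);
    return static_cast<std::uint32_t>(slots_.size() - 1);
  }

  // Accounts for the removal of bytes [begin, begin + size) from
  // .debug_info. Targets inside the range are marked removed, targets
  // after it shift down. No referencing DIE has to be rewritten.
  void removeRange(std::uint64_t begin, std::uint64_t size) {
    for (std::uint64_t &slot : slots_) {
      if (slot == kRemovedDie)
        continue;
      if (slot >= begin + size)
        slot -= size;
      else if (slot >= begin)
        slot = kRemovedDie;
    }
  }

  // Returns the current offset for a slot, or kRemovedDie.
  std::uint64_t resolve(std::uint32_t index) const {
    if (index >= slots_.size())
      return kRemovedDie;
    return slots_[index];
  }

private:
  std::vector<std::uint64_t> slots_;
};
```

The cost mentioned above is visible here too: every lookup goes through 
resolve(), and the table itself takes space.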

> (aspects of this have been discussed before - we've sometimes 
> nicknamed it "bag of DWARF" when discussing it in the context of type 
> units (currently you can only reference the type DIE in a type unit - 
> which adds overhead when wanting to reference subprogram declaration 
> DIEs, etc (or maybe multiple types are clustered together and don't 
> need a separate type unit each - if only you could refer to multiple 
> types in a type unit) - so we've discussed generalizing the type unit 
> header (actually it could generalize even as far as the classic CU 
> header) to have N type DIE offset+hash pairs (zero for a normal CU, 
> one for a classic type unit, and any number for more interesting cases))

As far as I understand, "generalizing the type unit header (actually it 
could generalize even as far as the classic CU header) to have N type 
DIE offset+hash pairs" looks very close to "global type table" which I 
am talking about.

>     Going through references is a time-consuming task.
>     Thus the fewer references have to be followed, the faster
>     it works.
>
>     Cross-CU references require loading the referenced CU. I
>     do not know of use cases where cross-CU references are used.
>
>
> Cross-CU inlining due to LTO. Try something like this:
>
> a.cpp:
>   void f2();
>   __attribute__((always_inline)) void f1() {
>     f2();
>   }
>
> b.cpp:
>   void f1();
>   int main() {
>     f1();
>   }
>
> $ clang++ a.cpp b.cpp -emit-llvm -S -c -g
> $ llvm-link a.ll b.ll -o ab.bc
> $ clang++ ab.bc -c
> $ llvm-dwarfdump ab.o -v -debug-info |
> 0x0b: DW_TAG_compile_unit
>         DW_AT_name "a.cpp"
> 0x2a:   DW_TAG_subprogram
>           DW_AT_abstract_origin [DW_FORM_ref4] (cu + 0x0056 => 
> {0x00000056} "_Z2f1v")
>         DW_TAG_subprogram
>           DW_AT_name "f1"
> 0x6e: DW_TAG_compile_unit
>         DW_AT_name "b.cpp"
> 0x8d:   DW_TAG_subprogram
>           DW_AT_name "main"
> 0xa6:     DW_TAG_inlined_subroutine
>             DW_AT_abstract_origin [DW_FORM_ref_addr] 
> (0x0000000000000056 "_Z2f1v")
>
> Notice that the inlined_subroutine's abstract_origin uses a linker 
> relocation into the debug_info section to give an absolute offset 
> within the finally linked debug_info section (since the debugger 
> wouldn't know that these two compile_units are bound together and to 
> use some particular compile_unit as the base offset - either it's 
> absolute across the whole debug_info section (FORM_ref_addr) or it's 
> local to the CU (FORM_refN (such as FORM_ref4 above)))

Got it. Thank you.
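
For reference, the difference between the two forms in the dump above 
can be sketched as follows. The names RefForm and resolveDieRef are 
hypothetical, and this is only an illustration of the offset 
arithmetic, not code from LLVM. A DW_FORM_refN value is relative to the 
start of the containing unit's header, while a DW_FORM_ref_addr value 
is already an offset into the whole .debug_info section.

```cpp
#include <cstdint>

// Hypothetical classification of the two reference kinds discussed
// above: DW_FORM_ref4 and friends versus DW_FORM_ref_addr.
enum class RefForm { CuRelative, SectionAbsolute };

// Returns the offset of the referenced DIE within .debug_info.
// cuHeaderOffset is the offset of the header of the unit that
// contains the reference.
std::uint64_t resolveDieRef(RefForm form, std::uint64_t value,
                            std::uint64_t cuHeaderOffset) {
  if (form == RefForm::CuRelative)
    return cuHeaderOffset + value; // local to the containing CU
  return value;                    // absolute across the section
}
```

In the dump, the a.cpp unit starts at offset 0, so ref4 value 0x56 
resolves to 0x56. The b.cpp unit presumably starts at 0x63 (its first 
DIE is at 0x6e, assuming an 11-byte DWARF v4 unit header), so a ref4 
there could not reach 0x56, which is why ref_addr is used.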

>     If that is a special case and is not usually used inside
>     subprograms, then it is probably possible to avoid it.
>
>
> It's fairly specifically used inside subprograms (& would need to be 
> adjusted even if it wasn't inside a subprogram - when bytes are 
> removed, etc) - though possibly general relocation handling in the 
> linker could be used to implement handling ref_addr.
>
>     For the same CU - there could probably be cases when references
>     could be ignored: https://reviews.llvm.org/P8165
>
>
> How would references be ignored while keeping them correct? Ah, by 
> making subprograms more self-contained - maybe, but the work to figure 
> out which things are only referenced from one place and structure the 
> DWARF differently probably wouldn't be ideal in the compiler & 
> wouldn't save the debug info linker from having to have code to handle 
> the case where it wasn't only used from that subprogram anyway.
>
>>         2. Create additional section - global types table
>>         (.debug_types_table). That would significantly reduce the
>>         number of references inside .debug_info section. It also
>>         makes it possible to have a 4-byte reference into this
>>         section instead of an 8-byte reference into a type unit
>>         (DW_FORM_ref_types instead of DW_FORM_ref_sig8). It also
>>         makes it possible to place base types into this section and
>>         avoid per-compile unit duplication of them. Additionally,
>>         a size reduction could be achieved by not generating type
>>         unit headers. Note that the new section, .debug_types_table,
>>         differs from the DWARF4 section .debug_types in that
>>         it contains unique type descriptors referenced by offsets
>>         instead of a list of type units referenced by
>>         DW_FORM_ref_sig8; all table entries share the same
>>         abbreviations and do not have type unit headers.
>>
>>
>>     What do you mean when you say "global types table"? The phrasing
>>     in the above paragraph is present-tense, as though this thing
>>     exists but doesn't seem to describe what it actually is and how
>>     it achieves the things the text says it achieves. Perhaps I've
>>     missed some context here.
>
>
>     The "global types table" does not exist yet. It could be created
>     if the discussed approach would be considered useful.
>
>
> Ah, the present-tense language was a bit confusing for me when 
> discussing a thing that doesn't exist yet & not having provided a 
> description of what it might be or might contain and why it would 
> exist/what it would achieve.

I should have written it more precisely.

>     Please check the comparison of a possible "global types table" and
>     the currently existing type units: https://reviews.llvm.org/P8164
>
> Ah, that proposed version makes it easy to remove subprograms from 
> debug_info without having to fix up type references (but you still 
> have to have the code to fix up other cross-CU references, like 
> abstract_origin, so I'm not sure it provides that much value) but 
> doesn't make it easy to remove types (because you'd have to go looking 
> through the debug_info section to update all the type offsets (which I 
> guess you have to do anyway to find the type references) and removing 
> the types still also requires fixing up the types that reference each 
> other...
>
> So I'm not seeing a big win there.

Correct. Even if types were put into a separate table, it would still 
be necessary to:
  "go looking through the debug_info section to update all the type 
offsets";
  "removing the types still also requires fixing up the types that 
reference each other".

But additionally it provides the following benefits:

1. Size reduction by removing fragmentation. In the 
"-fdebug-types-section" solution, every type which is put into a type 
unit requires:
    - an additional type unit header,
    - a section header (since it is put into a separate section),
    - proxy type copies inside the compilation unit.

   Putting types into a separate table avoids creating the above data 
for every type.

2. Size reduction by deduplicating base types. In the 
"-fdebug-types-section" solution, base types are not deduplicated at all.

3. Performance improvement by handling less data. #1 leads to loading 
and parsing fewer bits.

4. Performance improvement by handling fewer references. Simpler 
reference chains allow parsing references faster.
   Instead of this:

   type_offset->proxy_type->DW_FORM_ref_sig8->type_unit->type_offset->type

   there would be this:

   type_offset->type_table->type
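
A minimal sketch of how such a table could deduplicate types, under 
loud assumptions: GlobalTypeTable and intern are hypothetical names, 
the descriptor string stands in for the encoded type DIE bytes, and 
the hash values in the usage note are made up. Each unique type hash 
is stored once and is referenced by a 4-byte offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of the proposed .debug_types_table contents.
class GlobalTypeTable {
public:
  // Returns the 4-byte table offset for the type with this hash,
  // appending the descriptor only the first time the hash is seen.
  std::uint32_t intern(std::uint64_t typeHash,
                       const std::string &descriptor) {
    auto it = offsetByHash_.find(typeHash);
    if (it != offsetByHash_.end())
      return it->second; // already present: reuse the same offset
    std::uint32_t offset = static_cast<std::uint32_t>(data_.size());
    data_.insert(data_.end(), descriptor.begin(), descriptor.end());
    offsetByHash_.emplace(typeHash, offset);
    return offset;
  }

  std::size_t sizeInBytes() const { return data_.size(); }
  std::size_t typeCount() const { return offsetByHash_.size(); }

private:
  std::vector<char> data_; // concatenated type descriptors
  std::unordered_map<std::uint64_t, std::uint32_t> offsetByHash_;
};
```

If two compile units both call intern(1, "int"), they get the same 
offset and the table grows only once, which is the base type 
deduplication from point 2.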

>>
>>         We evaluated the approach on LLVM and Clang codebases. The
>>         results obtained are summarized in the tables below:
>>
>>
>>     Memory usage statistics (& confidence intervals for the build
>>     time) would probably be especially useful for comparing these
>>     tradeoffs.
>>     Doubly so when using compression (since the decompression would
>>     need to use more memory, as would the recompression - so, two
>>     different tradeoffs (compressed input, compressed output, and
>>     then both at the same time))
>
>     I would measure memory impact for that PoC implementation, but I
>     expect it would be significant.
>     Memory usage was not optimized yet. There are several things which
>     might be done to reduce memory footprint:
>     do not load all compile units into memory, avoid adding Parent
>     field to all DIEs.
>
> Yep, this is the sort of thing where I suspect the dsymutil 
> implementation may've already had at least some of that work done - 
> or, if not, that doing the work once for both/all implementations 
> would be very preferable to duplicating the effort.

Ok,

Thank you, Alexey.

>
> - Dave
>
>     Alexey.
>