<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><br>
</p>
<div class="moz-cite-prefix">On 17.09.2019 3:12, David Blaikie wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAENS6EvkpbHfuE3rKHhagDN_z=_inE1tmCUgYzi=HkNZ4d0SCA@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Sep 11, 2019 at 3:32
PM Alexey Lapshin via llvm-dev <<a
href="mailto:llvm-dev@lists.llvm.org"
moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>Debuginfo and linker folks, we (AccessSoftek) would
like to suggest a proposal for removing obsolete debug
info. If you find it useful we will be happy to work on
improving it. Thank you for any opinions and suggestions.<br>
<br>
Alexey.<br>
<br>
Currently when the linker does garbage collection a
lot of abandoned debug info is left behind (see Appendix A
for documentation). Besides inflated debug info size, we
ended up with overlapping address ranges and no way to say
valid vs garbage ranges. We propose removing debug info
along with removing code. This would reduce debug info
size and make sure debug info accuracy.<br>
<br>
There are several approaches which could be used to solve
that problem:<br>
<br>
1. Require dwarf producers to generate fragmented debug
data according to DWARF5 specification: "E.3.3
Single-function-per-DWARF-compilation-unit" page 388. That
approach assumes fragmenting the whole debug info per
function basis and glue fragmented sections at the link
time using section groups.<br>
<br>
2. Use an additional tool, which would optimize out
unnecessary debug data, something similar to dwz (dwarf
compressor tool), dsymutil (links the DWARF debug
information). This approach assumes additional post-link
binaries processing.<br>
<br>
3. Teach the linker to parse debug data and let it remove
unused debug data. <br>
<br>
In this proposal, we focus on approach #3. We show that
this approach is viable and discuss some preliminary
results, leaving particular implementation out of the
scope. We attach the Proof of Concept (PoC)
implementation(<a href="https://reviews.llvm.org/D67469"
target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D67469</a>)
for illustrative purposes. Please keep in mind that it is
not final, and there is room for improvements (see
Appendix B). However, the achieved results look quite
promising and demonstrate up to 2 times size reduction and
performance overhead is 30% of linking time (which is in
the same ballpark as the already done section compressing
(see table 2 point F)).<br>
</div>
</blockquote>
<div><br>
</div>
<div>Have you considered/tried reusing the DWARF
minimization/deduplication/linking logic that's already in
llvm's dsymutil implementation? If we're going to do that
having a singular implementation would be desirable.<br>
<br>
(bonus points if we could do something like the dsymutil
approach when using Split DWARF and building a DWP - taking
some address table output from the linker, and using that to
help trim things (or, even when having no input from the
linker - at least doing more aggressive deduplication during
DWP construction than can be currently done with only type
units (& potentially removing/avoiding type unit
overhead too))<br>
</div>
<div> </div>
</div>
</div>
</blockquote>
Generally speaking, dsymutil does a very similar thing: it parses
DWARF DIEs, analyzes relocations, follows references, and
throws out unused DIEs. But its current interface does not allow it
to be used at the link stage. <br>
I think it would be ideal to have a single implementation. <br>
I have not yet analyzed whether, or how easily, its code could be
reused at the link stage, but it looks like it would need
significant rework. <br>
<br>
The implementation in this proposal removes obsolete debug
info at the link stage. <br>
It therefore benefits from the object files already being loaded and
the liveness information already being computed, <br>
and it generates the optimized binary from scratch.<br>
<p><br>
</p>
<p>If dsymutil could be refactored so that it can be used
at the link stage, then its implementation could be reused. I
will look into the possibility of such a refactoring.</p>
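<p>To illustrate the "scan through references and throw out unused DIEs" step, here is a minimal sketch. It is not the dsymutil or PoC code: the dictionary of DIE offsets and the function names are hypothetical simplifications, and real DWARF references would have to be decoded from DW_FORM_ref_* attributes first.</p>

```python
from collections import deque

def live_dies(dies, roots):
    """Return the set of DIE offsets reachable from the live roots.

    dies  -- dict mapping a DIE offset to the list of DIE offsets it
             references (hypothetical stand-in for decoded references)
    roots -- offsets of subprogram DIEs whose code the linker kept
    """
    live = set()
    worklist = deque(roots)
    while worklist:
        offset = worklist.popleft()
        if offset in live:
            continue
        live.add(offset)
        # Every reference that has to be followed costs time here.
        worklist.extend(dies.get(offset, []))
    return live

def sweep(dies, roots):
    """Keep only the DIEs that are reachable from the live roots."""
    live = live_dies(dies, roots)
    return {off: refs for off, refs in dies.items() if off in live}

dies = {
    0x10: [0x40],  # subprogram kept by the linker, references a type
    0x20: [0x50],  # subprogram discarded by garbage collection
    0x40: [],      # type used by the kept subprogram
    0x50: [0x40],  # type used only by the discarded subprogram
}
kept = sweep(dies, roots=[0x10])
```

<p>In this toy input only the DIEs at 0x10 and 0x40 survive. The liveness roots are what the linker already knows at link time, which is the benefit mentioned above.</p>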
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAENS6EvkpbHfuE3rKHhagDN_z=_inE1tmCUgYzi=HkNZ4d0SCA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>1. Minimize or entirely avoid references from
subprograms into other parts of .debug_info section. That
would simplify splitting and removing subprograms out in
that sense that it would minimize the number of references
that should be parsed and followed.
(DW_FORM_ref_subroutine instead of DW_FORM_ref_*, ?)<br>
</div>
</blockquote>
<div><br>
Not sure I follow - by "other parts of the .debug_info
section" do you mean in the same CU, or cross CU references?
Any particular references you have in mind? Or encountered
in practice?<br>
</div>
</div>
</div>
</blockquote>
I mean all kinds of references into the .debug_info section. Following
references is a time-consuming task. <br>
Thus, the fewer references that have to be followed, the faster
it works.<br>
<br>
Cross-CU references require loading the referenced CU. I
do not know of use cases where cross-CU references are used. If they
are a special case and are not usually used inside subprograms, then
it is probably possible to avoid them.<br>
<br>
Within the same CU, there are probably cases where references
could be ignored: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/P8165">https://reviews.llvm.org/P8165</a>
<blockquote type="cite"
cite="mid:CAENS6EvkpbHfuE3rKHhagDN_z=_inE1tmCUgYzi=HkNZ4d0SCA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>2. Create additional section - global types table
(.debug_types_table). That would significantly reduce the
number of references inside .debug_info section. It also
makes it possible to have a 4-byte reference in this
section instead of 8-bytes reference into type unit
(DW_FORM_ref_types instead of DW_FORM_ref_sig8). It also
makes it possible to place base types into this section
and avoid per-compile unit duplication of them.
Additionally, there could be achieved size reduction by
not generating type unit header. Note, that new section -
.debug_types_table - differs from DWARF4 section
.debug_types in that sense that: it contains unique type
descriptors referenced by offsets instead of list of type
units referenced by DW_FORM_ref_sig8; all table entries
share the same abbreviations and do not have type unit
headers.<br>
</div>
</blockquote>
<div><br>
What do you mean when you say "global types table" the
phrasing in the above paragraph is present-tense, as though
this thing exists but doesn't seem to describe what it
actually is and how it achieves the things the text says it
achieves. Perhaps I've missed some context here.<br>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>The "global types table" does not exist yet. It could be created
if the discussed approach is considered useful. <br>
Please see this comparison of a possible "global types table" with
the existing type units: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/P8164">https://reviews.llvm.org/P8164</a></p>
<p>The benefit of a "global types table" is that it needs less
space to store types than the type units solution.
</p>
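<p>A rough back-of-the-envelope sketch of where that saving would come from, based only on the two effects named in the proposal (4-byte references instead of 8-byte DW_FORM_ref_sig8, and no type unit headers). The type and reference counts are made-up inputs, and the 23-byte header size is an assumption (DWARF4 type unit header in the 32-bit DWARF format); shared abbreviations and deduplicated base types are not modelled.</p>

```python
REF_SIG8_SIZE = 8      # DW_FORM_ref_sig8 is an 8-byte type signature
TABLE_REF_SIZE = 4     # proposed 4-byte reference into the types table
TYPE_UNIT_HEADER = 23  # assumed header size, see the note above

def estimated_saving(num_types, num_type_refs):
    """Estimate bytes saved by a types table compared with type units."""
    ref_saving = num_type_refs * (REF_SIG8_SIZE - TABLE_REF_SIZE)
    header_saving = num_types * TYPE_UNIT_HEADER
    return ref_saving + header_saving

# Hypothetical numbers, not measurements from the PoC.
saving = estimated_saving(num_types=1000, num_type_refs=20000)
print(saving)
```

<p>With these made-up counts most of the estimated saving comes from the shorter references, so the result depends heavily on how often types are referenced.</p>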
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAENS6EvkpbHfuE3rKHhagDN_z=_inE1tmCUgYzi=HkNZ4d0SCA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>3. Define the limited scope for line programs which
could be removed independently. I.e. currently .debug_line
section contains a program in byte-coded language for a
state machine. That program actually represents a matrix
[instruction][line information]. In general, it is hard to
cut out part of that program and to keep the whole program
correct. Thus it would be good to specify separate scopes
(related to address ranges) which could be easily removed
from the program body.<br>
</div>
</blockquote>
<div><br>
In my experience line tables are /tiny/ - have you
prototyped any change in this space to have a sense of
whether it would have significant savings? (it'd potentially
help address the address ambiguity issues when the linker
discards code, though - so might be a correctness issue
rather than a size performance issue)<br>
</div>
</div>
</div>
</blockquote>
<p>I did not measure the size reduction for the line table,
though I expect it to be small.<br>
The more important thing is the correctness issue: the line table can
contain information for overlapping address ranges.<br>
</p>
<p>There is another attempt to fix that issue:
<a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D59553">https://reviews.llvm.org/D59553</a>.<br>
</p>
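<p>To make the overlap problem concrete, here is a small sketch that reports overlapping address ranges. The range list is invented for illustration; it assumes that the debug info of a discarded function is left behind with a range starting at address 0, which then collides with code that really lives at low addresses.</p>

```python
def overlapping_pairs(ranges):
    """Return name pairs whose half-open [low, high) ranges overlap.

    ranges -- list of (low, high, name) tuples, a hypothetical
              simplified view of the ranges described by debug info
    """
    ordered = sorted(ranges)
    pairs = []
    for i, (low_a, high_a, name_a) in enumerate(ordered):
        for low_b, high_b, name_b in ordered[i + 1:]:
            # Sorted by low address: no later range can overlap either.
            if low_b >= high_a:
                break
            pairs.append((name_a, name_b))
    return pairs

ranges = [
    (0x0, 0x20, "dropped_fn"),  # discarded code, stale debug info
    (0x0, 0x30, "startup"),     # live code at a low address
    (0x1000, 0x1040, "main"),   # live code elsewhere
]
conflicts = overlapping_pairs(ranges)
```

<p>A consumer that sees the "dropped_fn" and "startup" conflict has no way to tell which of the two ranges is valid, which is why removing the stale entries matters more for correctness than for size.</p>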
<br>
<blockquote type="cite"
cite="mid:CAENS6EvkpbHfuE3rKHhagDN_z=_inE1tmCUgYzi=HkNZ4d0SCA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div><br>
We evaluated the approach on LLVM and Clang codebases. The
results obtained are summarized in the tables below:<br>
</div>
</blockquote>
<div><br>
Memory usage statistics (& confidence intervals for the
build time) would probably be especially useful for
comparing these tradeoffs.<br>
Doubly so when using compression (since the decompression
would need to use more memory, as would the recompression -
so, two different tradeoffs (compressed input, compressed
output, and then both at the same time))<br>
</div>
</div>
</div>
</blockquote>
<p>I will measure the memory impact of the PoC implementation, but I
expect it to be significant. <br>
Memory usage has not been optimized yet. There are several things that
might be done to reduce the memory footprint:<br>
not loading all compile units into memory, and not adding a Parent
field to every DIE.</p>
<p>Alexey.<br>
</p>
</body>
</html>