<div dir="ltr"><div>Hi Alexey,</div><div><br></div><div>Just an update - I identified the cause of the "Generated debug info is broken" error message when I tried to build things locally: the `outStreamer` instance is initialised with the host Triple, instead of whatever the target's triple is. For example, I build and run LLD on Windows, which means that a Windows triple will be generated, and consequently a COFF-emitting streamer will be created, rather than the ELF-emitting one I'd expect were the triple information to somehow be derived from the linker flavor/input objects etc. Hard-coding in my target triple resolved the issue (although I still got the other warnings mentioned from my game link).</div><div><br></div><div>I measured the performance figures using LLD patched as described, and using the same methodology as my earlier results, and got the following:</div><br><div>
<div>Link time (s):</div><div><span style="font-family:monospace">+-----------------------------+---------------+</span><br></div><div><font face="monospace">| Package variant | GC 1 (normal) |</font></div><div><font face="monospace">+-----------------------------+---------------+<br></font></div><div><font face="monospace">| Game (DWARF linker) | 53.6 |
</font><div><font face="monospace">| Game (DWARF linker, no ODR) | 63.6 |</font></div>
</div><div><font face="monospace">| Clang (DWARF linker) | 200.6 |</font></div><div><font face="monospace">+-----------------------------+---------------+</font></div><div><font face="monospace"><br></font></div><div><font face="monospace">
</font><div><font face="monospace"><font face="arial,sans-serif">Output size - Game package (MB):</font></font></div><div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+</span><br></font></font></div><div><span style="font-family:monospace">| Category | GC 1 |<br></span></div><div><span style="font-family:monospace">+-----------------------------+------+<br></span></div><div><span style="font-family:monospace">| DWARFLinker (total) | 696 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
(DWARF*) | 429 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
(other) | 267 |
</span><div><span style="font-family:monospace">| DWARFLinker no ODR (total) | 753 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
no ODR (DWARF*) | 485 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
no ODR (other) | 268 |</span></div>
</div><div><font face="monospace">+-----------------------------+------+</font><div><font face="monospace"><br></font></div><div><font face="monospace"><font face="arial,sans-serif">Output size - Clang (MB):
</font></font><div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+</span><br></font></font></div><div><span style="font-family:monospace">| Category | GC 1 |<br></span></div><div><span style="font-family:monospace">+-----------------------------+------+</span></div>
<div><span style="font-family:monospace">| DWARFLinker (total) | 1294 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
(DWARF*) | 743 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
(other) | 551 |
<span style="font-family:monospace">
</span></span><div><span style="font-family:monospace">| DWARFLinker no ODR (total) | 1294 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
no ODR (DWARF*) | 743 |<br></span></div><div><span style="font-family:monospace">|
DWARFLinker
no ODR (other) | 551 |
</span><div><div><span style="font-family:monospace"></span></div>
</div><div><font face="monospace">+-----------------------------+------+</font></div>
</div></div><div><font face="monospace"><br></font></div><div><font face="monospace"><span style="font-family:arial,sans-serif">*DWARF = just .debug_info, .debug_line, .debug_loc, .debug_aranges, .debug_ranges.</span></font></div><div><font face="monospace"><span style="font-family:arial,sans-serif"><br></span>
</font><div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Peak Working Set Memory usage (GB):</span><br></span></div><div><span style="font-family:monospace">
</span><div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+
</span></font></font></div><div><span style="font-family:monospace">
| Package variant | GC 1 |<br></span></div><div><span style="font-family:monospace">
+-----------------------------+------+<br></span></div><div><span style="font-family:monospace">
| Game (DWARFLinker) | 5.7 |
</span><div><span style="font-family:monospace">
| Game (DWARFLinker, no ODR) | 5.8 |</span></div>
</div><div><span style="font-family:monospace">
| Clang (DWARFLinker) | 22.4 |
</span><div><div><span style="font-family:monospace"></span></div>
</div><div><span style="font-family:monospace">
| Clang (DWARFLinker, no ODR) | 22.5 |</span></div>
</div><div><span style="font-family:monospace">
+-----------------------------+------+</span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">In my opinion, the time cost of the DWARF linker approach in its current state is not really practical for larger packages, except on build servers: clang takes 8.8x as long to link as with the fragmented approach and 11.2x as long as with the plain approach (measured without the no ODR option). The size saving is certainly good: with my version of clang, the DWARF linker output is 51% of the total output size of the plain approach and 55% of that of the fragmented approach (though it is likely that further size savings are possible for the latter). The game produced reasonable size savings too, at 62% and 74% respectively, but I'd be surprised if these gains were enough for people to want to use the approach in day-to-day situations, which is presumably the main use-case for smaller DWARF, due to improved debugger load times.<br></span></span></div><div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">It is interesting to note that the GCC 7.5 build of clang I used for these figures showed no difference in size between the two variants, unlike the other packages. 
Consequently, a significant amount of time is saved for no penalty.</span><br></span></div><div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">I'll be interested to see what the time results of the DWARF linker are once further improvements to it have been made.<br></span></span></div><div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:arial,sans-serif">Thanks,<br></span></div><div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">James</span><br></span></div></div>
</div>
</div></div>
</div>
</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 4 Nov 2020 at 13:57, James Henderson <<a href="mailto:jh7370.2008@my.bristol.ac.uk">jh7370.2008@my.bristol.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Great, thanks! Those results are roughly what I was expecting. I assume "compilation time" is actually just the link time?</div><div><br></div><div>I find it particularly interesting that the DWARFLinker rewriting solution produces the same size improvement in .debug_line as the fragmented DWARF approach. That suggests that in that case, fragmented DWARF output is probably about as optimal as it can get. I'm not surprised that the same can't be said for other sections, but I'm also pleased to see that the full rewrite option isn't so much better in size improvements.</div><div><br></div><div>Regarding the problems I was having with the patch, if you want to try reproducing the problems with clang, I built commit 05d02e5a of clang using gcc 7.5.0 on Ubuntu 18.04, to generate an ELF package. I then used LLD to relink it to create a reproducible package. As I'm primarily a Windows developer, I transferred this package to my Windows machine so that I could use my existing Windows checkout of LLVM, applied your patch, rebuilt LLD, and used that to try linking the package, getting the stated message. I'm going to have another try at the latter now to see if I can figure out what the issue is myself.</div><div><br></div><div>James<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 4 Nov 2020 at 13:35, Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com" target="_blank">avl.lapshin@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 04.11.2020 15:28, James Henderson
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi Alexey,</div>
<div><br>
</div>
<div>Thanks for taking a look at these. I noticed you set the
--mark-live-pc value to a value other than 1 for the
fragmented DWARF version. This will mean additional GC-ing
will be done beyond the amount that --gc-sections will do, so
unless you use the same value for the option for other
versions, the result will not be comparable. (The option is
purely there to experiment with the effects were different
amounts of the input codebase to be considered dead). Would
you be okay to run those figures again without the option
specified?</div>
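<div>A minimal sketch of the comparability point, in Python: each variant's option list should differ only in the flags that select the approach under test, so an extra flag such as --mark-live-pc on one variant shows up as an unintended difference. The flag names are the ones quoted in this thread; the variant table and the helper name are hypothetical and not part of either patch.</div>

```python
# Option sets for the four linker variants, as listed in this thread.
# The table and the helper below are illustrative only.
BASELINE = ["--gc-sections"]

VARIANTS = {
    "upstream": ["--gc-sections"],
    "fragmented DWARF": ["--gc-sections", "--mark-live-pc=0.45"],
    "DWARFLinker": ["--gc-sections", "--gc-debuginfo"],
    "DWARFLinker no-odr": ["--gc-sections", "--gc-debuginfo", "--gc-debuginfo-no-odr"],
}

# Flags that are expected to differ, because they select the approach under test.
EXPECTED_EXTRAS = {"--gc-debuginfo", "--gc-debuginfo-no-odr"}


def unexpected_options(variants, baseline, expected):
    """Return, per variant, the options that are neither baseline nor expected."""
    problems = {}
    for name, options in variants.items():
        extras = [o for o in options if o not in baseline and o not in expected]
        if extras:
            problems[name] = extras
    return problems


print(unexpected_options(VARIANTS, BASELINE, EXPECTED_EXTRAS))
```

<div>Any variant reported by the helper carries an option that the others do not, which is the situation described above.</div>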
</div>
</blockquote>
<p>Oh, I misinterpreted that option. The updated results follow:<br>
</p>
<tt>1. llvm-strings:</tt><tt><br>
</tt><tt><br>
</tt><tt> source object files size: 381M.</tt><tt><br>
</tt><tt> fragmented source object files size: 451M (18% increase).</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> a. upstream version, </tt><tt><br>
</tt><tt> command line options: --gc-sections</tt><tt><br>
</tt><tt> binary size: 6.5M</tt><tt><br>
</tt><tt> compilation time: 0:00.13 sec</tt><tt><br>
</tt><tt> run-time memory: 111kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> b. "fragmented DWARF" version, </tt><tt><br>
</tt><tt> command line options: --gc-sections</tt><tt><br>
</tt><tt> binary size: 5.3M</tt><tt><br>
</tt><tt> compilation time: 0:00.11 sec</tt><tt><br>
</tt><tt> run-time memory: 125kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> c. DWARFLinker version, </tt><tt><br>
</tt><tt> command line options: --gc-sections --gc-debuginfo</tt><tt><br>
</tt><tt> binary size: 3.8M</tt><tt><br>
</tt><tt> compilation time: 0:00.33 sec</tt><tt><br>
</tt><tt> run-time memory: 141kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> d. DWARFLinker no-odr version, </tt><tt><br>
</tt><tt> command line options: --gc-sections --gc-debuginfo
--gc-debuginfo-no-odr</tt><tt><br>
</tt><tt> binary size: 4.3M</tt><tt><br>
</tt><tt> compilation time: 0:00.38 sec</tt><tt><br>
</tt><tt> run-time memory: 142kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt><br>
</tt><tt>2. clang:</tt><tt><br>
</tt><tt><br>
</tt><tt> source object files size: 6.5G.</tt><tt><br>
</tt><tt> fragmented source object files size: 7.3G (13% increase).</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> a. upstream version, </tt><tt><br>
</tt><tt> command line options: --gc-sections</tt><tt><br>
</tt><tt> binary size: 1.5G</tt><tt><br>
</tt><tt> compilation time: 6 sec </tt><tt><br>
</tt><tt> run-time memory: 9.7G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> b. "fragmented DWARF" version, </tt><tt><br>
</tt><tt> command line options: --gc-sections </tt><tt><br>
</tt><tt> binary size: 1.4G</tt><tt><br>
</tt><tt> compilation time: 8 sec</tt><tt><br>
</tt><tt> run-time memory: 12G </tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> c. DWARFLinker version, </tt><tt><br>
</tt><tt> command line options: --gc-sections --gc-debuginfo</tt><tt><br>
</tt><tt> binary size: 836M</tt><tt><br>
</tt><tt> compilation time: 62 sec</tt><tt><br>
</tt><tt> run-time memory: 15G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> d. DWARFLinker no-odr version, </tt><tt><br>
</tt><tt> command line options: --gc-sections --gc-debuginfo
--gc-debuginfo-no-odr</tt><tt><br>
</tt><tt> binary size: 1.3G</tt><tt><br>
</tt><tt> compilation time: 128 sec</tt><tt><br>
</tt><tt> run-time memory: 17G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt>Detailed size results:</tt><tt><br>
</tt><tt><br>
</tt><tt>1. a)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 41.1% 2.64Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 24.9% 1.60Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 12.6% 827Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 6.5% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 4.8% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 3.4% 223Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 2.0% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 1.7% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.2% 77.6Ki 0.0% 0 .debug_abbrev</tt><tt><br>
</tt><tt><br>
</tt><tt> b)</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 40.2% 2.10Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 30.7% 1.60Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 8.0% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 5.9% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 5.9% 313Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 2.5% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 2.1% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.5% 77.6Ki 0.0% 0 .debug_abbrev</tt><tt><br>
</tt><tt> 1.3% 69.2Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt><br>
</tt><tt> c)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 33.0% 1.25Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 29.2% 1.11Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 11.0% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 8.2% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 7.8% 304Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 3.4% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 2.8% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.7% 65.9Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.0% 38.4Ki 5.7% 38.4Ki .rodata</tt><tt><br>
</tt><tt><br>
</tt><tt> d)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 39.7% 1.68Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 26.3% 1.11Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 9.9% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 7.3% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 7.0% 304Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 3.1% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 2.6% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.5% 65.9Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>2. a)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 58.3% 878Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 11.8% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 7.7% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 7.7% 115Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 6.0% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 2.4% 35.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.5% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 1.5% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 1.2% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt><br>
</tt><tt> b)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 59.6% 807Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 13.1% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 8.5% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 6.7% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 4.2% 57.4Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 1.7% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 1.7% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 1.3% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.0% 13.0Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 0.8% 10.6Mi 5.7% 10.6Mi .dynstr</tt><tt><br>
</tt><tt><br>
</tt><tt> c)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 35.1% 293Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 21.2% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 13.9% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 10.9% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 6.9% 57.4Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 2.8% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 2.8% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 2.1% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.5% 12.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.3% 10.6Mi 5.7% 10.6Mi .dynstr</tt><tt><br>
</tt><tt><br>
</tt><tt> d)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 58.3% 758Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 13.6% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 8.9% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 7.0% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 4.4% 57.4Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 1.8% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 1.8% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 1.4% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.0% 12.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 0.8% 10.6Mi 5.7% 10.6Mi .dynstr</tt>
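<p>To put the clang figures above side by side, here is a small Python sketch that expresses each variant's link time and binary size relative to the upstream link. The numbers are copied from the clang summary above; converting sizes with 1G = 1024M is an assumption, and the names used are only illustrative.</p>

```python
# (link time in seconds, binary size in MB) for the clang link, from the summary above.
# Sizes given in G are converted assuming 1G = 1024M.
RESULTS = {
    "upstream": (6, 1.5 * 1024),
    "fragmented DWARF": (8, 1.4 * 1024),
    "DWARFLinker": (62, 836),
    "DWARFLinker no-odr": (128, 1.3 * 1024),
}


def relative_to(results, baseline):
    """Return {variant: (time ratio, size ratio)} relative to the baseline variant."""
    base_time, base_size = results[baseline]
    return {
        name: (time / base_time, size / base_size)
        for name, (time, size) in results.items()
    }


for name, (time_ratio, size_ratio) in relative_to(RESULTS, "upstream").items():
    print(f"{name:20s} time x{time_ratio:5.1f}  size {size_ratio:6.1%}")
```

<p>With these inputs the DWARFLinker variant takes roughly ten times as long to link as upstream, while producing a binary a little over half the size.</p>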
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I'm still trying to figure out the problems on my end to
try running your experiment on the game package I used in my
presentation, but have been interrupted by other unrelated
issues. I'll try to get back to this in the coming days.</div>
<div><br>
</div>
<div>James<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, 4 Nov 2020 at 11:54,
Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com" target="_blank">avl.lapshin@gmail.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Hi James,<br>
<br>
I ran experiments with the clang code base and will run
experiments with our local codebase later. <br>
Overall, both solutions ("Fragmented DWARF" and
"DWARFLinker without ODR type deduplication") appear to give
similar size savings for the final binary.
"DWARFLinker with ODR type deduplication" gives a bigger
size saving. "Fragmented DWARF" increases the size
of the original object files by 13% to 18% in these two cases.<br>
LLD with "Fragmented DWARF" links significantly faster
than with "DWARFLinker".<br>
<br>
The results for the "llvm-strings" and "clang"
binaries follow:<br>
<br>
1. llvm-strings:<br>
<br>
<tt> source object files size: 381M.</tt><tt><br>
</tt><tt> fragmented source object files size: 451M (18%
increase).</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> a. upstream version, </tt><tt><br>
</tt><tt> command line options: --gc-sections</tt><tt><br>
</tt><tt> binary size: 6.5M</tt><tt><br>
</tt><tt> compilation time: 0:00.13 sec</tt><tt><br>
</tt><tt> run-time memory: 111kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> b. "fragmented DWARF" version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--mark-live-pc=0.45</tt><tt><br>
</tt><tt> binary size: 3.7M</tt><tt><br>
</tt><tt> compilation time: 0:00.10 sec</tt><tt><br>
</tt><tt> run-time memory: 122kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> c. DWARFLinker version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--gc-debuginfo</tt><tt><br>
</tt><tt> binary size: 3.8M</tt><tt><br>
</tt><tt> compilation time: 0:00.33 sec</tt><tt><br>
</tt><tt> run-time memory: 141kb</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> d. DWARFLinker no-odr version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--gc-debuginfo --gc-debuginfo-no-odr</tt><tt><br>
</tt><tt> binary size: 4.3M</tt><tt><br>
</tt><tt> compilation time: 0:00.38 sec</tt><tt><br>
</tt><tt> run-time memory: 142kb</tt><br>
<br>
<br>
2. clang:<br>
<br>
<tt> source object files size: 6.5G.</tt><tt><br>
</tt><tt> fragmented source object files size: 7.3G (13%
increase).</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> a. upstream version, </tt><tt><br>
</tt><tt> command line options: --gc-sections</tt><tt><br>
</tt><tt> binary size: 1.5G</tt><tt><br>
</tt><tt> compilation time: 6 sec </tt><tt><br>
</tt><tt> run-time memory: 9.7G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> b. "fragmented DWARF" version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--mark-live-pc=0.43</tt><tt><br>
</tt><tt> binary size: 1.1G</tt><tt><br>
</tt><tt> compilation time: 9 sec</tt><tt><br>
</tt><tt> run-time memory: 11G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> c. DWARFLinker version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--gc-debuginfo</tt><tt><br>
</tt><tt> binary size: 836M</tt><tt><br>
</tt><tt> compilation time: 62 sec</tt><tt><br>
</tt><tt> run-time memory: 15G</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> d. DWARFLinker no-odr version, </tt><tt><br>
</tt><tt> command line options: --gc-sections
--gc-debuginfo --gc-debuginfo-no-odr</tt><tt><br>
</tt><tt> binary size: 1.3G</tt><tt><br>
</tt><tt> compilation time: 128 sec</tt><tt><br>
</tt><tt> run-time memory: 17G</tt><br>
<br>
Detailed size results:<br>
<br>
<tt>1. llvm-strings <br>
</tt></p>
<p><tt> a)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 41.1% 2.64Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 24.9% 1.60Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 12.6% 827Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 6.5% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 4.8% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 3.4% 223Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 2.0% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 1.7% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.2% 77.6Ki 0.0% 0 .debug_abbrev</tt><tt><br>
</tt><tt><br>
</tt><tt> b)</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 50.3% 1.85Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 43.6% 1.60Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 2.6% 98.2Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 2.1% 77.6Ki 0.0% 0 .debug_abbrev</tt><tt><br>
</tt><tt> 0.5% 17.5Ki 54.9% 17.4Ki .text</tt><tt><br>
</tt><tt> 0.3% 9.94Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 0.2% 6.27Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 0.1% 5.09Ki 15.9% 5.03Ki .eh_frame</tt><tt><br>
</tt><tt> 0.1% 3.28Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt><br>
</tt><tt> c)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 33.0% 1.25Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 29.2% 1.11Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 11.0% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 8.2% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 7.8% 304Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 3.4% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 2.8% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.7% 65.9Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.0% 38.4Ki 5.7% 38.4Ki .rodata</tt><tt><br>
</tt><tt><br>
</tt><tt> d)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 39.7% 1.68Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 26.3% 1.11Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 9.9% 428Ki 63.8% 428Ki .text</tt><tt><br>
</tt><tt> 7.3% 317Ki 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 7.0% 304Ki 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 3.1% 133Ki 19.8% 133Ki .eh_frame</tt><tt><br>
</tt><tt> 2.6% 110Ki 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.5% 65.9Ki 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>2. clang</tt></p>
<p><tt> a)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 58.3% 878Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 11.8% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 7.7% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 7.7% 115Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 6.0% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 2.4% 35.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.5% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 1.5% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 1.2% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt><br>
</tt><tt> b)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 71.5% 772Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 16.5% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 3.7% 40.2Mi 59.2% 40.2Mi .text</tt><tt><br>
</tt><tt> 2.4% 25.8Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 2.1% 23.0Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 1.0% 10.6Mi 15.6% 10.6Mi .dynstr</tt><tt><br>
</tt><tt> 0.7% 7.18Mi 10.6% 7.18Mi .eh_frame</tt><tt><br>
</tt><tt> 0.5% 5.60Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 0.4% 4.28Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 0.4% 4.04Mi 0.0% 0 .debug_abbrev</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt> c)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 35.1% 293Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 21.2% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 13.9% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 10.9% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 6.9% 57.4Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 2.8% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 2.8% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 2.1% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.5% 12.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 1.3% 10.6Mi 5.7% 10.6Mi .dynstr</tt><tt><br>
</tt><tt><br>
</tt><tt> d)</tt><tt><br>
</tt><tt><br>
</tt><tt> FILE SIZE VM SIZE </tt><tt><br>
</tt><tt> -------------- -------------- </tt><tt><br>
</tt><tt> 58.3% 758Mi 0.0% 0 .debug_info</tt><tt><br>
</tt><tt> 13.6% 177Mi 0.0% 0 .debug_str</tt><tt><br>
</tt><tt> 8.9% 115Mi 62.2% 115Mi .text</tt><tt><br>
</tt><tt> 7.0% 90.7Mi 0.0% 0 .strtab</tt><tt><br>
</tt><tt> 4.4% 57.4Mi 0.0% 0 .debug_line</tt><tt><br>
</tt><tt> 1.8% 23.3Mi 12.5% 23.3Mi .eh_frame</tt><tt><br>
</tt><tt> 1.8% 23.0Mi 12.4% 23.0Mi .rodata</tt><tt><br>
</tt><tt> 1.4% 17.9Mi 0.0% 0 .symtab</tt><tt><br>
</tt><tt> 1.0% 12.4Mi 0.0% 0 .debug_ranges</tt><tt><br>
</tt><tt> 0.8% 10.6Mi 5.7% 10.6Mi .dynstr</tt></p>
<p><tt>Thank you, Alexey.</tt><br>
</p>
<div>On 19.10.2020 11:50, James Henderson wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Great, thanks Alexey! I'll try to take a
look at this in the near future, and will report my
results back here. I imagine our clang results will
differ, purely because we probably used different
toolchains to build the input in the first place.<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, 15 Oct 2020 at
10:08, Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com" target="_blank">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 13.10.2020 10:20, James Henderson wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>The script included in the patch can be
used to convert an object containing normal
DWARF into an object using fragmented DWARF.
It does this by using llvm-dwarfdump to dump
the various sections, parsing the output to
identify where it should split (using the
offsets of the various entries), and then
writing new section headers accordingly - you
can see roughly what it's doing if you get a
chance to watch the talk recording. The
additional section headers are appended to the
end of the ELF section header table, whilst
the original DWARF is left in the same place
it was before (making use of the fact that
section headers don't have to appear in offset
order). The script also parses and fragments
the relocation sections targeting the DWARF
sections so that they match up with the
fragmented DWARF sections. This is clearly all
suboptimal - in practice the compiler should
be modified to do the fragmenting upfront, to
save having to parse a tool's stdout, but that
was just the simplest thing I could come up
with to quickly write the script. Full details
of the script usage are included in the patch
description, if you want to play around with
it.</div>
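<div>As a rough illustration of the "parse the dump to find split points" step described above, here is a Python sketch. It is not the script from the patch: the sample text is hand-written in the general shape of llvm-dwarfdump's .debug_info output, the function names are made up, and splitting at every DW_TAG_subprogram is a simplifying assumption.</div>

```python
import re

# Hand-written sample in the general shape of llvm-dwarfdump .debug_info output.
SAMPLE_DUMP = """\
0x0000000b: DW_TAG_compile_unit
              DW_AT_name ("a.cpp")
0x0000002a:   DW_TAG_subprogram
                DW_AT_name ("foo")
0x00000043:   DW_TAG_subprogram
                DW_AT_name ("bar")
0x0000005c:   NULL
"""

# An entry line starts with an offset, then whitespace, then a tag name or NULL.
ENTRY_RE = re.compile(r"^(0x[0-9a-fA-F]+):\s+(DW_TAG_\w+|NULL)\s*$")


def parse_entries(dump_text):
    """Return a list of (offset, tag) tuples for every entry line in the dump."""
    entries = []
    for line in dump_text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append((int(match.group(1), 16), match.group(2)))
    return entries


def subprogram_ranges(entries):
    """Return (start, end) offset pairs, one per DW_TAG_subprogram entry.

    Simplification: each range ends at the following entry line, which is only
    right when the subprogram has no child entries, as in the sample.
    """
    ranges = []
    for index, (offset, tag) in enumerate(entries[:-1]):
        if tag == "DW_TAG_subprogram":
            ranges.append((offset, entries[index + 1][0]))
    return ranges


entries = parse_entries(SAMPLE_DUMP)
for start, end in subprogram_ranges(entries):
    print(f"candidate fragment: 0x{start:08x}-0x{end:08x}")
```

<div>A real implementation would also need to handle nested entries and the matching relocation sections, as described above.</div>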
<div><br>
</div>
<div>If Alexey could point me at the latest
version of his patch, I'd be happy to run that
through either or both of the packages I used
to see what happens. Equally, I'd be happy if
Alexey is able to run my script to fragment
and measure the performance of a couple of
projects he's been working with. Based purely
on the two packages I've tried this with, I
can tell already that the results can vary
wildly. My expectation is that Alexey's
approach will be slower (at least in its
current form, but probably more generally),
but produce smaller output, but to what scale
I have no idea.<br>
</div>
</div>
</blockquote>
<p>James, I updated the patch - <a href="https://reviews.llvm.org/D74169" target="_blank">https://reviews.llvm.org/D74169</a>.</p>
<p>To make it work, it is necessary to build the
example with -ffunction-sections and pass the
following options to the linker:</p>
<p>--gc-sections --gc-debuginfo
--gc-debuginfo-no-odr</p>
<p>For the clang binary I got the following results:</p>
<p>1. --gc-sections = binary size 1.5G, Debug Info
size(*) 1.2G</p>
<p>2. --gc-sections --gc-debuginfo = binary size
840M, 8x performance decrease, Debug Info size
542M<br>
</p>
<p>3. --gc-sections --gc-debuginfo
--gc-debuginfo-no-odr = binary size 1.3G, 16x
performance decrease, Debug Info size 1G<br>
</p>
<p>(*)
.debug_info+.debug_str+.debug_line+.debug_ranges+.debug_loc<br>
</p>
<p><br>
</p>
<p>I added the --gc-debuginfo-no-odr option so that the
size reduction can be compared fairly.
Without that option, D74169 performs type
deduplication, and it would then not be correct to
compare the resulting size with the "Fragmented DWARF"
solution, which does not perform type deduplication.<br>
</p>
<p>Also, I will look at your <font face="monospace"><font face="arial,sans-serif"><a href="https://reviews.llvm.org/D89229" target="_blank">D89229</a>
and share the results some time later.<br>
</font></font></p>
<p>Thank you, Alexey.<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I think linkers parse .eh_frame partly
because they have no other choice. That being
said, I think it's format is not too complex,
so similarly the parser isn't too complex. You
can see LLD's ELF implementation in
ELF/EhFrame.cpp, how it is used in
ELF/InputSection.cpp (see the bits to do with
EhInputSection) and EhFrameSection in
ELF/SyntheticSections.h (plus various usages
of these two throughout the LLD code). I think
the key to any structural changes in the DWARF
format to make them more amenable to link-time
parsing is being able to read a minimal amount
without needing to parse the payload (e.g. a
length field, some sort of type, and then
using the relocations to associate it
accordingly).</div>
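<div>To illustrate what "reading a minimal amount without needing to parse the payload" can look like, here is a Python sketch that walks .eh_frame-style records using only the length and ID fields. It is not LLD's implementation (that lives in the files named above); the function names and the sample bytes are made up for the example, and little-endian data is assumed.</div>

```python
def read_uint(data, offset, size):
    """Read a little-endian unsigned integer of the given size."""
    return int.from_bytes(data[offset:offset + size], "little")


def split_eh_frame(data):
    """Yield (offset, kind, size) per record, reading only length and ID fields.

    Assumes the usual .eh_frame layout: a 4-byte length (0xffffffff means an
    8-byte extended length follows), then a 4-byte ID that is 0 for a CIE and
    non-zero for an FDE. A zero length terminates the section.
    """
    offset = 0
    while len(data) - offset >= 4:
        length = read_uint(data, offset, 4)
        header = 4
        if length == 0:
            break  # terminator record
        if length == 0xFFFFFFFF:
            length = read_uint(data, offset + 4, 8)
            header = 12
        record_id = read_uint(data, offset + header, 4)
        kind = "CIE" if record_id == 0 else "FDE"
        size = header + length
        yield offset, kind, size
        offset += size  # skip the payload without looking inside it


def u32(value):
    """Encode a value as 4 little-endian bytes (helper for the sample)."""
    return value.to_bytes(4, "little")


SAMPLE = (
    u32(8) + u32(0) + bytes(4)        # CIE: length 8, ID 0, 4 payload bytes
    + u32(12) + u32(16) + bytes(8)    # FDE: length 12, CIE pointer 16, 8 payload bytes
    + u32(0)                          # terminator
)

for record in split_eh_frame(SAMPLE):
    print(record)
```

<div>The point of the sketch is that the loop can step from record to record using the length alone; associating each record with the code it describes would then be done via the relocations, as suggested above.</div>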
<div><br>
</div>
<div>James<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, 12 Oct
2020 at 20:48, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Awesome! Sorry I missed the
lightning talk, but really interested to see
this sort of thing (though it's not
directly/immediately applicable to the use
case I work with, Split DWARF; something
similar could be used there with further
work)<br>
<br>
Though it looks like the patch has mostly
linker changes - where/how do you generate
the fragmented DWARF to begin with? Via the
Python script? Run over assembly? I'd be
surprised if it was achievable that way -
curious to know more.<br>
<br>
Got a rough sense/are you able to run
apples-to-apples comparisons with Alexey's
linker-based patches to compare linker
time/memory overhead versus resulting output
size gains?<br>
<br>
(&amp; yeah, I'm a bit curious about how the
linkers do eh_frame rewriting, if the format
is especially amenable to a lightweight
parsing/rewriting and how we could make the
DWARF more amenable to that too)</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon,
Oct 12, 2020 at 6:41 AM James Henderson
&lt;<a href="mailto:jh7370.2008@my.bristol.ac.uk" target="_blank">jh7370.2008@my.bristol.ac.uk</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>Hi all,</div>
<div><br>
</div>
<div>At the recent LLVM developers'
meeting, I presented a lightning talk
on an approach to reduce the amount of
dead debug data left in an executable
following operations such as
--gc-sections and duplicate COMDAT
removal. In that presentation, I
showed some figures based on
linking a game that had been built by
our downstream clang port and
fragmented using the described
approach. Since recording the
presentation, I ran the same
experiment on a clang package (this
time built with a GCC version). The
comparable figures are below:</div>
<div><br>
</div>
<div>Link-time speed (s):</div>
<div><span style="font-family:monospace">+--------------------+-------+---------------+------+------+------+------+------+</span><br>
</div>
<div><font face="monospace">| Package
variant | No GC | GC 1 (normal) |
GC 2 | GC 3 | GC 4 | GC 5 | GC 6 |</font></div>
<div><font face="monospace">+--------------------+-------+---------------+------+------+------+------+------+<br>
</font></div>
<div><font face="monospace">| Game
(plain) | 4.5 |
4.9 | 4.2 | 3.6 | 3.4 |
3.3 | 3.2 |<br>
</font></div>
<div><font face="monospace">| Game
(fragmented) | 11.1 | 11.8
| 9.7 | 8.6 | 7.9 | 7.7 |
7.5 |<br>
</font></div>
<div><font face="monospace">| Clang
(plain) | 13.9 | 17.9
| 17.0 | 16.7 | 16.3 | 16.2 | 16.1 |<br>
</font></div>
<div><font face="monospace">| Clang
(fragmented) | 18.6 | 22.8
| 21.6 | 21.1 | 20.8 | 20.5 | 20.2 |</font></div>
<div><font face="monospace">+--------------------+-------+---------------+------+------+------+------+------+</font></div>
<div><font face="monospace"><br>
</font></div>
<div><font face="monospace"><font face="arial,sans-serif">Output
size - Game package (MB):</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+</span><br>
</font></font></div>
<div><span style="font-family:monospace">|
Category | No GC | GC 1 |
GC 2 | GC 3 | GC 4 | GC 5 | GC 6 |<br>
</span></div>
<div><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (total) | 1149 | 1121 |
1017 | 965 | 938 | 930 | 928 |<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (DWARF*) | 845 | 845
| 845 | 845 | 845 | 845 | 845 |<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (other) | 304 | 276
| 172 | 120 | 93 | 85 | 82 |<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (total) | 1044 | 940
| 556 | 373 | 287 | 263 | 255 |<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (DWARF*) | 740 | 664
| 384 | 253 | 194 | 178 | 173 |<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (other) | 304 | 276
| 172 | 120 | 93 | 85 | 82 |<br>
</span></div>
<div><font face="monospace">+---------------------+-------+------+------+------+------+------+------+
</font>
<div><font face="monospace"><br>
</font></div>
<div><font face="monospace"><font face="arial,sans-serif">Output
size - Clang (MB):</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+</span><br>
</font></font></div>
<div><span style="font-family:monospace">|
Category | No GC | GC 1
| GC 2 | GC 3 | GC 4 | GC 5 | GC 6
|<br>
</span></div>
<div><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (total) | 2596 | 2546
| 2406 | 2332 | 2293 | 2273 | 2251
|<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (DWARF*) | 1979 | 1979
| 1979 | 1979 | 1979 | 1979 | 1979
|<br>
</span></div>
<div><span style="font-family:monospace">|
Plain (other) | 616 | 567
| 426 | 353 | 314 | 294 | 272
|<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (total) | 2397 | 2346
| 2164 | 2069 | 2017 | 1990 | 1963
|<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (DWARF*) | 1780 | 1780
| 1738 | 1716 | 1703 | 1696 | 1691
|<br>
</span></div>
<div><span style="font-family:monospace">|
Fragmented (other) | 616 | 567
| 426 | 353 | 314 | 294 | 272
|<br>
</span></div>
<div><font face="monospace">+---------------------+-------+------+------+------+------+------+------+</font></div>
</div>
<div><font face="monospace"><br>
</font></div>
<div><font face="monospace">*DWARF size
== total size of .debug_info +
.debug_line + .debug_ranges +
.debug_aranges + .debug_loc<br>
</font></div>
<div><font face="monospace"><br>
</font></div>
<div><font face="monospace"><font face="arial,sans-serif">Additionally,
I have posted <a href="https://reviews.llvm.org/D89229" target="_blank">https://reviews.llvm.org/D89229</a>
which provides the python script
and linker patches used to
reproduce the above results on my
machine. The GC 1/2/3/4/5/6
correspond to the linker option
added in that patch --mark-live-pc
with values 1/0.8/0.6/0.4/0.2/0
respectively.<br>
</font></font></div>
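<div>To make the size tables easier to compare, here is a small worked calculation of the DWARF size saving of the fragmented output relative to the plain output at each GC level. The figures and the --mark-live-pc values are copied from the text above; the names saving_percent and savings_table are only illustrative.</div>

```python
# DWARF sizes in MB, copied from the "Output size" tables above.
LEVELS = ["No GC", "GC 1", "GC 2", "GC 3", "GC 4", "GC 5", "GC 6"]

# --mark-live-pc value used for each GC level, as listed above.
MARK_LIVE_PC = {"GC 1": 1.0, "GC 2": 0.8, "GC 3": 0.6,
                "GC 4": 0.4, "GC 5": 0.2, "GC 6": 0.0}

DWARF_MB = {
    "Game": {
        "plain": [845] * 7,
        "fragmented": [740, 664, 384, 253, 194, 178, 173],
    },
    "Clang": {
        "plain": [1979] * 7,
        "fragmented": [1780, 1780, 1738, 1716, 1703, 1696, 1691],
    },
}


def saving_percent(plain_mb, fragmented_mb):
    """Percentage by which the fragmented DWARF is smaller than the plain DWARF."""
    return 100.0 * (plain_mb - fragmented_mb) / plain_mb


def savings_table(package):
    """Map each GC level to the DWARF saving (in percent) for one package."""
    sizes = DWARF_MB[package]
    return {
        level: saving_percent(plain, fragmented)
        for level, plain, fragmented in zip(LEVELS, sizes["plain"], sizes["fragmented"])
    }


if __name__ == "__main__":
    for package in DWARF_MB:
        for level, pct in savings_table(package).items():
            flag = MARK_LIVE_PC.get(level, "n/a")
            print(f"{package:5} {level:5} (--mark-live-pc={flag}): {pct:.1f}% smaller DWARF")
```

<div>For the Game package this comes out at roughly 80% at GC 6, against roughly 15% for Clang, which is the "depends on the nature of the input package" effect in numbers.</div>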
<div><font face="monospace"><font face="arial,sans-serif"><br>
</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif">During the
conference, the question was asked
what the memory usage and input
size impact was. I've summarised
these below:</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif"><br>
</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif">Input file
size total (GB):</font></font></div>
<div><font face="monospace"><font face="arial,sans-serif"> <span style="font-family:monospace">+--------------------+------------+
</span></font></font></div>
<div><span style="font-family:monospace">
| Package variant | Total Size |
<br>
</span></div>
<div><span style="font-family:monospace">
+--------------------+------------+<br>
</span></div>
<div><span style="font-family:monospace">
| Game (plain) | 2.9 |
<br>
</span></div>
<div><span style="font-family:monospace">
| Game (fragmented) | 4.2 |<br>
</span></div>
<div><span style="font-family:monospace">
| Clang (plain) | 10.9 |<br>
</span></div>
<div><span style="font-family:monospace">
| Clang (fragmented) | 12.3 |<br>
</span></div>
<div><span style="font-family:monospace">
+--------------------+------------+</span></div>
<div><span style="font-family:monospace"><br>
</span></div>
<div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Peak Working Set Memory usage (GB):</span><br>
</span></div>
<div><span style="font-family:monospace">
</span>
<div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+--------------------+-------+------+
</span></font></font></div>
<div><span style="font-family:monospace"> |
Package variant | No GC | GC 1
|<br>
</span></div>
<div><span style="font-family:monospace">
+--------------------+-------+------+<br>
</span></div>
<div><span style="font-family:monospace"> |
Game (plain) | 4.3 | 4.7
|<br>
</span></div>
<div><span style="font-family:monospace"> |
Game (fragmented) | 8.9 | 8.6
|<br>
</span></div>
<div><span style="font-family:monospace"> |
Clang (plain) | 15.7 | 15.6
|<br>
</span></div>
<div><span style="font-family:monospace"> |
Clang (fragmented) | 19.4 | 19.2
|<br>
</span></div>
<div><span style="font-family:monospace">
+--------------------+-------+------+</span></div>
<div><span style="font-family:monospace"><br>
</span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">I'm keen
to hear what people's feedback
is, and also interested to see
what results others might see by
running this experiment on other
input packages. Also, if anybody
has any alternative ideas that
meet the goals listed below, I'd
love to hear them!<br>
</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif"><br>
</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">To
reiterate some key goals of
fragmented DWARF, similar to
what I said in the presentation:</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">1)
Devise a scheme that gives
significant size savings without
being too costly. It's clear
from just the two packages I've
tried this on that there is a
fairly hefty link time
performance cost, although the
exact cost depends on the nature
of the input package. On the
other hand, depending on the
nature of the input package,
there can also be some big
gains.<br>
</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">2)
Devise a scheme that doesn't
require any linker knowledge of
DWARF. The current approach
doesn't quite achieve this
properly due to the slight
misuse of SHF_LINK_ORDER, but I
expect that a pivot to using
non-COMDAT group sections should
solve this problem.</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">3)
Provide some kind of halfway
house between simply writing
tombstone values into dead DWARF
and fully parsing the DWARF to
reoptimise it/discard the dead
bits.<br>
</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif"><br>
</font></span></div>
<div><span style="font-family:monospace"><font face="arial,sans-serif">I'm
hopeful that changes could be
made to the linker to improve
the link-time cost. There seems
to be a significant amount of
the link time spent creating the
input sections. An alternative
would be to devise a scheme that
would avoid the literal
splitting into section headers,
in favour of some sort of list
of split-points that the linker
uses to split things up (a bit
like it already does for
.eh_frame or mergeable
sections).</font><br>
</span></div>
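<div><font face="arial,sans-serif">As a purely illustrative sketch of the split-point idea (none of this exists in the posted patch, and the names split_section and keep_live are hypothetical), a section could be left as a single blob and cut at a list of offsets, with each piece then kept or discarded independently:</font></div>

```python
def split_section(data, split_points):
    """Cut `data` at the given offsets and return (offset, chunk) pieces.

    Offsets outside the section, or at its very start, are ignored.
    """
    inner = sorted({p for p in split_points if p in range(1, len(data))})
    bounds = [0] + inner + [len(data)]
    return [(start, data[start:end]) for start, end in zip(bounds, bounds[1:])]


def keep_live(pieces, live_offsets):
    """Concatenate only the pieces whose start offset is marked live."""
    return b"".join(chunk for offset, chunk in pieces if offset in live_offsets)


if __name__ == "__main__":
    section = b"aaaabbbbcc"
    pieces = split_section(section, [4, 8])
    print(pieces)
    # Pretend the piece starting at offset 4 refers to discarded code.
    print(keep_live(pieces, {0, 8}))
```

<div><font face="arial,sans-serif">How liveness would be decided for each piece (for example, from the relocations that fall inside it) is left out here.</font></div>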
<span style="font-family:monospace"> </span></div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>
</blockquote></div>