<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Hi James,<br>
      <br>
      I ran experiments on the clang code base and will run them on
      our local codebase later. <br>
      Overall, the two comparable solutions ("Fragmented DWARF" and "DWARFLinker
      without ODR type deduplication") give broadly similar size savings
      for the final binary. "DWARFLinker with ODR type
      deduplication" gives a larger size saving. "Fragmented DWARF"
      increases the size of the original object files by 13-18% in the two cases below.<br>
      LLD with "fragmented DWARF" runs significantly faster than with
      "DWARFLinker".<br>
      <br>
      The results for the "llvm-strings" and "clang" binaries follow:<br>
      <br>
      1. llvm-strings:<br>
      <br>
      <tt>   source object files size: 381M.</tt><tt><br>
      </tt><tt>   fragmented source object files size: 451M(18%
        increase).</tt><tt><br>
      </tt><tt>   </tt><tt><br>
      </tt><tt>   a. upstream version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections</tt><tt><br>
      </tt><tt>      binary size: 6,5M</tt><tt><br>
      </tt><tt>      compilation time: 0:00.13 sec</tt><tt><br>
      </tt><tt>      run-time memory: 111kb</tt><tt><br>
      </tt><tt>   </tt><tt><br>
      </tt><tt>   b. "fragmented DWARF" version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections
        --mark-live-pc=0.45</tt><tt><br>
      </tt><tt>      binary size: 3,7M</tt><tt><br>
      </tt><tt>      compilation time: 0:00.10 sec</tt><tt><br>
      </tt><tt>      run-time memory: 122kb</tt><tt><br>
      </tt><tt>      </tt><tt><br>
      </tt><tt>   c. DWARFLinker version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections --gc-debuginfo</tt><tt><br>
      </tt><tt>      binary size: 3,8M</tt><tt><br>
      </tt><tt>      compilation time: 0:00.33 sec</tt><tt><br>
      </tt><tt>      run-time memory: 141kb</tt><tt><br>
      </tt><tt>      </tt><tt><br>
      </tt><tt>   d. DWARFLinker no-odr version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections --gc-debuginfo
        --gc-debuginfo-no-odr</tt><tt><br>
      </tt><tt>      binary size: 4,3M</tt><tt><br>
      </tt><tt>      compilation time: 0:00.38 sec</tt><tt><br>
      </tt><tt>      run-time memory: 142kb</tt><br>
         <br>
      <br>
      2. clang:<br>
      <br>
      <tt>   source object files size: 6,5G.</tt><tt><br>
      </tt><tt>   fragmented source object files size: 7,3G(13%
        increase).</tt><tt><br>
      </tt><tt>   </tt><tt><br>
      </tt><tt>   a. upstream version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections</tt><tt><br>
      </tt><tt>      binary size: 1,5G</tt><tt><br>
      </tt><tt>      compilation time: 6 sec </tt><tt><br>
      </tt><tt>      run-time memory: 9.7G</tt><tt><br>
      </tt><tt>   </tt><tt><br>
      </tt><tt>   b. "fragmented DWARF" version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections
        --mark-live-pc=0.43</tt><tt><br>
      </tt><tt>      binary size: 1,1G</tt><tt><br>
      </tt><tt>      compilation time: 9 sec</tt><tt><br>
      </tt><tt>      run-time memory: 11G</tt><tt><br>
      </tt><tt>      </tt><tt><br>
      </tt><tt>   c. DWARFLinker version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections --gc-debuginfo</tt><tt><br>
      </tt><tt>      binary size: 836M</tt><tt><br>
      </tt><tt>      compilation time: 62 sec</tt><tt><br>
      </tt><tt>      run-time memory: 15G</tt><tt><br>
      </tt><tt>      </tt><tt><br>
      </tt><tt>   d. DWARFLinker no-odr version, </tt><tt><br>
      </tt><tt>      command line options: --gc-sections --gc-debuginfo
        --gc-debuginfo-no-odr</tt><tt><br>
      </tt><tt>      binary size: 1,3G</tt><tt><br>
      </tt><tt>      compilation time: 128 sec</tt><tt><br>
      </tt><tt>      run-time memory: 17G</tt><br>
            <br>
      Detailed size results:<br>
      <br>
      <tt>1. llvm-strings <br>
      </tt></p>
    <p><tt>   a)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  41.1%  2.64Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  24.9%  1.60Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>  12.6%   827Ki   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   6.5%   428Ki  63.8%   428Ki    .text</tt><tt><br>
      </tt><tt>   4.8%   317Ki   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   3.4%   223Ki   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   2.0%   133Ki  19.8%   133Ki    .eh_frame</tt><tt><br>
      </tt><tt>   1.7%   110Ki   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   1.2%  77.6Ki   0.0%       0    .debug_abbrev</tt><tt><br>
      </tt><tt><br>
      </tt><tt>   b)</tt><tt><br>
      </tt><tt>   </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  50.3%  1.85Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  43.6%  1.60Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>   2.6%  98.2Ki   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   2.1%  77.6Ki   0.0%       0    .debug_abbrev</tt><tt><br>
      </tt><tt>   0.5%  17.5Ki  54.9%  17.4Ki    .text</tt><tt><br>
      </tt><tt>   0.3%  9.94Ki   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   0.2%  6.27Ki   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   0.1%  5.09Ki  15.9%  5.03Ki    .eh_frame</tt><tt><br>
      </tt><tt>   0.1%  3.28Ki   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt><br>
      </tt><tt>   c)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  33.0%  1.25Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  29.2%  1.11Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>  11.0%   428Ki  63.8%   428Ki    .text</tt><tt><br>
      </tt><tt>   8.2%   317Ki   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   7.8%   304Ki   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   3.4%   133Ki  19.8%   133Ki    .eh_frame</tt><tt><br>
      </tt><tt>   2.8%   110Ki   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   1.7%  65.9Ki   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   1.0%  38.4Ki   5.7%  38.4Ki    .rodata</tt><tt><br>
      </tt><tt><br>
      </tt><tt>   d)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>       FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  39.7%  1.68Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  26.3%  1.11Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>   9.9%   428Ki  63.8%   428Ki    .text</tt><tt><br>
      </tt><tt>   7.3%   317Ki   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   7.0%   304Ki   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   3.1%   133Ki  19.8%   133Ki    .eh_frame</tt><tt><br>
      </tt><tt>   2.6%   110Ki   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   1.5%  65.9Ki   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt><br>
      </tt><tt><br>
      </tt><tt>2. clang</tt></p>
    <p><tt>   a)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  58.3%   878Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  11.8%   177Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>   7.7%   115Mi  62.2%   115Mi    .text</tt><tt><br>
      </tt><tt>   7.7%   115Mi   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   6.0%  90.7Mi   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   2.4%  35.4Mi   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   1.5%  23.3Mi  12.5%  23.3Mi    .eh_frame</tt><tt><br>
      </tt><tt>   1.5%  23.0Mi  12.4%  23.0Mi    .rodata</tt><tt><br>
      </tt><tt>   1.2%  17.9Mi   0.0%       0    .symtab</tt><tt><br>
      </tt><tt><br>
      </tt><tt>   b)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  71.5%   772Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  16.5%   177Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>   3.7%  40.2Mi  59.2%  40.2Mi    .text</tt><tt><br>
      </tt><tt>   2.4%  25.8Mi   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   2.1%  23.0Mi   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   1.0%  10.6Mi  15.6%  10.6Mi    .dynstr</tt><tt><br>
      </tt><tt>   0.7%  7.18Mi  10.6%  7.18Mi    .eh_frame</tt><tt><br>
      </tt><tt>   0.5%  5.60Mi   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   0.4%  4.28Mi   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   0.4%  4.04Mi   0.0%       0    .debug_abbrev</tt><tt><br>
      </tt><tt><br>
      </tt><tt><br>
      </tt><tt>   c)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  35.1%   293Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  21.2%   177Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>  13.9%   115Mi  62.2%   115Mi    .text</tt><tt><br>
      </tt><tt>  10.9%  90.7Mi   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   6.9%  57.4Mi   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   2.8%  23.3Mi  12.5%  23.3Mi    .eh_frame</tt><tt><br>
      </tt><tt>   2.8%  23.0Mi  12.4%  23.0Mi    .rodata</tt><tt><br>
      </tt><tt>   2.1%  17.9Mi   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   1.5%  12.4Mi   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   1.3%  10.6Mi   5.7%  10.6Mi    .dynstr</tt><tt><br>
      </tt><tt><br>
      </tt><tt>   d)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>    FILE SIZE        VM SIZE    </tt><tt><br>
      </tt><tt> --------------  -------------- </tt><tt><br>
      </tt><tt>  58.3%   758Mi   0.0%       0    .debug_info</tt><tt><br>
      </tt><tt>  13.6%   177Mi   0.0%       0    .debug_str</tt><tt><br>
      </tt><tt>   8.9%   115Mi  62.2%   115Mi    .text</tt><tt><br>
      </tt><tt>   7.0%  90.7Mi   0.0%       0    .strtab</tt><tt><br>
      </tt><tt>   4.4%  57.4Mi   0.0%       0    .debug_line</tt><tt><br>
      </tt><tt>   1.8%  23.3Mi  12.5%  23.3Mi    .eh_frame</tt><tt><br>
      </tt><tt>   1.8%  23.0Mi  12.4%  23.0Mi    .rodata</tt><tt><br>
      </tt><tt>   1.4%  17.9Mi   0.0%       0    .symtab</tt><tt><br>
      </tt><tt>   1.0%  12.4Mi   0.0%       0    .debug_ranges</tt><tt><br>
      </tt><tt>   0.8%  10.6Mi   5.7%  10.6Mi    .dynstr</tt></p>
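    <p>To make the size comparison easier to read, here is a minimal
      sketch that turns the binary sizes quoted above into a reduction
      relative to the upstream link. The dictionary layout and the
      savings_percent helper are illustrative names only, and the clang
      figures quoted in G are converted assuming 1G = 1000M, so the
      results are only as precise as the rounded inputs.</p>
<pre>
```python
# Binary sizes as quoted in this message, in MB. Values given in G are
# converted assuming 1G = 1000M; the quoted figures are already rounded.
BINARY_SIZES_MB = {
    "llvm-strings": {
        "upstream": 6.5,
        "fragmented DWARF": 3.7,
        "DWARFLinker": 3.8,
        "DWARFLinker no-odr": 4.3,
    },
    "clang": {
        "upstream": 1500.0,
        "fragmented DWARF": 1100.0,
        "DWARFLinker": 836.0,
        "DWARFLinker no-odr": 1300.0,
    },
}


def savings_percent(baseline_mb: float, size_mb: float) -> float:
    """Return the size reduction relative to the baseline, in percent."""
    return 100.0 * (baseline_mb - size_mb) / baseline_mb


for binary, sizes in BINARY_SIZES_MB.items():
    baseline = sizes["upstream"]
    for variant, size in sizes.items():
        if variant == "upstream":
            continue
        saving = savings_percent(baseline, size)
        print(f"{binary:13} {variant:20} {saving:5.1f}% smaller")
```
</pre>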
    <p><tt>Thank you, Alexey.</tt><br>
    </p>
    <div class="moz-cite-prefix">On 19.10.2020 11:50, James Henderson
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CABqSp3nVhK3C030SzZVVfEhH1Rheht8qgchrXpsuZ-+9cTSPHg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">Great, thanks Alexey! I'll try to take a look at
        this in the near future, and will report my results back here. I
        imagine our clang results will differ, purely because we
        probably used different toolchains to build the input in the
        first place.<br>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, 15 Oct 2020 at 10:08,
          Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com"
            moz-do-not-send="true">avl.lapshin@gmail.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <p><br>
            </p>
            <div>On 13.10.2020 10:20, James Henderson wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>The script included in the patch can be used to
                  convert an object containing normal DWARF into an
                  object using fragmented DWARF. It does this by using
                  llvm-dwarfdump to dump the various sections, parses
                  the output to identify where it should split (using
                  the offsets of the various entries), and then writes
                  new section headers accordingly - you can see roughly
                  what it's doing if you get a chance to watch the talk
                  recording. The additional section headers are appended
                  to the end of the ELF section header table, whilst the
                  original DWARF is left in the same place it was before
                  (making use of the fact that section headers don't
                  have to appear in offset order). The script also
                  parses and fragments the relocation sections targeting
                  the DWARF sections so that they match up with the
                  fragmented DWARF sections. This is clearly all
                  suboptimal - in practice the compiler should be
                  modified to do the fragmenting upfront, to save having
                  to parse a tool's stdout, but that was just the
                  simplest thing I could come up with to quickly write
                  the script. Full details of the script usage are
                  included in the patch description, if you want to play
                  around with it.</div>
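                <div><br>
                </div>
                <div>As a rough illustration of the "parse the dump to
                  find split offsets" step described above, the
                  following sketch reads text in the style of
                  llvm-dwarfdump --debug-info output and derives one
                  byte range per top-level subprogram. It is not the
                  script from the patch: the sample dump, the Die tuple,
                  and the choice to split on top-level DW_TAG_subprogram
                  entries are all assumptions made for the example.</div>
<pre>
```python
import re
from typing import List, NamedTuple, Tuple

# Illustrative excerpt in the style of `llvm-dwarfdump --debug-info` output.
SAMPLE_DUMP = """\
0x0000000b: DW_TAG_compile_unit
              DW_AT_name  ("a.cpp")

0x0000002a:   DW_TAG_subprogram
                DW_AT_name  ("used")

0x00000043:     DW_TAG_formal_parameter

0x0000004e:     NULL

0x0000004f:   DW_TAG_subprogram
                DW_AT_name  ("unused")

0x00000068:   NULL
"""

# A DIE header line: offset, colon, indentation, then a tag or NULL.
DIE_LINE = re.compile(r"^(0x[0-9a-f]+):( +)(DW_TAG_\w+|NULL)$")


class Die(NamedTuple):
    offset: int
    depth: int
    tag: str


def parse_dies(dump: str) -> List[Die]:
    """Collect offset, depth and tag for every DIE header line."""
    dies = []
    for line in dump.splitlines():
        match = DIE_LINE.match(line)
        if match:
            offset, indent, tag = match.groups()
            # One space follows the colon at depth 0, plus two per level.
            dies.append(Die(int(offset, 16), (len(indent) - 1) // 2, tag))
    return dies


def subprogram_fragments(dies: List[Die]) -> List[Tuple[int, int]]:
    """Return (start, end) offsets, end exclusive, per top-level subprogram."""
    fragments = []
    for index, die in enumerate(dies):
        if die.tag != "DW_TAG_subprogram" or die.depth != 1:
            continue
        # The subtree ends at the next entry that is not nested deeper.
        end = next(d.offset for d in dies[index + 1:] if d.depth <= die.depth)
        fragments.append((die.offset, end))
    return fragments


if __name__ == "__main__":
    for start, end in subprogram_fragments(parse_dies(SAMPLE_DUMP)):
        print(f"fragment {start:#06x}..{end:#06x}")
```
</pre>
                <div>Each range could then back one extra section
                  header, leaving the original bytes where they are.</div>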
                <div><br>
                </div>
                <div>If Alexey could point me at the latest version of
                  his patch, I'd be happy to run that through either or
                  both of the packages I used to see what happens.
                  Equally, I'd be happy if Alexey is able to run my
                  script to fragment and measure the performance of a
                  couple of projects he's been working with. Based
                  purely on the two packages I've tried this with, I can
                  tell already that the results can vary wildly. My
                  expectation is that Alexey's approach will be slower
                  (at least in its current form, but probably more
                  generally), but produce smaller output, but to what
                  scale I have no idea.<br>
                </div>
              </div>
            </blockquote>
            <p>James, I updated the patch - <a
                href="https://reviews.llvm.org/D74169" target="_blank"
                moz-do-not-send="true">https://reviews.llvm.org/D74169</a>.</p>
            <p>To make it work, build the example with
              -ffunction-sections and pass the following options to the
              linker:</p>
            <p>--gc-sections --gc-debuginfo --gc-debuginfo-no-odr</p>
            <p>For the clang binary I got the following results:</p>
            <p>1. --gc-sections = binary size 1.5G, Debug Info
              size(*) 1.2G</p>
            <p>2. --gc-sections --gc-debuginfo = binary size 840M, 8x
              performance decrease, Debug Info size 542M<br>
            </p>
            <p>3. --gc-sections --gc-debuginfo --gc-debuginfo-no-odr =
              binary size 1.3G, 16x performance decrease, Debug Info
              size 1G<br>
            </p>
            <p>(*)
              .debug_info+.debug_str+.debug_line+.debug_ranges+.debug_loc<br>
            </p>
            <p><br>
            </p>
            <p>I added the --gc-debuginfo-no-odr option so that the size
              reduction can be compared fairly. Without that option,
              D74169 deduplicates types, so comparing its output
              size with the "Fragmented DWARF" solution,
              which does not deduplicate types, would not be like for like.<br>
            </p>
            <p>Also, I will look at your <font face="monospace"><font
                  face="arial,sans-serif"><a
                    href="https://reviews.llvm.org/D89229"
                    target="_blank" moz-do-not-send="true">D89229</a>
                  and share the results later.<br>
                </font></font></p>
            <p>Thank you, Alexey.<br>
            </p>
            <blockquote type="cite">
              <div dir="ltr">
                <div><br>
                </div>
                <div>I think linkers parse .eh_frame partly because they
                  have no other choice. That being said, I think its
                  format is not too complex, so similarly the parser
                  isn't too complex. You can see LLD's ELF
                  implementation in ELF/EhFrame.cpp, how it is used in
                  ELF/InputSection.cpp (see the bits to do with
                  EhInputSection) and EhFrameSection in
                  ELF/SyntheticSections.h (plus various usages of these
                  two throughout the LLD code). I think the key to any
                  structural changes in the DWARF format to make them
                  more amenable to link-time parsing is being able to
                  read a minimal amount without needing to parse the
                  payload (e.g. a length field, some sort of type, and
                  then using the relocations to associate it
                  accordingly).</div>
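                <div><br>
                </div>
                <div>To show what "read a minimal amount without needing
                  to parse the payload" can look like for .eh_frame,
                  here is a small sketch that splits a raw section into
                  records using only the length and ID fields. It is
                  not LLD's code: split_eh_frame is a hypothetical
                  helper, it assumes little-endian input, and it leaves
                  out the relocation lookup that a linker would use to
                  tie each FDE to its function.</div>
<pre>
```python
from typing import List, Tuple


def split_eh_frame(data: bytes) -> List[Tuple[int, int, str]]:
    """Split raw .eh_frame bytes into (offset, size, kind) records.

    Only the length and ID fields are read; the payload is never decoded.
    Assumes little-endian data.
    """
    records = []
    pos = 0
    while pos + 4 <= len(data):
        length = int.from_bytes(data[pos:pos + 4], "little")
        if length == 0:
            break  # A zero length marks the terminator.
        header = 4
        if length == 0xFFFFFFFF:
            # Extended form: the real length is in the next 8 bytes.
            length = int.from_bytes(data[pos + 4:pos + 12], "little")
            header = 12
        id_start = pos + header
        record_id = int.from_bytes(data[id_start:id_start + 4], "little")
        # An ID of zero identifies a CIE; anything else is an FDE whose
        # ID field points back at its CIE.
        kind = "CIE" if record_id == 0 else "FDE"
        records.append((pos, header + length, kind))
        pos += header + length
    return records


# Hand-built example: one CIE, one FDE, then a zero terminator.
cie = (8).to_bytes(4, "little") + (0).to_bytes(4, "little") + bytes(4)
fde = (12).to_bytes(4, "little") + (16).to_bytes(4, "little") + bytes(8)
example = cie + fde + bytes(4)

for offset, size, kind in split_eh_frame(example):
    print(f"{kind} at offset {offset}, {size} bytes")
```
</pre>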
                <div><br>
                </div>
                <div>James<br>
                </div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Mon, 12 Oct 2020 at
                  20:48, David Blaikie <<a
                    href="mailto:dblaikie@gmail.com" target="_blank"
                    moz-do-not-send="true">dblaikie@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px 0.8ex;border-left:1px solid
                  rgb(204,204,204);padding-left:1ex">
                  <div dir="ltr">Awesome! Sorry I missed the lightning
                    talk, but really interested to see this sort of
                    thing (though it's not directly/immediately
                    applicable to the use case I work with - Split
                    DWARF, something similar could be used there with
                    further work)<br>
                    <br>
                    Though it looks like the patch has mostly linker
                    changes - where/how do you generate the fragmented
                    DWARF to begin with? Via the Python script? Run over
                    assembly? I'd be surprised if it was achievable that
                    way - curious to know more.<br>
                    <br>
                    Got a rough sense/are you able to run
                    apples-to-apples comparisons with Alexey's
                    linker-based patches to compare linker time/memory
                    overhead versus resulting output size gains?<br>
                    <br>
                    (& yeah, I'm a bit curious about how the linkers
                    do eh_frame rewriting, if the format is especially
                    amenable to a lightweight parsing/rewriting and how
                    we could make the DWARF more amenable to that too)</div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Mon, Oct 12,
                      2020 at 6:41 AM James Henderson <<a
                        href="mailto:jh7370.2008@my.bristol.ac.uk"
                        target="_blank" moz-do-not-send="true">jh7370.2008@my.bristol.ac.uk</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">
                      <div dir="ltr">
                        <div>Hi all,</div>
                        <div><br>
                        </div>
                        <div>At the recent LLVM developers' meeting, I
                          presented a lightning talk on an approach to
                          reduce the amount of dead debug data left in
                          an executable following operations such as
                          --gc-sections and duplicate COMDAT removal. In
                          that presentation, I presented some figures
                          based on linking a game that had been built by
                          our downstream clang port and fragmented using
                          the described approach. Since recording the
                          presentation, I ran the same experiment on a
                          clang package (this time built with a GCC
                          version). The comparable figures are below:</div>
                        <div><br>
                        </div>
                        <div>Link-time speed (s):</div>
                        <div><span style="font-family:monospace">+--------------------+-------+---------------+------+------+------+------+------+</span><br>
                        </div>
                        <div><font face="monospace">| Package variant   
                            | No GC | GC 1 (normal) | GC 2 | GC 3 | GC 4
                            | GC 5 | GC 6 |</font></div>
                        <div><font face="monospace">+--------------------+-------+---------------+------+------+------+------+------+<br>
                          </font></div>
                        <div><font face="monospace">| Game (plain)      
                            |  4.5  |  4.9          |  4.2 |  3.6 |  3.4
                            |  3.3 |  3.2 |<br>
                          </font></div>
                        <div><font face="monospace">| Game (fragmented) 
                            | 11.1  | 11.8          |  9.7 |  8.6 |  7.9
                            |  7.7 |  7.5 |<br>
                          </font></div>
                        <div><font face="monospace">| Clang (plain)     
                            | 13.9  | 17.9          | 17.0 | 16.7 | 16.3
                            | 16.2 | 16.1 |<br>
                          </font></div>
                        <div><font face="monospace">| Clang (fragmented)
                            | 18.6  | 22.8          | 21.6 | 21.1 | 20.8
                            | 20.5 | 20.2 |</font></div>
                        <div><font face="monospace">+--------------------+-------+---------------+------+------+------+------+------+</font></div>
                        <div><font face="monospace"><br>
                          </font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif">Output size - Game
                              package (MB):</font></font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif"><span
                                style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+</span><br>
                            </font></font></div>
                        <div><span style="font-family:monospace">|
                            Category            | No GC | GC 1 | GC 2 |
                            GC 3 | GC 4 | GC 5 | GC 6 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+<br>
                          </span></div>
                        <div><span style="font-family:monospace">| Plain
                            (total)       | 1149  | 1121 | 1017 |  965
                            |  938 |  930 |  928 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">| Plain
                            (DWARF*)      |  845  |  845 |  845 |  845
                            |  845 |  845 |  845 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">| Plain
                            (other)       |  304  |  276 |  172 |  120
                            |   93 |   85 |   82 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">|
                            Fragmented (total)  | 1044  |  940 |  556 | 
                            373 |  287 |  263 |  255 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">|
                            Fragmented (DWARF*) |  740  |  664 |  384 | 
                            253 |  194 |  178 |  173 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">|
                            Fragmented (other)  |  304  |  276 |  172 | 
                            120 |   93 |   85 |   82 |<br>
                          </span></div>
                        <div><font face="monospace">+---------------------+-------+------+------+------+------+------+------+
                          </font>
                          <div><font face="monospace"><br>
                            </font></div>
                          <div><font face="monospace"><font
                                face="arial,sans-serif">Output size -
                                Clang (MB):</font></font></div>
                          <div><font face="monospace"><font
                                face="arial,sans-serif"><span
                                  style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+</span><br>
                              </font></font></div>
                          <div><span style="font-family:monospace">|
                              Category            | No GC | GC 1 | GC 2
                              | GC 3 | GC 4 | GC 5 | GC 6 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">+---------------------+-------+------+------+------+------+------+------+<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Plain (total)       | 2596  | 2546 | 2406
                              | 2332 | 2293 | 2273 | 2251 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Plain (DWARF*)      | 1979  | 1979 | 1979
                              | 1979 | 1979 | 1979 | 1979 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Plain (other)       |  616  |  567 |  426
                              |  353 |  314 |  294 |  272 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Fragmented (total)  | 2397  | 2346 | 2164
                              | 2069 | 2017 | 1990 | 1963 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Fragmented (DWARF*) | 1780  | 1780 | 1738
                              | 1716 | 1703 | 1696 | 1691 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">|
                              Fragmented (other)  |  616  |  567 |  426
                              |  353 |  314 |  294 |  272 |<br>
                            </span></div>
                          <div><font face="monospace">+---------------------+-------+------+------+------+------+------+------+</font></div>
                        </div>
                        <div><font face="monospace"><br>
                          </font></div>
                        <div><font face="monospace">*DWARF size == total
                            size of .debug_info + .debug_line +
                            .debug_ranges + .debug_aranges + .debug_loc<br>
                          </font></div>
                        <div><font face="monospace"><br>
                          </font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif">Additionally, I
                              have posted <a
                                href="https://reviews.llvm.org/D89229"
                                target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D89229</a>
                              which provides the python script and
                              linker patches used to reproduce the above
                              results on my machine. The GC 1/2/3/4/5/6
                              correspond to the linker option added in
                              that patch --mark-live-pc with values
                              1/0.8/0.6/0.4/0.2/0 respectively.<br>
                            </font></font></div>
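                        <div><br>
                        </div>
                        <div>For readers who want the DWARF columns as
                          ratios, the sketch below pairs each GC label
                          with its --mark-live-pc value and computes how
                          much smaller the fragmented DWARF is than the
                          plain DWARF in each column. The numbers are
                          copied from the output-size tables above; the
                          dictionary names and the dwarf_reduction
                          helper are illustrative only.</div>
<pre>
```python
from typing import Dict

# GC label to --mark-live-pc value, as listed in the message.
MARK_LIVE_PC = {
    "GC 1": 1.0, "GC 2": 0.8, "GC 3": 0.6,
    "GC 4": 0.4, "GC 5": 0.2, "GC 6": 0.0,
}

# Table columns: the No GC link first, then GC 1 to GC 6.
COLUMNS = ["No GC"] + list(MARK_LIVE_PC)

# DWARF sizes in MB copied from the output-size tables.
DWARF_MB = {
    "Game": {
        "plain": [845, 845, 845, 845, 845, 845, 845],
        "fragmented": [740, 664, 384, 253, 194, 178, 173],
    },
    "Clang": {
        "plain": [1979, 1979, 1979, 1979, 1979, 1979, 1979],
        "fragmented": [1780, 1780, 1738, 1716, 1703, 1696, 1691],
    },
}


def dwarf_reduction(package: str) -> Dict[str, float]:
    """Percent by which fragmented DWARF is smaller than plain, per column."""
    sizes = DWARF_MB[package]
    return {
        column: 100.0 * (plain - fragmented) / plain
        for column, plain, fragmented
        in zip(COLUMNS, sizes["plain"], sizes["fragmented"])
    }


for package in DWARF_MB:
    for column, percent in dwarf_reduction(package).items():
        option = MARK_LIVE_PC.get(column)
        label = "no GC" if option is None else f"--mark-live-pc={option}"
        print(f"{package:6} {column:6} ({label}): {percent:4.1f}% smaller")
```
</pre>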
                        <div><font face="monospace"><font
                              face="arial,sans-serif"><br>
                            </font></font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif">During the
                              conference, the question was asked what
                              the memory usage and input size impact
                              was. I've summarised these below:</font></font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif"><br>
                            </font></font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif">Input file size
                              total (GB):</font></font></div>
                        <div><font face="monospace"><font
                              face="arial,sans-serif"> <span
                                style="font-family:monospace">+--------------------+------------+
                              </span></font></font></div>
                        <div><span style="font-family:monospace"> |
                            Package variant    | Total Size | <br>
                          </span></div>
                        <div><span style="font-family:monospace">
                            +--------------------+------------+<br>
                          </span></div>
                        <div><span style="font-family:monospace"> | Game
                            (plain)       |     2.9    |    <br>
                          </span></div>
                        <div><span style="font-family:monospace"> | Game
                            (fragmented)  |     4.2    |<br>
                          </span></div>
                        <div><span style="font-family:monospace"> |
                            Clang (plain)      |    10.9    |<br>
                          </span></div>
                        <div><span style="font-family:monospace"> |
                            Clang (fragmented) |    12.3    |<br>
                          </span></div>
                        <div><span style="font-family:monospace">
                            +--------------------+------------+</span></div>
                        <div><span style="font-family:monospace"><br>
                          </span></div>
                        <div><span style="font-family:monospace"><span
                              style="font-family:arial,sans-serif">Peak
                              Working Set Memory usage (GB):</span><br>
                          </span></div>
                        <div><span style="font-family:monospace"> </span>
                          <div><font face="monospace"><font
                                face="arial,sans-serif"><span
                                  style="font-family:monospace">+--------------------+-------+------+
                                </span></font></font></div>
                          <div><span style="font-family:monospace"> |
                              Package variant    | No GC | GC 1 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">
                              +--------------------+-------+------+<br>
                            </span></div>
                          <div><span style="font-family:monospace"> |
                              Game (plain)       |  4.3  |  4.7 |<br>
                            </span></div>
                          <div><span style="font-family:monospace"> |
                              Game (fragmented)  |  8.9  |  8.6 |<br>
                            </span></div>
                          <div><span style="font-family:monospace"> |
                              Clang (plain)      | 15.7  | 15.6 |<br>
                            </span></div>
                          <div><span style="font-family:monospace"> |
                              Clang (fragmented) | 19.4  | 19.2 |<br>
                            </span></div>
                          <div><span style="font-family:monospace">
                              +--------------------+-------+------+</span></div>
                          <div><span style="font-family:monospace"><br>
                            </span></div>
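                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">For a quick sense
                                of the relative overhead, the figures in
                                the two tables above can be turned into
                                percentage increases. This is only a small
                                Python sketch over those numbers; the
                                helper names (pct_increase, summarise) are
                                made up for illustration and are not part
                                of any tool.</font></span></div>
<pre>
```python
# Figures copied from the two tables above (all in GB),
# stored as (plain, fragmented) pairs.
input_size = {"Game": (2.9, 4.2), "Clang": (10.9, 12.3)}
peak_no_gc = {"Game": (4.3, 8.9), "Clang": (15.7, 19.4)}
peak_gc_1 = {"Game": (4.7, 8.6), "Clang": (15.6, 19.2)}


def pct_increase(plain, fragmented):
    """Percentage by which the fragmented figure exceeds the plain one."""
    return (fragmented - plain) / plain * 100.0


def summarise(table):
    """Map each package to its plain-to-fragmented increase, to 1 d.p."""
    return {name: round(pct_increase(*pair), 1) for name, pair in table.items()}


for label, table in [("input size", input_size),
                     ("peak memory, No GC", peak_no_gc),
                     ("peak memory, GC 1", peak_gc_1)]:
    for name, pct in summarise(table).items():
        print(f"{name} {label}: +{pct}%")
```
</pre>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">On these numbers,
                                the input size grows by roughly 45% for
                                the game package and roughly 13% for
                                Clang, while peak memory for the game
                                package roughly doubles in the No GC
                                case.<br>
                                <br>
                              </font></span></div>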
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">I'm keen to hear
                                people's feedback, and I'm also
                                interested to see what results others
                                get by running this experiment on
                                other input packages. Also, if anybody
                                has any alternative ideas that meet the
                                goals listed below, I'd love to hear
                                them!<br>
                              </font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif"><br>
                              </font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">To reiterate
                                some key goals of fragmented DWARF,
                                similar to what I said in the
                                presentation:</font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">1) Devise a
                                scheme that gives significant size
                                savings without being too costly. Even
                                from just the two packages I've tried
                                this on, it's clear that there is a
                                fairly hefty link-time performance cost,
                                although the exact cost depends on the
                                nature of the input package. Equally,
                                for some input packages there can be
                                big gains.<br>
                              </font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">2) Devise a
                                scheme that doesn't require any linker
                                knowledge of DWARF. The current approach
                                doesn't quite achieve this properly due
                                to the slight misuse of SHF_LINK_ORDER,
                                but I expect that a pivot to using
                                non-COMDAT group sections should solve
                                this problem.</font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">3) Provide some
                                kind of halfway house between simply
                                writing tombstone values into dead DWARF
                                and fully parsing the DWARF to
                                reoptimise it and discard the dead bits.<br>
                              </font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif"><br>
                              </font></span></div>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">I'm hopeful that
                                changes could be made to the linker to
                                improve the link-time cost. A
                                significant amount of the link time
                                seems to be spent creating the input
                                sections. An alternative would be to
                                devise a scheme that avoids literally
                                splitting the data into separate
                                sections, each with its own section
                                header, in favour of some sort of list
                                of split-points that the linker uses to
                                split things up (a bit like it already
                                does for .eh_frame or mergeable
                                sections).</font><br>
                            </span></div>
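                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif"><br>
                                As a rough illustration of the
                                split-point idea, here is a toy Python
                                sketch. It is not how LLD handles
                                .eh_frame or mergeable sections; the
                                function names, the byte contents and
                                the liveness flags are all made up, and
                                it only shows one section being cut at a
                                list of offsets with the dead pieces
                                dropped.</font></span></div>
<pre>
```python
def split_at(data, split_points):
    """Cut data into consecutive pieces at the given sorted offsets."""
    bounds = [0] + list(split_points) + [len(data)]
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]


def keep_live(pieces, live_flags):
    """Concatenate only the pieces whose flag says they are live."""
    return b"".join(p for p, is_live in zip(pieces, live_flags) if is_live)


# One contiguous (hypothetical) debug section with two split-points,
# instead of three separate sections each with its own section header.
section = b"AAAABBCCCCCC"
pieces = split_at(section, [4, 6])

# Pretend the middle piece describes code that was garbage-collected.
output = keep_live(pieces, [True, False, True])
print(pieces, output)
```
</pre>
                          <div><span style="font-family:monospace"><font
                                face="arial,sans-serif">The point of the
                                sketch is that the input carries a single
                                section plus a small list of offsets,
                                so the cost of creating many input
                                sections would, in principle, be replaced
                                by cheaper slicing inside the linker.</font></span></div>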
                          <span style="font-family:monospace"> </span></div>
                      </div>
                    </blockquote>
                  </div>
                </blockquote>
              </div>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>