<div dir="ltr"><div>Hi Alexey,<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 5 Nov 2020 at 21:02, Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com">avl.lapshin@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>Hi James,</p>
    On 05.11.2020 17:59, James Henderson wrote:<br>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div dir="ltr">(Resending with history trimmed to avoid it
          getting stuck in moderator queue).<br>
        </div>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr"><br>
          </div>
          <div>
            <div>Hi Alexey,</div>
            <div><br>
            </div>
            <div>Just an update - I identified the cause of the
              "Generated debug info is broken" error message when I
              tried to build things locally: the `outStreamer` instance
              is initialised with the host Triple, instead of whatever
              the target's triple is. For example, I build and run LLD
              on Windows, which means that a Windows triple will be
              generated, and consequently a COFF-emitting streamer will
              be created, rather than the ELF-emitting one I'd expect
              were the triple information to somehow be derived from the
              linker flavor/input objects etc. Hard-coding in my target
              triple resolved the issue (although I still got the other
              warnings mentioned from my game link).</div>
          </div>
        </div>
      </div>
    </blockquote>
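    <p>As a rough illustration of the triple issue described above, the
      sketch below shows how picking the output format from the host triple
      rather than the target triple yields COFF on a Windows host. This is
      not LLVM code: the helper <code>object_format_for_triple</code> and
      both example triples are assumptions made for illustration, and real
      triple parsing covers far more cases.</p>

```python
from enum import Enum


class ObjectFormat(Enum):
    COFF = "COFF"
    ELF = "ELF"
    MACHO = "Mach-O"


def object_format_for_triple(triple: str) -> ObjectFormat:
    """Guess the object format from the OS field of an arch-vendor-os triple.

    Hypothetical, deliberately simplified helper: anything that is not
    recognised as Windows or an Apple OS is treated as ELF.
    """
    parts = triple.split("-")
    os_name = parts[2] if len(parts) > 2 else ""
    if os_name.startswith(("windows", "win32")):
        return ObjectFormat.COFF
    if os_name.startswith(("darwin", "macos", "ios")):
        return ObjectFormat.MACHO
    return ObjectFormat.ELF


# Assumed example triples: the machine running the linker vs. the link target.
HOST_TRIPLE = "x86_64-pc-windows-msvc"
TARGET_TRIPLE = "x86_64-unknown-linux-gnu"

# Choosing the streamer format from the host triple gives the wrong answer
# when cross-linking; the target triple gives the expected one.
print("from host triple:  ", object_format_for_triple(HOST_TRIPLE).value)
print("from target triple:", object_format_for_triple(TARGET_TRIPLE).value)
```

    <p>In the real linker the target triple would presumably have to come
      from the input objects or the linker flavor, as suggested above, rather
      than from a hard-coded string.</p>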
    <p>   Thank you for the details. I have not actually tested this on
      Windows, but I will do so and update the patch.</p>
    <p><br>
    </p>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_quote">
          <div>
            <div><br>
            </div>
            <div>I measured the performance figures using LLD patched as
              described, and using the same methodology as my earlier
              results, and got the following:</div>
            <br>
            <div>
              <div>Link-time speed (s):</div>
              <div><span style="font-family:monospace">+-----------------------------+---------------+</span><br>
              </div>
              <div><font face="monospace">| Package variant            
                  | GC 1 (normal) |</font></div>
              <div><font face="monospace">+-----------------------------+---------------+<br>
                </font></div>
              <div><font face="monospace">| Game (DWARF linker)        
                  |  53.6         |
                </font>
                <div><font face="monospace">| Game (DWARF linker, no
                    ODR) |  63.6         |</font></div>
              </div>
              <div><font face="monospace">| Clang (DWARF linker)       
                  | 200.6         |</font></div>
              <div><font face="monospace">+-----------------------------+---------------+</font></div>
              <div><font face="monospace"><br>
                </font></div>
              <div><font face="monospace">
                </font>
                <div><font face="monospace"><font face="arial,sans-serif">Output size - Game package
                      (MB):</font></font></div>
                <div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+</span><br>
                    </font></font></div>
                <div><span style="font-family:monospace">| Category    
                                   | GC 1 |<br>
                  </span></div>
                <div><span style="font-family:monospace">+-----------------------------+------+<br>
                  </span></div>
                <div><span style="font-family:monospace">| DWARFLinker
                    (total)         |  696 |<br>
                  </span></div>
                <div><span style="font-family:monospace">| DWARFLinker
                    (DWARF*)        |  429 |<br>
                  </span></div>
                <div><span style="font-family:monospace">| DWARFLinker
                    (other)         |  267 |
                  </span>
                  <div><span style="font-family:monospace">| DWARFLinker
                      no ODR (total)  |  753 |<br>
                    </span></div>
                  <div><span style="font-family:monospace">| DWARFLinker
                      no ODR (DWARF*) |  485 |<br>
                    </span></div>
                  <div><span style="font-family:monospace">| DWARFLinker
                      no ODR (other)  |  268 |</span></div>
                </div>
                <div><font face="monospace">+-----------------------------+------+</font>
                  <div><font face="monospace"><br>
                    </font></div>
                  <div><font face="monospace"><font face="arial,sans-serif">Output size - Clang
                        (MB):
                      </font></font>
                    <div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+</span><br>
                        </font></font></div>
                    <div><span style="font-family:monospace">| Category
                                           | GC 1 |<br>
                      </span></div>
                    <div><span style="font-family:monospace">+-----------------------------+------+</span></div>
                    <div><span style="font-family:monospace">|
                        DWARFLinker (total)         | 1294 |<br>
                      </span></div>
                    <div><span style="font-family:monospace">|
                        DWARFLinker (DWARF*)        |  743 |<br>
                      </span></div>
                    <div><span style="font-family:monospace">|
                        DWARFLinker (other)         |  551 |
                        <span style="font-family:monospace">
                        </span></span>
                      <div><span style="font-family:monospace">|
                          DWARFLinker no ODR (total)  | 1294 |<br>
                        </span></div>
                      <div><span style="font-family:monospace">|
                          DWARFLinker no ODR (DWARF*) |  743 |<br>
                        </span></div>
                      <div><span style="font-family:monospace">|
                          DWARFLinker no ODR (other)  |  551 |
                        </span>
                        <div>
                          <div><span style="font-family:monospace"></span></div>
                        </div>
                        <div><font face="monospace">+-----------------------------+------+</font></div>
                      </div>
                    </div>
                    <div><font face="monospace"><br>
                      </font></div>
                    <div><font face="monospace"><span style="font-family:arial,sans-serif">*DWARF =
                          just .debug_info, .debug_line, .debug_loc,
                          .debug_aranges, .debug_ranges.</span></font></div>
                    <div><font face="monospace"><span style="font-family:arial,sans-serif"></span></font><br>
                      <font face="monospace"><span style="font-family:arial,sans-serif"></span>
                      </font>
                      <div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Peak
                            Working Set Memory usage (GB):</span><br>
                        </span></div>
                      <div><span style="font-family:monospace">
                        </span>
                        <div><font face="monospace"><font face="arial,sans-serif"><span style="font-family:monospace">+-----------------------------+------+
                              </span></font></font></div>
                        <div><span style="font-family:monospace">
                            | Package variant             | GC 1 |<br>
                          </span></div>
                        <div><span style="font-family:monospace">
                            +-----------------------------+------+<br>
                          </span></div>
                        <div><span style="font-family:monospace">
                            | Game (DWARFLinker)          |  5.7 |
                          </span>
                          <div><span style="font-family:monospace">
                              | Game (DWARFLinker, no ODR)  |  5.8 |</span></div>
                        </div>
                        <div><span style="font-family:monospace">
                            | Clang (DWARFLinker)         | 22.4 |
                          </span>
                          <div>
                            <div><span style="font-family:monospace"></span></div>
                          </div>
                          <div><span style="font-family:monospace">
                              | Clang (DWARFLinker, no ODR) | 22.5 |</span></div>
                        </div>
                        <div><span style="font-family:monospace">
                            +-----------------------------+------+</span></div>
                        <div><span style="font-family:monospace"><br>
                          </span></div>
                        <div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">My
                              opinion is that, in its current state, the
                              time cost of the DWARF linker approach is
                              not really practical for larger packages,
                              except on build servers: clang takes 8.8x
                              as long as the fragmented approach and
                              11.2x as long as the plain approach
                              (without the no ODR option). The size
                              saving is certainly good: with the DWARF
                              linker approach, the total output size for
                              my version of clang is 51% of that of the
                              plain approach and 55% of that of the
                              fragmented approach (though it is likely
                              that further size savings might be
                              possible for the latter). The game
                              produced reasonable size savings too (62%
                              and 74%), but I'd be surprised if these
                              gains would be enough for people to want
                              to use the approach in day-to-day
                              situations, which presumably is the main
                              use-case for smaller DWARF, due to
                              improved debugger load times.<br>
                            </span></span></div>
                        <div><span style="font-family:arial,sans-serif"><br>
                          </span></div>
                        <div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">It is
                              interesting to note that the GCC 7.5 build
                              of clang I used for these figures produced
                              no difference in size results between the
                              two variants (with and without the no ODR
                              option), unlike other packages.
                              Consequently, a significant amount of time
                              is saved for no penalty.</span><br>
                          </span></div>
                        <div><span style="font-family:arial,sans-serif"><br>
                          </span></div>
                        <div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">I'll
                              be interested to see what the time results
                              of the DWARF linker are once further
                              improvements to it have been made.<br>
                            </span></span></div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
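    <p>For reference, the ratios quoted above can be turned back into
      approximate baseline figures. The sketch below only rearranges numbers
      already stated in the quoted text (200.6 s and 1294 MB for clang with
      the DWARF linker, and the 8.8x, 11.2x, 51% and 55% ratios). The results
      are estimates, since the ratios are rounded, and the helper name
      <code>implied_baseline</code> is made up for this example.</p>

```python
# Figures taken from the quoted tables and text for the clang package.
CLANG_DWARF_LINKER_SECONDS = 200.6
CLANG_DWARF_LINKER_TOTAL_MB = 1294

# Ratios of the DWARF linker result to each baseline, as quoted.
TIME_RATIO = {"fragmented": 8.8, "plain": 11.2}
SIZE_RATIO = {"fragmented": 0.55, "plain": 0.51}


def implied_baseline(dwarf_linker_value: float, ratio: float) -> float:
    """Return the baseline value implied by value = ratio * baseline."""
    return dwarf_linker_value / ratio


for name in ("fragmented", "plain"):
    seconds = implied_baseline(CLANG_DWARF_LINKER_SECONDS, TIME_RATIO[name])
    megabytes = implied_baseline(CLANG_DWARF_LINKER_TOTAL_MB, SIZE_RATIO[name])
    # Rounded ratios mean these are rough estimates, not measured values.
    print(f"{name}: ~{seconds:.1f} s link time, ~{megabytes:.0f} MB output")
```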
    <p>Yep, the current time costs of the DWARFLinker are too high. One of
      the reasons is that lld handles sections in parallel, while
      DWARFLinker handles data sequentially. The DWARFLinker numbers
      could probably be improved if it were possible to teach it to
      handle data in parallel. Thank you for the comparison!<br></p></div></blockquote><div>No problem! It was useful for me to gather the numbers for internal investigations too. Parallelisation would hopefully help, but at this point it's hard to say by how much. There are likely going to be additional time costs for fragmented DWARF too, once I fix the remaining deficiencies, as they'll require more relocations.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
    </p>
    <p>Speaking of the "Fragmented DWARF" solution, how do you estimate the
      memory requirements to support fragmented object files ?</p></div></blockquote><div>I'm not sure if you're referring to the memory usage at link time or the disk space required for the inputs, but I posted both those figures in my original post in this thread. If it's something else, please let me know. Based on those figures, it's clear the cost depends on the input code base, but it was between 25 and 75% or so bigger object file size and 50 and 100% more memory usage. Again, these are likely both to go up when I get around to fixing the remaining issues.<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>In
      comments for your Lightning Talk you have mentioned that it would
      be necessary to "<span id="gmail-m_78680551266622888336345162">update DebugInfo library
        to treat the fragmented sections as one continuous section</span>".
      Do you think it would be cheap to implement?<br></p></div></blockquote><div>I think so. I'd hope it would be possible to replace the data buffer underlying the DWARF section parsing to be able to "jump" to the next fragment (section) when it gets to the end of the previous one. I haven't experimented with this, but I wouldn't expect it to be costly in terms of code quality or performance, at least in comparison to parsing the DWARF itself.<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
    </p>
    <p>Thank you, Alexey.<br>
    </p>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_quote">
          <div>
            <div>
              <div>
                <div>
                  <div>
                    <div>
                      <div>
                        <div><span style="font-family:arial,sans-serif"><br>
                          </span></div>
                        <div><span style="font-family:arial,sans-serif">Thanks,<br>
                          </span></div>
                        <div><span style="font-family:arial,sans-serif"><br>
                          </span></div>
                        <div><span style="font-family:monospace"><span style="font-family:arial,sans-serif">James</span><br>
                          </span></div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
  </div>

</blockquote></div></div>
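<div>The idea mentioned above of letting the DWARF parser "jump" to the next fragment when it reaches the end of the previous one could look roughly like the sketch below. It is untested against the real DebugInfo library and only models the concept: the class <code>FragmentedBuffer</code> and its methods are hypothetical names, and the fragments are plain byte strings rather than real sections.</div>

```python
from bisect import bisect_right
from typing import List


class FragmentedBuffer:
    """Read-only view that presents several fragments as one contiguous buffer."""

    def __init__(self, fragments: List[bytes]) -> None:
        # Empty fragments carry no data, so drop them up front.
        self._fragments = [frag for frag in fragments if frag]
        self._starts: List[int] = []  # logical start offset of each fragment
        offset = 0
        for frag in self._fragments:
            self._starts.append(offset)
            offset += len(frag)
        self._size = offset

    def __len__(self) -> int:
        return self._size

    def read(self, offset: int, size: int) -> bytes:
        """Return size bytes starting at a logical offset, crossing fragments."""
        if offset < 0 or size < 0 or offset + size > self._size:
            raise ValueError("read out of range")
        if size == 0:
            return b""
        # Binary search for the fragment that contains the first byte.
        index = bisect_right(self._starts, offset) - 1
        out = bytearray()
        while len(out) < size:
            frag = self._fragments[index]
            local = offset + len(out) - self._starts[index]
            out += frag[local:local + size - len(out)]
            index += 1  # "jump" to the next fragment
        return bytes(out)

    def read_u32_le(self, offset: int) -> int:
        """Read a little-endian 32-bit value, even if it straddles fragments."""
        return int.from_bytes(self.read(offset, 4), "little")


buf = FragmentedBuffer([b"\x01\x02\x03", b"\x04\x05", b"\x06"])
print(len(buf), hex(buf.read_u32_le(1)))
```

<div>With a wrapper like this the parsing code keeps using plain offsets, and the lookup cost per read is a binary search over the fragment start offsets, which fits the expectation above that the overhead would be small compared with parsing the DWARF itself.</div>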