<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 27.08.2020 03:57, Jonas Devlieghere
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAJQy47cO1K_evTT9QdthCGHjr0tAtVgVCDz3BUcimwt+XRhMQw@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div dir="ltr">
          <div dir="ltr"><br>
          </div>
          <br>
          <div class="gmail_quote">
            <div dir="ltr" class="gmail_attr">On Wed, Aug 26, 2020 at
              7:01 AM Alexey &lt;<a href="mailto:avl.lapshin@gmail.com"
                moz-do-not-send="true">avl.lapshin@gmail.com</a>&gt;
              wrote:<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
              <div>
                <p><br>
                </p>
                <div>On 26.08.2020 10:58, James Henderson wrote:<br>
                </div>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div>In principle, this sounds reasonable to me. I
                      don't know enough about dsymutil's interface to
                      know whether it makes sense to try to make it
                      multi-format compatible or not. If it doesn't I'm
                      perfectly happy for a new tool to be added using
                      the DWARFLinker library.</div>
                    <div><br>
                    </div>
                    <div>Some more general thoughts:</div>
                    <div>1) Assuming the proposal is accepted, this
                      should be introduced piecemeal into LLVM from the
                      beginning as it is developed, rather than having a
                      separate step 4 in the roadmap.</div>
                    <div>2) The default tombstone values used for dead
                      debug data should be those produced by LLD, in my
                      opinion. In an ideal world, we'd factor them into
                      some shared constant. Note that at the time of
                      writing, I believe LLD is currently using
                      BFD-style tombstones, not the new -1/-2.</div>
                  </div>
                </blockquote>
                <p>agreed.</p>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div>3) Does the DWARFLinker library already support
                      multi-threading? If not, it might be a lot of work
                      making things thread-safe.</div>
                  </div>
                </blockquote>
                <p>It does, but in a limited way. It can run the analyzing
                  and cloning stages in parallel, i.e. the maximal speedup
                  is two times.</p>
                <p>To have a greater performance impact, it could
                  probably be parallelized on a per-compilation-unit basis.</p>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>I want to elaborate on this a bit as it's been coming
                up several times now.</div>
              <div>With the current design you cannot process CUs in
                parallel. There are two</div>
              <div>reasons for that:</div>
              <div><br>
              </div>
              <div>1. When uniquing types, the first time a new type is
                encountered, it is marked</div>
              <div>   as canonical. Every subsequent encounter of that
                type is replaced by a</div>
              <div>   reference to the canonical type which is going to
                be coming from another CU.</div>
              <div>   So to process CUs in parallel, you have to
                guarantee the reproducibility of</div>
              <div>   what that canonical DIE is going to be, or
                postpone this to a sequential</div>
              <div>   step, which would potentially defeat the purpose
                of the parallel analysis.</div>
              <div><br>
              </div>
              <div>2. During emission, we emit the offset of the
                canonical type in the output</div>
              <div>   directly which allows us to stream out the DWARF.
                That means we need to know</div>
              <div>   the offset when processing the places where the
                uniquing is removing the</div>
              <div>   full type, which implies there’s a sequencing
                between cloning the CU with</div>
              <div>   the canonical type and processing further uses.
                Without this property, we'd</div>
              <div>   also need to keep all the output DIEs in memory
                until the offset can be</div>
              <div>   computed, or we need to do another iteration to
                patch up the offsets. </div>
              <div><br>
              </div>
              <div>I'm not saying this to discourage you, quite the
                opposite actually. I'd love to</div>
              <div>be able to speed-up dsymutil. I'm just sharing this
                based on my experience when</div>
              <div>adding the current concurrency which starts analyzing
                the next CU when we</div>
              <div>finished processing the current one and are emitting
                it.</div>
              <div><br>
              </div>
              <div>I'm sure we could design the uniquing algorithm in a
                way that would be able to</div>
              <div>process CUs in parallel and gather the types, then
                synchronize to make a</div>
              <div>uniquing decision, then clone all CUs in parallel and
                finally relocate all the</div>
              <div>offsets to the canonical DIE. In the current
                algorithm it’s just not that</div>
              <div>simple, because you don’t know whether a type is
                going to be kept in a</div>
              <div>particular CU before having processed it completely.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Right, that is understood. I have not prepared a detailed plan for this
    yet. <br>
    Generally, I think that for point 1 we need a multi-threaded
    version of DeclContext.<br>
    Then, in the first stage, all CUs would be analyzed (in parallel) and
    the canonical DIEs determined. <br>
    In the second stage, all CUs would be emitted (in parallel), each into
    its own container. Finally, the per-CU <br>
    containers would be sequentially glued into the resulting output, and
    offsets referring <br>
    to de-duplicated types would be changed to the offset of the canonical
    DIE (in a manner similar <br>
    to what is currently done in
    CompileUnit::fixupForwardReferences()). <br>
    <br>
    It sounds pretty similar to your above description.<br>
    <br>
    It would be good if such "per CU" processing not only sped up
    execution <br>
    but also reduced memory requirements, so that it would not be necessary
    to load all DIEs at once.<br>
    It is not yet clear how to do that, though, because of DW_FORM_ref_addr
    attributes: until we have analyzed<br>
    all CUs (to understand whether we need to keep the referenced DIEs), we
    cannot start the cloning stage.<br>
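    <br>
    As a rough illustration of the staged scheme described above, here is a
    minimal C++ sketch. It is a toy model under stated assumptions, not a
    design: InputDie, InputCU, OutDie, Container, analyzeCU(), emitCU() and
    linkCUs() are hypothetical names and are not part of DWARFLinker. A
    "DIE" is reduced to a type name plus a payload, and offsets are counted
    in slots rather than bytes.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT DWARFLinker types.
struct InputDie { std::string TypeName; uint64_t Payload; };
struct InputCU { std::vector<InputDie> Dies; };

// One output slot per input DIE: either the full type (Value = payload) or
// a reference to the canonical copy (Value = slot offset in the output).
struct OutDie { bool IsRef; uint64_t Value; };

struct Fixup { std::size_t Slot; std::string TypeName; };
struct Container {
  std::vector<OutDie> Dies;
  std::map<std::string, std::size_t> Defined; // canonical type -> local slot
  std::vector<Fixup> Fixups;                  // slots awaiting a final offset
};

using CanonicalMap = std::map<std::string, std::size_t>; // type -> owning CU

// Stage 1 (parallel): list the types seen in one CU.
static std::vector<std::string> analyzeCU(const InputCU &CU) {
  std::vector<std::string> Names;
  for (const InputDie &D : CU.Dies)
    Names.push_back(D.TypeName);
  return Names;
}

// Stage 2 (parallel): clone one CU into its own container.
static Container emitCU(std::size_t Idx, const InputCU &CU,
                        const CanonicalMap &Canon) {
  Container C;
  for (const InputDie &D : CU.Dies) {
    if (Canon.at(D.TypeName) == Idx && !C.Defined.count(D.TypeName)) {
      C.Defined[D.TypeName] = C.Dies.size();
      C.Dies.push_back({false, D.Payload});
    } else {
      C.Fixups.push_back({C.Dies.size(), D.TypeName});
      C.Dies.push_back({true, 0}); // placeholder, patched in stage 3
    }
  }
  return C;
}

std::vector<OutDie> linkCUs(const std::vector<InputCU> &CUs) {
  // Stage 1: analyze all CUs in parallel.
  std::vector<std::future<std::vector<std::string>>> Analyses;
  for (const InputCU &CU : CUs)
    Analyses.push_back(
        std::async(std::launch::async, [&CU] { return analyzeCU(CU); }));

  // Synchronization point: walking CUs in index order makes the choice of
  // canonical DIE reproducible regardless of thread scheduling.
  CanonicalMap Canon;
  for (std::size_t I = 0; I < CUs.size(); ++I)
    for (const std::string &Name : Analyses[I].get())
      Canon.emplace(Name, I); // the first CU that has the type wins

  // Stage 2: emit all CUs in parallel, each into its own container.
  std::vector<std::future<Container>> Emits;
  for (std::size_t I = 0; I < CUs.size(); ++I)
    Emits.push_back(std::async(std::launch::async, [&CUs, &Canon, I] {
      return emitCU(I, CUs[I], Canon);
    }));
  std::vector<Container> Parts;
  for (std::future<Container> &F : Emits)
    Parts.push_back(F.get());

  // Stage 3 (sequential): compute base offsets, patch references, glue.
  std::vector<std::size_t> Base(Parts.size());
  std::size_t Total = 0;
  for (std::size_t I = 0; I < Parts.size(); ++I) {
    Base[I] = Total;
    Total += Parts[I].Dies.size();
  }
  std::vector<OutDie> Out;
  Out.reserve(Total);
  for (std::size_t I = 0; I < Parts.size(); ++I) {
    for (const Fixup &F : Parts[I].Fixups) {
      std::size_t Owner = Canon.at(F.TypeName);
      auto It = Parts[Owner].Defined.find(F.TypeName);
      assert(It != Parts[Owner].Defined.end() && "canonical DIE must exist");
      Parts[I].Dies[F.Slot].Value = Base[Owner] + It->second;
    }
    Out.insert(Out.end(), Parts[I].Dies.begin(), Parts[I].Dies.end());
  }
  return Out;
}
```

    Picking the canonical DIE by lowest CU index is just one way to keep the
    result reproducible. Note that this sketch holds every container in
    memory until the final gluing step, so it does not address the memory
    concern.<br>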
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAJQy47cO1K_evTT9QdthCGHjr0tAtVgVCDz3BUcimwt+XRhMQw@mail.gmail.com">
      <div dir="ltr">
        <div dir="ltr">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
              <div>
                <p>Another thing is that dsymutil currently loads all
                  DIEs from a source object file into memory and
                  releases them after the object file is processed. For
                  a non-linked binary this works OK (big binaries are
                  usually compiled from several object files). For a
                  linked binary it means that all DIEs are loaded into
                  memory at once, which requires a lot of memory. A
                  possible solution is to split the source data per
                  compilation unit rather than per file.</p>
                <p>Yes, making dsymutil/dwarfutil work on a
                  per-compilation-unit basis with multi-threading
                  support is quite a big piece of work. It looks like it
                  would be good for both dsymutil and dwarfutil.<br>
                </p>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div>4) Given that DWARF v6 doesn't exist yet, I
                      wouldn't include that as an option name just
                      yet...!</div>
                  </div>
                </blockquote>
                <p>Would "maxpc" be OK? --tombstone=maxpc ?<br>
                </p>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div><br>
                    </div>
                    <div>Thanks for looking at this! Please keep me
                      involved in any related reviews etc.</div>
                  </div>
                </blockquote>
                <p>sure. Thank you for the comments.</p>
                <p>Alexey.<br>
                </p>
                <p><br>
                </p>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div><br>
                    </div>
                    <div>James<br>
                    </div>
                  </div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Tue, 25 Aug
                      2020 at 15:29, Alexey via llvm-dev &lt;<a
                        href="mailto:llvm-dev@lists.llvm.org"
                        target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi,<br>
                      <br>
                         We propose llvm-dwarfutil - a dsymutil-like
                      tool for ELF.<br>
                         Any thoughts on this?<br>
                         Thanks in advance, Alexey.<br>
                      <br>
======================================================================<br>
                      <br>
                      llvm-dwarfutil (Apndx A) is a tool for processing
                      debug info (DWARF)<br>
                      located in built binary files to improve debug
                      info quality,<br>
                      reduce debug info size, and accelerate debug info
                      processing.<br>
                      Supported object file formats: ELF, MachO (Apndx
                      B), COFF (Apndx C), WASM (Apndx C).<br>
                      <br>
======================================================================<br>
                      <br>
                      Specifically, the tool would:<br>
                      <br>
                         - Remove obsolete debug info which refers to
                      code deleted by the linker<br>
                           during garbage collection (gc-sections).<br>
                      <br>
                         - Deduplicate debug type definitions to reduce
                      the size of the resulting binary.<br>
                      <br>
                         - Build accelerator/index tables.<br>
                           = .debug_aranges, .debug_names, .gdb_index,
                      .debug_pubnames, <br>
                      .debug_pubtypes.<br>
                      <br>
                         - Strip unneeded tables.<br>
                           = .debug_aranges, .debug_names, .gdb_index,
                      .debug_pubnames, <br>
                      .debug_pubtypes.<br>
                      <br>
                         - Compress or decompress debug info as
                      requested.<br>
                      <br>
                      Possible feature:<br>
                      <br>
                         - Join split dwarf .dwo files in a single file
                      containing all debug info<br>
                           (convert split DWARF into monolithic DWARF).<br>
                      <br>
======================================================================<br>
                      <br>
                      User interface:<br>
                      <br>
                         OVERVIEW: A tool for optimizing debug info
                      located in the built binary.<br>
                      <br>
                         USAGE: llvm-dwarfutil [options] input output<br>
                      <br>
                         OPTIONS: (Apndx E)<br>
                      <br>
======================================================================<br>
                      <br>
                      Implementation notes:<br>
                      <br>
                      1. Removing obsolete debug info would be done
                      using DWARFLinker llvm <br>
                      library.<br>
                      <br>
                      2. Data types deduplication would be done using
                      DWARFLinker llvm library.<br>
                      <br>
                      3. Accelerator/index tables would be generated
                      using DWARFLinker llvm <br>
                      library.<br>
                      <br>
                      4. The interface of the DWARFLinker library would be
                      changed so that it<br>
                          would be possible to switch various stages on
                      and off:<br>
                      <br>
                         class DWARFLinker {<br>
                         public:<br>
                           void setDoRemoveObsoleteInfo(bool
                      DoRemoveObsoleteInfo = false);<br>
                      <br>
                           void setDoAppleNames(bool DoAppleNames = false);<br>
                           void setDoAppleNamespaces(bool DoAppleNamespaces
                      = false);<br>
                           void setDoAppleTypes(bool DoAppleTypes = false);<br>
                           void setDoObjC(bool DoObjC = false);<br>
                           void setDoDebugPubNames(bool DoDebugPubNames =
                      false);<br>
                           void setDoDebugPubTypes(bool DoDebugPubTypes =
                      false);<br>
                      <br>
                           void setDoDebugNames(bool DoDebugNames = false);<br>
                           void setDoGDBIndex(bool DoGDBIndex = false);<br>
                         };<br>
                      <br>
                      5. Copying source file contents, stripping tables,
                      and compressing/decompressing tables<br>
                          would be done by the ObjCopy llvm
                      library (extracted from llvm-objcopy):<br>
                      <br>
                         Error executeObjcopyOnBinary(const CopyConfig
                      &Config,<br>
                                                   
                      object::COFFObjectFile &In, Buffer &Out);<br>
                         Error executeObjcopyOnBinary(const CopyConfig
                      &Config,<br>
                                                   
                      object::ELFObjectFileBase &In, Buffer
                      &Out);<br>
                         Error executeObjcopyOnBinary(const CopyConfig
                      &Config,<br>
                                                   
                      object::MachOObjectFile &In, Buffer &Out);<br>
                         Error executeObjcopyOnBinary(const CopyConfig
                      &Config,<br>
                                                   
                      object::WasmObjectFile &In, Buffer &Out);<br>
                      <br>
                      6. Address ranges and single addresses pointing to
                      removed code should be marked<br>
                          with a tombstone value in the input file:<br>
                      <br>
                          -2 for .debug_ranges and .debug_loc.<br>
                          -1 for other .debug* tables.<br>
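                      <br>
                          A minimal C++ sketch of this rule, for
                      illustration only. The DebugSection enum and
                      tombstoneFor() are hypothetical names, not an
                      existing LLVM API. The -2 value is used for
                      .debug_ranges and .debug_loc because in those
                      tables the maximal address already marks a base
                      address selection entry.<br>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not an LLVM API.
enum class DebugSection { Info, Line, Ranges, Loc, Other };

// Returns the tombstone for a dead address in the given section.
// AddressSize is the target address size in bytes (4 or 8).
uint64_t tombstoneFor(DebugSection S, unsigned AddressSize) {
  assert((AddressSize == 4 || AddressSize == 8) && "unsupported address size");
  const uint64_t MaxAddr = AddressSize == 8 ? UINT64_MAX : UINT32_MAX;
  // In .debug_ranges and .debug_loc the all-ones value is already taken by
  // base address selection entries, so the tombstone there is -2.
  const bool UseMinusTwo =
      S == DebugSection::Ranges || S == DebugSection::Loc;
  return UseMinusTwo ? MaxAddr - 1 : MaxAddr;
}
```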
                      <br>
                      7. Prototype implementation - <a
                        href="https://reviews.llvm.org/D86539"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true">https://reviews.llvm.org/D86539</a>.<br>
                      <br>
======================================================================<br>
                      <br>
                      Roadmap:<br>
                      <br>
                      1. Refactor llvm-objcopy to extract its
                      implementation into a separate library,<br>
                          ObjCopy (in the LLVM tree).<br>
                      <br>
                      2. Create a command-line utility using the existing
                      DWARFLinker and ObjCopy<br>
                          implementations. The first version is supposed
                      to work with ELF input object files only.<br>
                          It would take an input ELF file with unoptimized
                      debug info and create an output<br>
                          ELF file with optimized debug info. That
                      version would be developed outside of the LLVM
                      tree.<br>
                      <br>
                      3. Make the tool able to work in multi-threaded
                      mode.<br>
                      <br>
                      4. Consider including it in the LLVM tree.<br>
                      <br>
                      5. Support DWARF5 tables.<br>
                      <br>
======================================================================<br>
                      <br>
                      Appendix A. Should this tool be implemented as a
                      new tool or as an extension<br>
                                   to dsymutil/llvm-objcopy?<br>
                      <br>
                          There already exists a tool which removes
                      obsolete debug info on Darwin: dsymutil.<br>
                          Why create another tool instead of extending
                      the existing dsymutil/llvm-objcopy?<br>
                      <br>
                          The main functionality of dsymutil is located
                      in a separate library, DWARFLinker.<br>
                          Thus, the dsymutil utility is a command-line
                      interface for DWARFLinker. dsymutil has<br>
                          a different kind of input/output data: it takes
                      several object files and an address map<br>
                          as input and creates a .dSYM bundle with
                      linked debug info as output. llvm-dwarfutil<br>
                          would take a built executable as input and
                      create an optimized executable as output.<br>
                          Additionally, there would be many command-line
                      options specific to only one of the utilities.<br>
                          This means that the two utilities (each
                      implementing its own command-line interface) would
                      differ significantly.<br>
                          It makes sense not to put another
                      command-line utility inside the existing dsymutil,<br>
                          but to make it a separate utility. That is the
                      reason why llvm-dwarfutil is suggested to be<br>
                          implemented not as a sub-part of dsymutil but as
                      a separate tool.<br>
                      <br>
                          Please share your preference: should
                      llvm-dwarfutil be a<br>
                          separate utility, or a variant of dsymutil
                      compiled for ELF?<br>
                      <br>
======================================================================<br>
                      <br>
                      Appendix B. The MachO object file format is
                      already supported by dsymutil.<br>
                          Whether llvm-dwarfutil would support MachO
                      depends on the decision to implement it as a
                      subproject<br>
                          of dsymutil or as a separate utility.<br>
                      <br>
======================================================================<br>
                      <br>
                      Appendix C. Support for the COFF and WASM object
                      file formats is presented as a<br>
                           possible future improvement. It would be
                      quite easy to add them, assuming<br>
                           that llvm-objcopy already supports these
                      formats. It would also require<br>
                           supporting the DWARF6-suggested tombstone
                      values (-1/-2).<br>
                      <br>
======================================================================<br>
                      <br>
                      Appendix D. Documentation.<br>
                      <br>
                         - proposal for DWARF6 which suggested -1/-2
                      values for marking bad <br>
                      addresses<br>
                           <a
                        href="http://www.dwarfstd.org/ShowIssue.php?issue=200609.1"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true">http://www.dwarfstd.org/ShowIssue.php?issue=200609.1</a><br>
                         - dsymutil tool <a
                        href="https://llvm.org/docs/CommandGuide/dsymutil.html"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true">https://llvm.org/docs/CommandGuide/dsymutil.html</a>.<br>
                         - proposal "Remove obsolete debug info in lld."<br>
                      <a
                        href="http://lists.llvm.org/pipermail/llvm-dev/2020-May/141468.html"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true">http://lists.llvm.org/pipermail/llvm-dev/2020-May/141468.html</a><br>
                      <br>
======================================================================<br>
                      <br>
                      Appendix E. Possible command line options:<br>
                      <br>
                      DwarfUtil Options:<br>
                      <br>
                         --build-aranges           - generate
                      .debug_aranges table.<br>
                         --build-debug-names       - generate
                      .debug_names table.<br>
                         --build-debug-pubnames    - generate
                      .debug_pubnames table.<br>
                         --build-debug-pubtypes    - generate
                      .debug_pubtypes table.<br>
                         --build-gdb-index         - generate .gdb_index
                      table.<br>
                         --compress                - Compress debug
                      tables.<br>
                         --decompress              - Decompress debug
                      tables.<br>
                         --deduplicate-types       - Do ODR
                      deduplication for debug types.<br>
                         --garbage-collect         - Do garbage
                      collecting for debug info.<br>
                         --num-threads=&lt;n&gt;         - Specify the
                      maximum number (n) of <br>
                      simultaneous threads<br>
                                                     to use when
                      optimizing input file.<br>
                                                     Defaults to the
                      number of cores on the <br>
                      current machine.<br>
                         --strip-all               - Strip all debug
                      tables.<br>
                         --strip=&lt;name1,name2&gt;     - Strip
                      specified debug info tables.<br>
                         --strip-unoptimized-debug - Strip all
                      unoptimized debug tables.<br>
                         --tombstone=&lt;value&gt;       - Tombstone
                      value used as a marker of <br>
                      invalid address.<br>
                           =bfd                    -   BFD default value<br>
                           =dwarf6                 -   Dwarf v6.<br>
                         --verbose                 - Enable verbose
                      logging and encoding details.<br>
                      <br>
                      Generic Options:<br>
                      <br>
                         --help                    - Display available
                      options (--help-hidden <br>
                      for more)<br>
                         --version                 - Display the version
                      of this program<br>
                      <br>
                      _______________________________________________<br>
                      LLVM Developers mailing list<br>
                      <a href="mailto:llvm-dev@lists.llvm.org"
                        target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
                      <a
                        href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
                    </blockquote>
                  </div>
                </blockquote>
              </div>
            </blockquote>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>