<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 23.10.2020 19:43, David Blaikie
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAENS6EvdtmTPM0UTN4M=kbVmv4VF5jYS00kH-7bJ2mQ1ucK69w@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <blockquote type="cite">
              <div dir="ltr">
                <div class="gmail_quote">
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px 0.8ex;border-left:1px solid
                    rgb(204,204,204);padding-left:1ex">
                    <div><br>
                      <br>
                    </div>
                  </blockquote>
                  <div><br>
                  </div>
                  <div>Ah, yeah - that seems like a missed opportunity -
                    duplicating the whole type DIE. LTO does this by
                    making monolithic types - merging all the members
                    from different definitions of the same type into
                    one, but that's maybe too expensive for dsymutil
                    (might still be interesting to know how much more
                    expensive, etc). But I think the other way to go
                    would be to produce a declaration of the type, with
                    the relevant members - and let the DWARF consumer
                    identify this declaration as matching up with the
                    earlier definition. That's the sort of DWARF you get
                    from the non-MachO default -fno-standalone-debug
                    anyway, so it's already pretty well tested/supported
                    (support in lldb's a bit younger/more
                    work-in-progress, admittedly). I wonder how much
                    dsym size there is that could be reduced by such an
                    implementation.</div>
                </div>
              </div>
            </blockquote>
            <p>I see. Yes, that could be done, and I think it would
              result in a noticeable size reduction (I do not know the
              exact numbers at the moment).</p>
            <p>I am working on a multi-threaded DWARFLinker now, and its
              first version will do exactly the same type processing as
              the current dsymutil.</p>
          </blockquote>
          <div>Yeah, best to keep the behavior the same through that</div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p>The above scheme could be implemented as a next step,
                and it would give a better size reduction than the
                current state.</p>
              <p>But I think an even better scheme is possible, one that
                would give a bigger size reduction and faster
                execution. This scheme is similar to what
                you've described above: "LTO does - making monolithic
                types - merging all the members from different
                definitions of the same type into one".</p>
            </div>
          </blockquote>
          <div>I believe the reason that's probably not been done is
            that it can't be streamed - it'd lead to buffering more of
            the output </div>
        </div>
      </div>
    </blockquote>
    <p>Yes. The fact that DWARF has to be streamed into AsmPrinter
      complicates parallel DWARF generation. In my prototype, I generate
      several output files (one per source compilation unit) and
      then sequentially glue them into the final output file.<br>
    </p>
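    <p>A minimal sketch of that "generate per-unit pieces in parallel,
      then glue sequentially" idea, not the prototype itself. The names
      clone_unit and link_parallel are hypothetical, and the "cloning"
      is a stand-in that just produces some bytes per unit:</p>
    <pre>
```python
from concurrent.futures import ThreadPoolExecutor


def clone_unit(unit_name):
    # Hypothetical stand-in for cloning one compilation unit's DWARF.
    return ("cloned:" + unit_name).encode()


def link_parallel(unit_names):
    # Each unit is processed independently; map() keeps the input order.
    with ThreadPoolExecutor() as pool:
        pieces = list(pool.map(clone_unit, unit_names))

    # Sequential glue step: only now is each piece's base offset known.
    bases = []
    base = 0
    for piece in pieces:
        bases.append(base)
        base += len(piece)
    return b"".join(pieces), bases
```
    </pre>
    <p>The base offsets only become known in the glue step, which is
      why references between pieces cannot be final while the pieces
      are being generated.</p>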
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAENS6EvdtmTPM0UTN4M=kbVmv4VF5jYS00kH-7bJ2mQ1ucK69w@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <div>(if two of these expandable types were in one CU - the
            start of the second type couldn't be known until the end
            because it might keep getting pushed later due to expansion
            of the first type) and/or having to revisit all the type
            references (the offset to the second type wouldn't be known
            until the end - so writing the offsets to refer to the type
            would have to be deferred until then).<br>
          </div>
        </div>
      </div>
    </blockquote>
    <p>That is the second problem: offsets are not known until the end
      of the file.<br>
      dsymutil already has that situation for inter-CU references, so it
      has an extra pass to fix up offsets. With a multi-threaded
      implementation, that situation would arise more often for type
      references, so more offsets would have to be fixed during the
      additional pass.<br>
    </p>
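    <p>To illustrate the fixup pass, here is a small sketch under
      simplifying assumptions: every DIE is a (die_id, ref_id, payload)
      tuple, a reference is a 4-byte little-endian absolute offset, and
      emit_units is a hypothetical name. It is not how dsymutil is
      implemented, only the general shape of the technique:</p>
    <pre>
```python
def emit_units(units):
    out = bytearray()
    offsets = {}   # die_id: absolute offset of the emitted DIE
    fixups = []    # (position of the 4-byte slot, target die_id)

    for unit in units:
        for die_id, ref_id, payload in unit:
            offsets[die_id] = len(out)
            out += payload
            if ref_id is None:
                continue
            if ref_id in offsets:
                # Backward reference: the offset is already known.
                out += offsets[ref_id].to_bytes(4, "little")
            else:
                # Forward reference: leave a placeholder for later.
                fixups.append((len(out), ref_id))
                out += bytes(4)

    # Extra pass: patch every placeholder once all offsets are known.
    for pos, ref_id in fixups:
        out[pos:pos + 4] = offsets[ref_id].to_bytes(4, "little")
    return bytes(out), offsets, len(fixups)
```
    </pre>
    <p>The more references point forward (or into another thread's
      output), the longer the fixups list gets.</p>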
    <blockquote type="cite"
cite="mid:CAENS6EvdtmTPM0UTN4M=kbVmv4VF5jYS00kH-7bJ2mQ1ucK69w@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p>DWARFLinker could create an additional artificial
                compile unit and put all merged types there. Later patch
                all type references to point into this additional
                compilation unit. Nothing would be duplicated in
                that case. The performance improvement would come from
                copying less DWARF and from the
                fact that type references could be updated while the
                DWARF is cloned (no additional pass needed for that).<br>
              </p>
            </div>
          </blockquote>
          <div>"later patch all type references to point into this
            additional compilation unit" - that's the additional pass
            that people are probably talking/concerned about. Rewalking
            all the DWARF. The current dsymutil approach, as far as I
            know, is single pass - it knows the final, absolute offset
            to the type from the moment it emits that type/needs to
            refer to it. <br>
          </div>
        </div>
      </div>
    </blockquote>
    <p>Right. The current dsymutil approach is single pass. From that
      point of view, the solution you described (producing a declaration
      of the type with the relevant members) allows keeping that
      single-pass implementation.<br>
      <br>
      But the current dsymutil approach has a restriction: to process
      inter-CU references it needs to load all the DWARF into memory
      (while it analyzes which parts of the DWARF are live, it needs to
      have all CUs loaded). That leads to huge memory usage. This is
      less important when the source is a set of object files (as in the
      dsymutil case), but it becomes a real problem for the
      llvm-dwarfutil utility, where the source is a single file (with
      the current implementation it needs 30G of memory to process the
      clang binary).<br>
      <br>
      Without loading all CUs into memory, a two-pass solution would be
      required: a first pass to analyze which parts of the DWARF relate
      to live code, and a second pass to generate the result. If we had
      a two-pass solution, we could create a compilation unit with all
      types during the first pass, and during the second pass we could
      generate the result with correct offsets (no need to fix them up,
      as dsymutil currently has to do for forward inter-CU
      references).<br>
      The open question currently is how expensive this two-pass
      approach is.<br>
    </p>
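    <p>A rough sketch of the two-pass idea, with heavy simplifications:
      entries are (kind, name, payload) tuples, types are deduplicated
      by name only (no member merging), references are 4-byte
      little-endian offsets into the artificial type unit, and
      link_two_pass is a hypothetical name:</p>
    <pre>
```python
def link_two_pass(units):
    # Pass 1: analysis. Collect each type once and assign its offset
    # inside the artificial type unit, which is placed first.
    type_offsets = {}
    type_unit = bytearray()
    for unit in units:
        for kind, name, payload in unit:
            if kind == "type" and name not in type_offsets:
                type_offsets[name] = len(type_unit)
                type_unit += payload

    # Pass 2: generation. Every type offset is already final, so each
    # reference is written correctly the first time; no fixups needed.
    out = bytearray(type_unit)
    for unit in units:
        for kind, name, payload in unit:
            if kind == "ref":
                out += payload
                out += type_offsets[name].to_bytes(4, "little")
    return bytes(out), type_offsets
```
    </pre>
    <p>The cost that remains to be measured is that the input has to be
      walked twice.</p>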
    <p>Thank you, Alexey.<br>
    </p>
    <blockquote type="cite"
cite="mid:CAENS6EvdtmTPM0UTN4M=kbVmv4VF5jYS00kH-7bJ2mQ1ucK69w@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p> </p>
              <p>Anyway, that might be the next step once the
                multi-threaded DWARFLinker is ready.<br>
              </p>
            </div>
          </blockquote>
          <div>Yep, be interesting to see how it all goes! </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p> </p>
              <blockquote type="cite">
                <div dir="ltr">
                  <div class="gmail_quote">
                    <div> </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">
                      <div> <br>
                        Do you suggest that 0x0000011b should be
                        transformed into something like this:<br>
                        <br>
                        0x000000fc: DW_TAG_compile_unit<br>
                                      DW_AT_language   
                        (DW_LANG_C_plus_plus)<br>
                                      DW_AT_name        ("templ.cpp")<br>
                                      DW_AT_stmt_list   (0x00000090)<br>
                                      DW_AT_low_pc     
                        (0x0000000100000fa0)<br>
                                      DW_AT_high_pc    
                        (0x0000000100000fab)<br>
                        <br>
                        0x0000011b:   DW_TAG_structure_type<br>
                                        DW_AT_specification (0x0000002a
                        "x")<br>
                        <br>
                        0x00000124:     DW_TAG_subprogram<br>
                                          DW_AT_linkage_name   
                        ("_ZN1x2f3IiEEiv")<br>
                                          DW_AT_name   
                        ("f3&lt;int&gt;")<br>
                                          DW_AT_type   
                        (0x000000000000005e "int")<br>
                                          DW_AT_declaration     (true)<br>
                                          DW_AT_external        (true)<br>
                                          DW_AT_APPLE_optimized (true)<br>
                        0x00000138:       NULL<br>
                        0x00000139:     NULL<br>
                        <br>
                        0x00000140:   DW_TAG_subprogram<br>
                                        DW_AT_low_pc   
                        (0x0000000100000fa0)<br>
                                        DW_AT_high_pc  
                        (0x0000000100000fab)<br>
                                        DW_AT_specification    
                        (0x0000000000000124 "_ZN1x2f3IiEEiv")<br>
                        0x00000155:     NULL<br>
                        <br>
                        Did I correctly get the idea?<br>
                      </div>
                    </blockquote>
                    <div><br>
                    </div>
                    <div>Yep, more or less. It'd be "safer" if 11b
                      didn't use DW_AT_specification to refer to 2a, but
                      instead was only a completely independent
                      declaration of "x" - that path is already well
                      supported/tested (well, it's the work-in-progress
                      stuff for lldb to support -fno-standalone-debug,
                      but gdb's been consuming DWARF like this for
                      years, Clang and GCC both produce DWARF like this
                      (if the type is "homed" in another file, then
                      Clang/GCC produce DWARF that emits a declaration
                      with just the members needed to define any member
                      functions defined/inlined/referenced in this CU))
                      for years.<br>
                      <br>
                      But using DW_AT_specification, or maybe some other
                      extension attribute, might make the consumer's task
                      a bit easier (could do both - use an extension
                      attribute to tie them up, leave
                      DW_AT_declaration/DW_AT_name here for consumers
                      that don't understand the extension attribute) in
                      finding that they're all the same type/pieces of
                      the same type.</div>
                    <div> </div>
                  </div>
                </div>
              </blockquote>
              <p>Yes, I will try this solution.</p>
              <p>Thank you, Alexey.<br>
              </p>
              <br>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
  </body>
</html>