<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 1:41 PM Alexey Lapshin <<a href="mailto:a.v.lapshin@mail.ru" target="_blank">a.v.lapshin@mail.ru</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">

    <p><br>

    </p>

    <div>19.09.2019 4:24, David Blaikie wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div dir="ltr"><br>

        </div>

        <br>

        <div class="gmail_quote">

          <div dir="ltr" class="gmail_attr">On Wed, Sep 18, 2019 at 7:25

            AM Alexey Lapshin <<a href="mailto:a.v.lapshin@mail.ru" target="_blank">a.v.lapshin@mail.ru</a>> wrote:<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <p><br>

              </p>

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote"><br>

                    <div> </div>

                  </div>

                </div>

              </blockquote>

              Generally speaking, dsymutil does a very similar thing. It

              parses DWARF DIEs, analyzes relocations, scans through

              references and throws out unused DIEs. But its current

              interface does not allow it to be used at the link stage. <br>

               I think it would be ideal to have a single

              implementation. <br>

               Though I have not analyzed how easy it would be, or whether

              it is even possible, to reuse its code at the link stage, it

              looks like it would need significant rework. <br>

               <br>

               The implementation in this proposal removes

              obsolete debug info at the link stage. <br>

               It therefore benefits from the object files already being

              loaded, from the liveness information already being computed, <br>

               and from generating an optimized binary from scratch.<br>

              <p><br>

              </p>

              <p>If dsymutil could be refactored in such a manner that

                it could be used at the link stage, then its

                implementation could be reused. I will research the

                possibility of such a refactoring.</p>

            </div>

          </blockquote>

          <div>Yeah, if this is going to be implemented, I think that

            would be strongly preferred - though I realize it may be

            substantial work to refactor. The alternative - duplicating

            all this work - doesn't seem like something that would be

            good for the LLVM project.<br>

          </div>

        </div>

      </div>

    </blockquote>

    <p>I see. So I will look into whether it is possible

      to refactor it accordingly.<br>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote">

                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                      <div>1. Minimize or entirely avoid references from

                        subprograms into other parts of the .debug_info

                        section. That would simplify splitting out and

                        removing subprograms, in the sense that it

                        would minimize the number of references that

                        have to be parsed and followed.

                        (DW_FORM_ref_subroutine instead of

                        DW_FORM_ref_*?)<br>

                      </div>

                    </blockquote>

                    <div><br>

                      Not sure I follow - by "other parts of the

                      .debug_info section" do you mean in the same CU,

                      or cross CU references? Any particular references

                      you have in mind? Or encountered in practice?<br>

                    </div>

                  </div>

                </div>

              </blockquote>

              I mean here all kinds of references into .debug_info

              section.</div>

          </blockquote>

          <div><br>

          </div>

          <div>Ah, not only references from other places /into/

            .debug_info (which don't really exist, so far as I know) but

            any references to locations within debug_info.<br>

            <br>

            Reducing these isn't super-viable - types being the most

            common examples. Though now I understand what you're getting

            at partly around the debug_type_table idea - adding a level

            of indirection to type references. So it'd be easy to find

            only one place to fix when removing chunks of debug_info

            (updating only the type table without having to find all the

            places inside debug_info to touch). That indirection would

            come at a size cost, of course - and an overhead for DWARF

            parsers having to follow that indirection. Doesn't make it

            impossible - just tradeoffs to be aware of.<br>

            <br>

            Though that's not the only DIE references - without removing

            them all there'd still be a fair bit of overhead for finding

            any remaining ones and applying them. If an indirection

            table is to be added, maybe a generalized one (for any DIE

            reference) rather than one only for types would be good.<br>

            <br>

          </div>
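          <div>A minimal Python sketch of the generalized indirection idea described above, not taken from any existing implementation: DIEs refer to each other by table slot rather than by byte offset, so removing a DIE means rebuilding only the table. The names <code>Die</code>, <code>layout</code> and <code>remove</code>, and all sizes, are hypothetical.</div>

```python
from dataclasses import dataclass, field


@dataclass
class Die:
    name: str
    size: int                                  # encoded size in bytes (made up)
    refs: list = field(default_factory=list)   # slots in the indirection table


def layout(dies):
    """Build the indirection table: slot i holds the byte offset of dies[i]."""
    table, offset = [], 0
    for die in dies:
        if die is None:
            table.append(None)    # removed DIE: slot kept so indices stay stable
        else:
            table.append(offset)
            offset += die.size
    return table


def remove(dies, index):
    """Drop one DIE and rebuild the table; no DIE's refs are rewritten."""
    dies[index] = None
    return layout(dies)


dies = [
    Die("main", 30, refs=[2, 3]),
    Die("unused_fn", 40),
    Die("int", 7),
    Die("S", 12, refs=[2]),
]
before = layout(dies)
after = remove(dies, 1)
# main's references are unchanged, yet they resolve to the new offsets.
resolved = [after[slot] for slot in dies[0].refs]
```

          <div>The cost mentioned above shows up here as well: every reference needs one extra table lookup, and the table itself takes space.</div>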

        </div>

      </div>

    </blockquote>

    <p>yes, some general indirection table would probably be useful. <br>

      But, types would still require specialized handling.<br>

      Types have "type hash" and need some specific logic around that.<br></p></div></blockquote><div>This indirection is essentially the same as relocations & could be implemented that way (though no matter the solution you'd need some attribute on the CU that says "I don't use any CU-local DIE offsets" so an implementation didn't have to go searching/scanning for such offsets (though I guess it'd be cheap to scan for that by just looking at the abbreviations & if you don't see any CU-local DIE offset forms, use the fast-path)). A custom DWARF format would be potentially more compact than general ELF relocations. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><p>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div>(aspects of this have been discussed before - we've

            sometimes nicknamed it "bag of DWARF" when discussing it in

            the context of type units (currently you can only reference

            the type DIE in a type unit - which adds overhead when

            wanting to reference subprogram declaration DIEs, etc (or

            maybe multiple types are clustered together and don't need a

            separate type unit each - if only you could refer to

            multiple types in a type unit) - so we've discussed

            generalizing the type unit header (actually it could

            generalize even as far as the classic CU header) to have N

            type DIE offset+hash pairs (zero for a normal CU, one for a

            classic type unit, and any number for more interesting

            cases))</div>

        </div>

      </div>

    </blockquote>

    <p>As far as I understand, "generalizing the type unit header

      (actually it could generalize even as far as the classic CU

      header) to have N type DIE offset+hash pairs" looks very close to

      "global type table" which I am talking about.<br>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF"> Going through references is the

              time-consuming part. <br>

              Thus the fewer references that have to be followed, the

              faster it works.<br>

               <br>

              Cross-CU references require loading the

              referenced CU. I do not know of use cases where cross-CU

              references are used.</div>

          </blockquote>

          <div><br>

            Cross-CU inlining due to LTO. Try something like this:<br>

            <br>

            <font face="monospace">a.cpp:<br>

                void f2();<br>

                __attribute__((always_inline)) void f1() {<br>

                  f2();<br>

                }<br>

              <br>

              b.cpp:<br>

                void f1();<br>

                int main() {<br>

                  f1();<br>

                }<br>

              <br>

              $ clang++ a.cpp b.cpp -emit-llvm -S -c -g<br>

              $ llvm-link a.ll b.ll -o ab.bc<br>

              $ clang++ ab.bc -c<br>

              $ llvm-dwarfdump ab.o -v -debug-info | <br>

              0x0b: DW_TAG_compile_unit<br>

                      DW_AT_name "a.cpp"<br>

              0x2a:   DW_TAG_subprogram<br>

                        DW_AT_abstract_origin [DW_FORM_ref4] (cu +

              0x0056 => {0x00000056} "_Z2f1v")<br>

                      DW_TAG_subprogram<br>

                        DW_AT_name "f1"<br>

              0x6e: DW_TAG_compile_unit<br>

                      DW_AT_name "b.cpp"<br>

              0x8d:   DW_TAG_subprogram<br>

                        DW_AT_name "main"<br>

              0xa6:     DW_TAG_inlined_subroutine<br>

                          DW_AT_abstract_origin [DW_FORM_ref_addr]

              (0x0000000000000056 "_Z2f1v")</font><br>

            <br>

            </div>

          <div><br>

            <br>

            Notice that the inlined_subroutine's abstract_origin uses a

            linker relocation into the debug_info section to give an

            absolute offset within the finally linked debug_info section

            (since the debugger wouldn't know that these two

            compile_units are bound together and to use some particular

            compile_unit as the base offset - either it's absolute

            across the whole debug_info section (FORM_ref_addr) or it's

            local to the CU (FORM_refN (such as FORM_ref4 above)))<br>

          </div>
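          <div>A small Python sketch of how the two kinds of reference form in the dump above resolve to an absolute .debug_info offset. The function name <code>resolve_die_reference</code> is hypothetical. The CU base of 0x63 for b.cpp is an assumption: it is inferred from the CU DIE at 0x6e and an 11-byte unit header, matching the 0x0b seen for a.cpp.</div>

```python
# Forms whose value is an offset from the start of the containing unit.
CU_RELATIVE_FORMS = {
    "DW_FORM_ref1",
    "DW_FORM_ref2",
    "DW_FORM_ref4",
    "DW_FORM_ref8",
    "DW_FORM_ref_udata",
}


def resolve_die_reference(form, value, cu_offset):
    """Return the absolute .debug_info offset that a DIE reference points at."""
    if form in CU_RELATIVE_FORMS:
        # Local to the CU: add the offset of the CU header.
        return cu_offset + value
    if form == "DW_FORM_ref_addr":
        # Already absolute within .debug_info (fixed up by a relocation).
        return value
    raise ValueError(f"not a DIE reference form handled here: {form}")


# a.cpp's CU starts at 0x00, so "cu + 0x0056" is 0x56.
local = resolve_die_reference("DW_FORM_ref4", 0x56, cu_offset=0x00)
# b.cpp's inlined_subroutine uses an absolute offset into a.cpp's CU.
cross = resolve_die_reference("DW_FORM_ref_addr", 0x56, cu_offset=0x63)
# Had b.cpp used a CU-local form with the same value, it would miss "_Z2f1v".
wrong = resolve_die_reference("DW_FORM_ref4", 0x56, cu_offset=0x63)
```

          <div>This is why removing bytes from one CU affects ref_addr values in other CUs, while CU-local values only change when bytes are removed between the CU header and the target DIE.</div>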

        </div>

      </div>

    </blockquote>

    <p>Got it. Thank you.<br>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF"> If that is the specific case and is

              not used inside subprograms usually, then probably it is

              possible to avoid it.<br>

            </div>

          </blockquote>

          <div><br>

            It's fairly specifically used inside subprograms (&

            would need to be adjusted even if it wasn't inside a

            subprogram - when bytes are removed, etc) - though possibly

            general relocation handling in the linker could be used to

            implement handling ref_addr.<br>

             </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">For the same CU - there could

              probably be cases when references could be ignored: <a href="https://reviews.llvm.org/P8165" target="_blank">https://reviews.llvm.org/P8165</a></div>

          </blockquote>

          <div><br>

            How would references be ignored while keeping them correct?

            Ah, by making subprograms more self-contained - maybe, but

            the work to figure out which things are only referenced from

            one place and structure the DWARF differently probably

            wouldn't be ideal in the compiler & wouldn't save the

            debug info linker from having to have code to handle the

            case where it wasn't only used from that subprogram anyway.<br>

             </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote">

                    <div> </div>

                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                      <div>2. Create additional section - global types

                        table (.debug_types_table). That would

                        significantly reduce the number of references

                        inside .debug_info section. It also makes it

                        possible to have a 4-byte reference in this

                        section instead of 8-bytes reference into type

                        unit (DW_FORM_ref_types instead of

                        DW_FORM_ref_sig8). It also makes it possible to

                        place base types into this section and avoid

                        per-compile unit duplication of them.

                        Additionally, a size reduction could be achieved

                        by not generating type unit headers.

                        Note that the new section - .debug_types_table -

                        differs from the DWARF4 section .debug_types in

                        that it contains unique type descriptors

                        referenced by offsets instead of a list of type

                        units referenced by DW_FORM_ref_sig8; all table

                        entries share the same abbreviations and do not

                        have type unit headers.<br>

                      </div>

                    </blockquote>

                    <div><br>

                      What do you mean when you say "global types table"?

                      The phrasing in the above paragraph is

                      present-tense, as though this thing exists, but it

                      doesn't seem to describe what it actually is and

                      how it achieves the things the text says it

                      achieves. Perhaps I've missed some context here.<br>

                    </div>

                  </div>

                </div>

              </blockquote>

              <p><br>

              </p>

              <p>The "global types table" does not exist yet. It could

                be created if the discussed approach is considered

                useful. <br>

              </p>

            </div>

          </blockquote>

          <div><br>

          </div>

          <div>Ah, the present-tense language was a bit confusing for me

            when discussing a thing that doesn't exist yet & not

            having provided a description of what it might be or might

            contain and why it would exist/what it would achieve.</div>

        </div>

      </div>

    </blockquote>

    <p>I should've written it more precisely.<br>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <p> Please check the comparison of a possible "global types

                table" and the currently existing type units: <a href="https://reviews.llvm.org/P8164" target="_blank">https://reviews.llvm.org/P8164</a></p>

            </div>

          </blockquote>

          <div>Ah, that proposed version makes it easy to remove

            subprograms from debug_info without having to fix up type

            references (but you still have to have the code to fix up

            other cross-CU references, like abstract_origin, so I'm not

            sure it provides that much value) but doesn't make it easy

            to remove types, because you'd have to go looking through

            the debug_info section to update all the type offsets (which

            I guess you have to do anyway to find the type references),

            and removing the types still also requires fixing up the

            types that reference each other... <br>

            <br>

            So I'm not seeing a big win there.</div>

        </div>

      </div>

    </blockquote>

    <p>Correct. Even if types were put into a separate table, both

      of these would still apply:  <br>

       "go looking through the debug_info section to update all the type

      offsets"; <br>

       "removing the types still also requires fixing up the types that

      reference each other".<br>

      <br>

       But it additionally provides the following benefits:<br>

      <br>

       1. Size reduction by removing fragmentation. In the

      "-fdebug-types-section" solution, every type that is put into a

      type unit requires:<br>

         - an additional type unit header, <br>

         - a section header (since it is put into a separate section), <br>

         - proxy type copies inside the compilation unit. <br>

      <br>

        Putting types into a separate table avoids creating the above

      data for every type.</p>

    <p>2. Size reduction by deduplicating base types. In the

      "-fdebug-types-section" solution, base types are not deduplicated

      at all.</p></div></blockquote><div><br>Base types are pretty small - not sure there'd be much to save by indirection (for classic base types like "int" - for non-trivial but non-user-defined types like subroutine types there might be more opportunity for savings). & you'd still have some cost of indirection to tradeoff - so I don't think it's always going to be the right solution to indirect everything. <br><br>There's a lot of design considerations in this problem space, let's put it that way.<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><p> </p>

    <p>3. Performance improvement by handling less data. #1 leads to

      loading and parsing fewer bits.<br>

      <br>

      4. Performance improvement by handling fewer references. Simpler

      reference chains allow references to be parsed faster.<br>

        Instead of this: <br>

      <br>

type_offset->proxy_type->DW_FORM_ref_sig8->type_unit->type_offset->type.<br>

      <br>

        There would be this:<br>

      <br>

        type_offset->type_table->type.<br></p></div></blockquote><div>Yep, though to avoid the need for the proxy type you'd need to be able to refer to other entities in the "bag of DWARF"/generalized type unit (things like member function declarations and the like)<br><br>Yes, "bag of DWARF" or generalized type units (where you can refer to multiple entities in a single unit by some kind of hash) has some benefits.<br><br>But it seems somewhat orthogonal to your debug info linking goals here, unless it is a solution that removes the need for parsing the DWARF.<br><br>Another way to consider this would be to model (or actually implement) inter-DIE references as relocations (DW_FORM_sec_offset instead of a cu offset) - ah, I mentioned that earlier (I'm writing this reply out of order).</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><p>

    </p>

    <p> </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote">

                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                      <div><br>

                        We evaluated the approach on LLVM and Clang

                        codebases. The results obtained are summarized

                        in the tables below:<br>

                      </div>

                    </blockquote>

                    <div><br>

                      Memory usage statistics (& confidence

                      intervals for the build time) would probably be

                      especially useful for comparing these tradeoffs.<br>

                      Doubly so when using compression (since the

                      decompression would need to use more memory, as

                      would the recompression - so, two different

                      tradeoffs (compressed input, compressed output,

                      and then both at the same time))<br>

                    </div>

                  </div>

                </div>

              </blockquote>

              <p>I will measure the memory impact of the PoC

                implementation, but I expect it to be significant. <br>

                Memory usage has not been optimized yet. There are several

                things that might be done to reduce the memory footprint:<br>

                do not load all compile units into memory, avoid adding

                a Parent field to all DIEs.</p>

            </div>

          </blockquote>

          <div>Yep, this is the sort of thing where I suspect the

            dsymutil implementation may've already had at least some of

            that work done - or, if not, that doing the work once for

            both/all implementations would be very preferable to

            duplicating the effort.<br>

          </div>

        </div>

      </div>

    </blockquote>

    <p>Ok, <br>

    </p>

    <p><br>

    </p>

    <p>Thank you, Alexey.<br>

    </p>

    <p><br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div><br>

            - Dave </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF">

              <p>Alexey.<br>

              </p>

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote">

                    <div> </div>

                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                      <div><br>

                      </div>


                    </blockquote>

                  </div>

                </div>

              </blockquote>

            </div>

          </blockquote>

        </div>

      </div>

    </blockquote>

  </div>

</blockquote></div></div>