[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

David Blaikie dblaikie at gmail.com
Fri Oct 4 11:22:03 PDT 2013


On Fri, Oct 4, 2013 at 9:41 AM, Eric Christopher <echristo at gmail.com> wrote:

> Hi Manman,
>
> In reviewing this code I've found a few problems with how you're
> emitting the debug info for ref_addr. It needs to be a relocatable
> value from the beginning of the section and while it's a reasonably
> straightforward change it is going to require some work to plumb I
> think. With this and the design discussions we've been going through
> with it I'm going to revert the patch for now and get you a testcase.
>

To expound on this further we saw crashes in GDB. Here's a worked example
demonstrating the problem:

hdr.h:
  struct foo {
  }

tu1.cpp
  #include "hdr.h"
  foo f;

tu2.cpp
  #include "hdr.h"
  foo g;

tu3.cpp
  int main() {
  }

$ clang++ tu1.cpp tu2.cpp -g -emit-llvm -c
$ llvm-link tu1.bc tu2.bc -o tu12.bc -S
$ llc -filetype=obj tu12.bc
$ clang++ tu3.cpp -g -c
$ clang++ tu3.o tu12.o
$ llvm-dwarfdump a.out
.debug_info contents:
0x00000000: Compile Unit: length = 0x00000047 version = 0x0004 abbr_offset
= 0x0000 addr_size = 0x08 (next unit at 0x0000004b)

0x0000000b: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp]     ( .debug_str[0x00000000] =
"clang version 3.4 ")
              DW_AT_language [DW_FORM_data2]    (0x0004)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000013] =
"tu3.cpp")
              DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)
              DW_AT_stmt_list [DW_FORM_sec_offset]      (0x00000000)
              DW_AT_comp_dir [DW_FORM_strp]     ( .debug_str[0x0000001b] =
"/tmp/dbginfo")

0x00000026:   DW_TAG_subprogram [2]
                DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000028] =
"main")
                DW_AT_decl_file [DW_FORM_data1] (0x01)
                DW_AT_decl_line [DW_FORM_data1] (0x01)
                DW_AT_type [DW_FORM_ref4]       (cu + 0x0043 =>
{0x00000043})
                DW_AT_external [DW_FORM_flag_present]   (true)
                DW_AT_low_pc [DW_FORM_addr]     (0x0000000000400590)
                DW_AT_high_pc [DW_FORM_addr]    (0x000000000040059b)
                DW_AT_frame_base [DW_FORM_block1]       (<0x01> 56 )

0x00000043:   DW_TAG_base_type [3]
                DW_AT_name [DW_FORM_strp]       ( .debug_str[0x0000002d] =
"int")
                DW_AT_encoding [DW_FORM_data1]  (0x05)
                DW_AT_byte_size [DW_FORM_data1] (0x04)

0x0000004a:   NULL
0x0000004b: Compile Unit: length = 0x00000040 version = 0x0004 abbr_offset
= 0x0032 addr_size = 0x08 (next unit at 0x0000008f)

0x00000056: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp]     ( .debug_str[0x00000000] =
"clang version 3.4 ")
              DW_AT_language [DW_FORM_data2]    (0x0004)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000031] =
"tu1.cpp")
              DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)
              DW_AT_stmt_list [DW_FORM_sec_offset]      (0x0000003b)
              DW_AT_comp_dir [DW_FORM_strp]     ( .debug_str[0x0000001b] =
"/tmp/dbginfo")

0x00000071:   DW_TAG_structure_type [2]
                DW_AT_name [DW_FORM_strp]       ( .debug_str[0x0000003b] =
"foo")
                DW_AT_byte_size [DW_FORM_data1] (0x01)
                DW_AT_decl_file [DW_FORM_data1] (0x02)
                DW_AT_decl_line [DW_FORM_data1] (0x01)

0x00000079:   DW_TAG_variable [3]
                DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000039] =
"f")
                DW_AT_type [DW_FORM_ref4]       (cu + 0x0026 =>
{0x00000071})
                DW_AT_external [DW_FORM_flag_present]   (true)
                DW_AT_decl_file [DW_FORM_data1] (0x01)
                DW_AT_decl_line [DW_FORM_data1] (0x02)
                DW_AT_location [DW_FORM_block1] (<0x09> 03 2c 20 40 00 00
00 00 00 )

0x0000008e:   NULL
0x0000008f: Compile Unit: length = 0x00000038 version = 0x0004 abbr_offset
= 0x0032 addr_size = 0x08 (next unit at 0x000000cb)

0x0000009a: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp]     ( .debug_str[0x00000000] =
"clang version 3.4 ")
              DW_AT_language [DW_FORM_data2]    (0x0004)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000003f] =
"tu2.cpp")
              DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)
              DW_AT_stmt_list [DW_FORM_sec_offset]      (0x0000003b)
              DW_AT_comp_dir [DW_FORM_strp]     ( .debug_str[0x0000001b] =
"/tmp/dbginfo")

0x000000b5:   DW_TAG_variable [4]
                DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000047] =
"g")
                DW_AT_external [DW_FORM_flag_present]   (true)
                DW_AT_decl_file [DW_FORM_data1] (0x03)
                DW_AT_decl_line [DW_FORM_data1] (0x02)
                DW_AT_location [DW_FORM_block1] (<0x09> 03 2d 20 40 00 00
00 00 00 )
                DW_AT_type [DW_FORM_ref_addr]   (0x0000000000000026)

0x000000ca:   NULL

So, while the type of the variable 'g' in tu2.cpp should be 'foo' (offset
0x71) it instead refers to the DIE for the main function in tu3.cpp. (this
gets even worse, of course, once the DIEs don't line up - and the offset
might not even point to the start of a DIE, but instead to some attribute
or portion of an attribute).

Does this help/make sense?


>
> This is a good idea for work and is something I definitely want us to
> pursue in the long run (or immediate term depending on your
> priorities) so don't look at this as anything more than me needing to
> revert due to a bug.
>
> Thanks!
>
> -eric
>
>
> On Thu, Oct 3, 2013 at 8:50 PM, Manman Ren <manman.ren at gmail.com> wrote:
> >
> >
> >
> > On Thu, Oct 3, 2013 at 4:24 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>
> >>
> >>
> >>
> >> On Thu, Oct 3, 2013 at 4:10 PM, Manman Ren <manman.ren at gmail.com>
> wrote:
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Oct 3, 2013 at 2:06 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>>>
> >>>>
> >>>>>> 3GB memory reduction for xalan built with lto -g (without removing
> DIE
> >>>>>> duplication, the memory usage is 7GB)
> >>>>>
> >>>>>
> >>>>> OK, so you're looking at memory usage during LTO, not the size of the
> >>>>> resulting debug info? (that's fine, just trying to understand
> exactly what
> >>>>> metrics you're targeting)
> >>>>
> >>>>
> >>>> One thought I had (and just discussed with Eric - he seemed to think
> it
> >>>> was plausible, at least):
> >>>>
> >>>> If you're interested in reducing peak memory usage of LLVM during LTO,
> >>>> have you considered modifying debug info generation not to cache all
> the
> >>>> compile_units before emitting them? If we emitted one CU at a time as
> they
> >>>> were created I imagine the memory footprint would be much lower.
> >>>
> >>>
> >>> Yes, that is on my todo list after type uniquing is done.
> >>
> >>
> >> So, as I said, I think it might be better to do that work first. It may
> >> have a substantial impact on cross-CU references. (see below)
> >>
> >>>
> >>> Right now, we don't release any memory and we have layers of memory
> usage
> >>> related to debug info: MDNodes, DIEs and some data in MC layer.
> >>>  All these layers exist for the whole program with LTO.
> >>>
> >>>>
> >>>> (and we could still, potentially, add the type cross-referencing by
> just
> >>>> caching the section offsets or labels, etc, of the types emitted in
> prior
> >>>> units - though that wouldn't be perfect for types that contain some
> members
> >>>> (implicit special members) in one CU and a different set of members in
> >>>> another CU (and it'll be a bit trickier for type units - we'd need to
> cache
> >>>> any subgraph of potentially self-referencing type units until all the
> >>>> signatures are resolved, then they could all be emitted))
> >>>>
> >>>> This isn't necessarily a place you should start with - I can
> appreciate
> >>>> that your current approach is probably higher value to you for lower
> >>>> engineering cost (and thus gets you closer sooner) but if it's still
> not
> >>>> going to be sufficient for your needs, especially if you're going to
> have to
> >>>> implement something like emit-CU-at-a-time as I describe above
> anyway, it
> >>>> might be more valuable to lay that foundation first.
> >>>
> >>>
> >>> Even if we emit one CU at a time, it is still good to share the types
> >>> across-CU so we don't waist time constructing the type DIEs for each
> CU, and
> >>> this approach (remove DIE duplication) also reduces the raw DWARF size.
> >>
> >>
> >> I agree, from an output size perspective we'll still want to share DIEs
> >> across CUs - but I think you might see bigger memory footprint (since
> that's
> >> the stat you're aiming at at the moment) wins from this change, but more
> >> importantly - I think a change like that will have a significant impact
> on a
> >> feature like the cross-CU DIE referencing you're doing here.
> >>
> >> If we emit one CU at a time, cross-CU DIE references will look very
> >> different - it won't be simply a matter of getting a DIE from the
> cross-CU
> >> map anymore. Likely we'll want to store a much more minimal form of the
> DIE
> >> (just it's section offset or label) in a table once the prior CU has
> been
> >> emitted. Then when a future CU is emitted and wants to reference that,
> we
> >> can lookup that remnant in the table - the whole DIE won't exist
> anymore.
> >
> >
> > With "emit-a-CU-at-a-time", we can still keep the type DIEs around since
> > they are going to be shared across CUs.
> > We can also store a minimal form of the type DIEs to further reduce the
> > memory footprint.
> >
> > I agree that we have to change the cross-CU references when we only
> store a
> > minimal form of the type DIEs. But the change should
> > not be complicated, we now store a map from MDNode to DIE, and that can
> be
> > changed to map from MDNode to MinimalDIE.
> >
> > "emit-a-CU-at-a-time" is a known solution to reduce memory footprint,
> but I
> > don't have a time line for when I will work on it.
> >
> >>
> >>
> >> It seems like the bigger change should go first so we don't avoid
> >> discussing a lot of this design only to have to rework it substantially
> when
> >> we emit a CU-at-a-time.
> >
> > Since we already did a lot of discussion about this design and it is not
> > complicated (about 30 to 40 lines of code), let's discuss
> > "emit-a-CU-at-a-time" when it is somebody's immediate task.
> >
> > Thanks,
> > Manman
> >
> >>
> >>>
> >>>
> >>>>
> >>>>
> >>>> If not, I'd still probably like to discuss the design of the cross-CU
> >>>> referencing work you have here to decide on the right design in a
> normal
> >>>> patch review process rather than trying to think about how the
> current code
> >>>> can be morphed into a better design.
> >>>
> >>>
> >>> What other designs do you have in mind?
> >>
> >>
> >> Just some of the stuff we've already been discussing such as merging the
> >> maps and simplifying the DwarfDebug entry points. But if you're already
> >> planning to do CU-at-a-time, I wonder whether it's worth discussing the
> >> design of cross-CU references before we do CU-at-a-time - it seems to me
> >> like the implementation would change quite significantly.
> >>
> >>>
> >>>
> >>> Manman
> >>>>
> >>>>
> >>>>
> >>>> - David
> >>>
> >>>
> >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131004/be0f0d05/attachment.html>


More information about the llvm-commits mailing list