[PATCH] D67469: [WIP][Debuginfo][LLD] Remove obsolete debug info while garbage collecting.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 11 14:44:48 PDT 2019


avl created this revision.
Herald added subscribers: llvm-commits, mgrang, MaskRay, hiraditya, arichardson, aprantl, mgorny, emaste.
Herald added a reviewer: espindola.
Herald added a project: LLVM.

Remove obsolete debug info while garbage collecting.

This patch is a proof-of-concept implementation for the problem of removing obsolete debug info during garbage collection.
It implements two features, "Remove obsolete debug info" and "Alternative implementation for the types deduplication":




1. Remove obsolete debug info.

  Currently, when the linker does garbage collection, a lot of abandoned debug info is left behind. The patch skips the debug info (subprograms, address ranges, line sequences) that corresponds to the dead sections.

  A command-line option is added to lld: --gc-debuginfo. It removes the pieces of debug information related to the discarded sections.

  Short description of the implementation:
  1. Examine the compile unit's address ranges, check whether they resolve to a live section, and put the corresponding subprograms into the list of live subprograms.

    For the .debug_info section:
  2. Parse abbreviations and note attributes of reference type. DIEs with references must be patched, since offsets can become incorrect once parts of the .debug_info section are removed.
  3. Go through the list of live subprograms, parse them, scan through all their references, and mark the reached DIEs as live.
  4. Scan all of the compile unit's DIEs in sequential order and mark live DIEs for copying into the output section.
  5. After marking is finished, scan for .debug_info references and patch them according to the new section layout.

    For the .debug_ranges/.debug_rnglists sections:
  6. Mark address entries related to live sections for copying into the output section.

    For the .debug_line section:
  7. Mark line program pieces related to live sections for copying into the output section.
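
  Steps 3 and 4 amount to a reachability walk over the DIE graph. The following is only a minimal sketch of that idea and is not taken from the patch: the Die struct and markLive() are hypothetical names, and each DIE is reduced to a list of outgoing edges (its children and the targets of its reference attributes).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified DIE: only the offsets of the DIEs it leads to
// (its children and the targets of reference attributes such as DW_AT_type).
struct Die {
  std::vector<uint64_t> edges;
};

// Returns the offsets of all DIEs reachable from the live root DIEs.
// `dies` is keyed by the DIE offset inside .debug_info.
std::set<uint64_t> markLive(const std::map<uint64_t, Die> &dies,
                            const std::vector<uint64_t> &liveRoots) {
  std::set<uint64_t> live;
  std::vector<uint64_t> worklist(liveRoots);
  while (!worklist.empty()) {
    uint64_t offset = worklist.back();
    worklist.pop_back();
    auto it = dies.find(offset);
    // Skip unknown offsets and DIEs that were already visited.
    if (it == dies.end() || !live.insert(offset).second)
      continue;
    for (uint64_t target : it->second.edges)
      worklist.push_back(target);
  }
  return live;
}
```

  With the DIEs of mod1.cpp as input and the subprogram at 0x2a as the only live root, such a walk would reach the parameter at 0x47 and the base type at 0x82, but not the unused subprogram at 0x56.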

    Result (for gc-debuginfo-wo-types.s):

    A. .debug_info section:

              w/o --gc-debuginfo:                |         with --gc-debuginfo:
                                                 |
                                                 |
  0x0000000b: DW_TAG_compile_unit                | 0x0000000b: DW_TAG_compile_unit
    DW_AT_name        ("mod1.cpp")               |   DW_AT_name        ("mod1.cpp")
    DW_AT_low_pc      (0x0000000000000000)       |   DW_AT_low_pc      (0x0000000000000000)
    DW_AT_ranges      (0x00000000                |   DW_AT_ranges      (0x00000000
       [0x0000000000201010, 0x0000000000201018)  |     [0x0000000000201010, 0x0000000000201018))
       [0x0000000000000000, 0x0000000000000008)) |
                                                 |
  0x0000002a:   DW_TAG_subprogram                | 0x0000002a:   DW_TAG_subprogram
    DW_AT_low_pc    (0x0000000000201010)         |   DW_AT_low_pc    (0x0000000000201010)
    DW_AT_high_pc   (0x0000000000201018)         |   DW_AT_high_pc   (0x0000000000201018)
    DW_AT_name      ("func_mod1_used")           |   DW_AT_name      ("func_mod1_used")
    DW_AT_type      (0x00000082 "int")           |   DW_AT_type      (0x00000056 "int")
                                                 |
  0x00000047:     DW_TAG_formal_parameter        | 0x00000047:     DW_TAG_formal_parameter
    DW_AT_name    ("p1")                         |   DW_AT_name    ("p1")
    DW_AT_type    (0x00000082 "int")             |   DW_AT_type    (0x00000056 "int")
                                                 |
  0x00000055:     NULL                           | 0x00000055:     NULL
                                                 |
  0x00000056:   DW_TAG_subprogram                | 0x00000056:   DW_TAG_base_type
    DW_AT_low_pc    (0x0000000000000000)         |   DW_AT_name      ("int")
    DW_AT_high_pc   (0x0000000000000008)         |
    DW_AT_name      ("func_mod1_not_used")       | 0x0000005d:   NULL
    DW_AT_type      (0x00000082 "int")           |
                                                 |
  0x00000073:     DW_TAG_formal_parameter        |
    DW_AT_name    ("p1")                         |
    DW_AT_type    (0x00000082 "int")             |
                                                 | 
  0x00000081:     NULL                           |
                                                 |
  0x00000082:   DW_TAG_base_type                 |
    DW_AT_name      ("int")                      |
                                                 |
  0x00000089:   NULL                             |
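
  The new DW_AT_type value 0x00000056 above follows from the layout change: the DIEs at 0x56, 0x73 and 0x81 are dropped, so everything behind them moves forward. A minimal sketch of that offset computation, assuming a hypothetical DieSpan record per DIE (the sizes used in the note below are derived from the offsets in the dump; the names are not from the patch):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record for one DIE: where it starts, how many bytes it
// occupies, and whether the marking phase kept it.
struct DieSpan {
  uint64_t offset;
  uint64_t size;
  bool live;
};

// Maps the old offset of every live DIE to its offset in the compacted unit.
// `spans` must be sorted by offset and cover the DIEs contiguously.
std::map<uint64_t, uint64_t>
computeNewOffsets(const std::vector<DieSpan> &spans) {
  std::map<uint64_t, uint64_t> remap;
  if (spans.empty())
    return remap;
  uint64_t next = spans.front().offset; // the first DIE keeps its position
  for (const DieSpan &span : spans) {
    if (!span.live)
      continue; // dead DIEs take no space in the output
    remap[span.offset] = next;
    next += span.size;
  }
  return remap;
}
```

  Fed with the nine DIEs of the left column, this maps 0x82 to 0x56 and the final NULL at 0x89 to 0x5d, which agrees with the right column. References are then patched by looking up their old target in the map.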

    B. .debug_ranges section:

              w/o --gc-debuginfo:                |       with --gc-debuginfo:
                                                 |
  00000000 0000000000201010 0000000000201018     | 00000000 0000000000201010 0000000000201018 
  00000000 0000000000000000 0000000000000008     | 00000000 <End of list>
  00000000 <End of list>                         | 00000020 0000000000201020 0000000000201033
  00000030 0000000000201020 0000000000201033     | 00000020 <End of list>
  00000030 0000000000000000 000000000000000c     |
  00000030 <End of list>                         |
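
  The second list moves from offset 0x30 to 0x20 because one 16-byte entry is removed from the first list. A minimal sketch of this compaction, assuming 64-bit addresses, the .debug_ranges (not .debug_rnglists) encoding, and a per-entry liveness flag that has already been computed; RangeEntry, CompactedRanges and compactRanges() are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical range list entry; `live` says whether the address range
// belongs to a section that survived garbage collection.
struct RangeEntry {
  uint64_t begin;
  uint64_t end;
  bool live;
};
using RangeList = std::vector<RangeEntry>;

struct CompactedRanges {
  std::vector<RangeList> lists;  // lists with dead entries removed
  std::vector<uint64_t> offsets; // new section offset of each list
};

CompactedRanges compactRanges(const std::vector<RangeList> &lists) {
  const uint64_t entrySize = 16; // two 8-byte addresses per entry
  CompactedRanges out;
  uint64_t offset = 0;
  for (const RangeList &list : lists) {
    out.offsets.push_back(offset);
    RangeList kept;
    for (const RangeEntry &entry : list)
      if (entry.live)
        kept.push_back(entry);
    // Every list is closed by an end-of-list entry of the same size.
    offset += (kept.size() + 1) * entrySize;
    out.lists.push_back(kept);
  }
  return out;
}
```

  For the two lists shown above this yields the offsets 0x00 and 0x20. Attributes that point into the section, such as DW_AT_ranges, would presumably have to be updated to the new offsets as well.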

    C. .debug_line section:

w/o --gc-debuginfo:

  debug_line[0x0000005c]
  .....
  Address            Line   Column File   ISA Discriminator Flags
  ------------------ ------ ------ ------ --- ------------- -------------
  0x0000000000201010      3      0      1   0             0  is_stmt
  0x0000000000201014      4     15      1   0             0  is_stmt prologue_end
  0x0000000000201017      4      5      1   0             0 
  0x0000000000201018      4      5      1   0             0  end_sequence
  0x0000000000000000      7      0      1   0             0  is_stmt
  0x0000000000000004      8     15      1   0             0  is_stmt prologue_end
  0x0000000000000007      8      5      1   0             0 
  0x0000000000000008      8      5      1   0             0  end_sequence

with --gc-debuginfo:

  debug_line[0x0000005c]
  .....
  Address            Line   Column File   ISA Discriminator Flags
  ------------------ ------ ------ ------ --- ------------- -------------
  0x0000000000201010      3      0      1   0             0  is_stmt
  0x0000000000201014      4     15      1   0             0  is_stmt prologue_end
  0x0000000000201017      4      5      1   0             0 
  0x0000000000201018      4      5      1   0             0  end_sequence
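
The effect on the line table can be modelled at the row level: rows are grouped into sequences terminated by end_sequence, and a whole sequence is dropped when it belongs to a dead section. This is only an illustrative sketch; the patch is described as working on line program pieces rather than on decoded rows, and LineRow, keepLiveSequences() and the liveness predicate are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical decoded row of a line table (only the fields needed here).
struct LineRow {
  uint64_t address;
  unsigned line;
  bool endSequence;
};

// Keeps only the sequences whose first address is considered live.
// Rows after the last end_sequence are dropped in this sketch.
std::vector<LineRow>
keepLiveSequences(const std::vector<LineRow> &rows,
                  const std::function<bool(uint64_t)> &isLiveAddress) {
  std::vector<LineRow> out, sequence;
  for (const LineRow &row : rows) {
    sequence.push_back(row);
    if (!row.endSequence)
      continue;
    if (isLiveAddress(sequence.front().address))
      out.insert(out.end(), sequence.begin(), sequence.end());
    sequence.clear();
  }
  return out;
}
```

Applied to the eight rows of the first table, with a predicate that treats the addresses starting at 0x201010 as live, only the first four rows remain, as in the second table.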




2. Alternative implementation for the types deduplication.

  An implementation of type deduplication already exists: -fdebug-types-section. This patch uses another solution: parse DWARF, cut out duplicated types, and patch type references to point to the single remaining type definition.

  A command-line option is added to lld: --gc-debuginfo-types. It performs the alternative type deduplication while doing --gc-debuginfo.

  Short description of the implementation:
  1. Parse abbreviations and note attributes of reference type. References of the DW_FORM_ref4 kind that point to types are changed into DW_FORM_ref_addr, since DW_FORM_ref4 offsets are relative to the containing unit and the remaining type definition may be located in another compile unit.
  2. Scan all of the compile unit's DIEs in sequential order and calculate a type hash for each type. Mark new type descriptions as live and put their offsets into the type map. Skip type descriptions that already have a matching entry in the type map.
  3. After marking is finished, scan for type references and patch them according to the created map of type offsets. Also scan for .debug_info references and patch them according to the new section layout.
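
  The type map of step 2 can be sketched as follows. This is an assumption-laden illustration, not the code from the patch: TypeDie and dedupTypes() are hypothetical, and a plain string key stands in for the type hash (a real hash would also have to deal with collisions and with the contents of the type, not just its name).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical type description: its offset in .debug_info and a key that
// stands in for the type hash.
struct TypeDie {
  uint64_t offset;
  std::string key;
};

// Maps the offset of every type description to the offset of the first
// description with the same key. `types` must be in scan order.
std::map<uint64_t, uint64_t> dedupTypes(const std::vector<TypeDie> &types) {
  std::unordered_map<std::string, uint64_t> typeMap; // key -> first offset
  std::map<uint64_t, uint64_t> canonical;
  for (const TypeDie &type : types) {
    // emplace() leaves an existing entry untouched, so the first
    // description seen for a key stays the canonical one.
    auto result = typeMap.emplace(type.key, type.offset);
    canonical[type.offset] = result.first->second;
  }
  return canonical;
}
```

  For gc-debuginfo-with-types.s the "int" descriptions at 0xc2 and 0x169 would both map to the first one at 0x56, and "SS" at 0x170 would map to 0xc9. The later layout patching then moves 0xc9 to 0xc2, because the 7-byte duplicate "int" DIE in front of it is removed.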

    Result (for gc-debuginfo-with-types.s):

              w/o --gc-debuginfo-types:                   |       with --gc-debuginfo-types:
                                                          |
  0x00000000: Compile Unit: length = 0x0000005a           | 0x00000000: Compile Unit: length = 0x0000005a 
                                                          |
  0x0000000b: DW_TAG_compile_unit                         | 0x0000000b: DW_TAG_compile_unit
    DW_AT_name        ("mod1.cpp")                        |   DW_AT_name        ("mod1.cpp")
                                                          |
  0x0000002a:   DW_TAG_subprogram                         | 0x0000002a:   DW_TAG_subprogram 
    DW_AT_name      ("func_mod1_used")                    |   DW_AT_name      ("func_mod1_used")
    DW_AT_type      (0x00000056 "int")                    |   DW_AT_type      (0x0000000000000056 "int")
                                                          |
  0x00000047:     DW_TAG_formal_parameter                 | 0x00000047:     DW_TAG_formal_parameter
    DW_AT_name    ("p1")                                  |   DW_AT_name    ("p1")
    DW_AT_type    (0x00000056 "int")                      |   DW_AT_type    (0x0000000000000056 "int")
                                                          |
  0x00000055:     NULL                                    | 0x00000055:     NULL
                                                          |
  0x00000056:   DW_TAG_base_type                          | 0x00000056:   DW_TAG_base_type
    DW_AT_name      ("int")                               |   DW_AT_name      ("int")
                                                          |
  0x0000005d:   NULL                                      | 0x0000005d:   NULL
                                                          |
  0x0000005e: Compile Unit: length = 0x000000a7           | 0x0000005e: Compile Unit: length = 0x000000a0
                                                          |
  0x00000069: DW_TAG_compile_unit                         | 0x00000069: DW_TAG_compile_unit
    DW_AT_name        ("mod2.cpp")                        |   DW_AT_name        ("mod2.cpp")
                                                          |
  0x00000088:   DW_TAG_subprogram                         | 0x00000088:   DW_TAG_subprogram
    DW_AT_name      ("func_mod2_used")                    |   DW_AT_name      ("func_mod2_used")
    DW_AT_type      (0x000000c2 "int")                    |   DW_AT_type      (0x0000000000000056 "int")          
                                                          |
  0x000000a5:     DW_TAG_formal_parameter                 | 0x000000a5:     DW_TAG_formal_parameter
    DW_AT_name    ("p1")                                  |   DW_AT_name    ("p1")
    DW_AT_type    (0x000000c9 "SS")                       |   DW_AT_type    (0x00000000000000c2 "SS")
                                                          |
  0x000000b3:     DW_TAG_variable                         | 0x000000b3:     DW_TAG_variable
    DW_AT_name    ("is")                                  |   DW_AT_name    ("is")
    DW_AT_type    (0x000000ea "inner_SS")                 |   DW_AT_type    (0x00000000000000e3 "inner_SS")
                                                          |
  0x000000c1:     NULL                                    | 0x000000c1:     NULL
                                                          |
  0x000000c2:   DW_TAG_base_type                          |
    DW_AT_name      ("int")                               |
                                                          |
  0x000000c9:   DW_TAG_structure_type                     | 0x000000c2:   DW_TAG_structure_type
    DW_AT_name      ("SS")                                |   DW_AT_name      ("SS")
                                                          |
  0x000000d2:     DW_TAG_member                           | 0x000000cb:     DW_TAG_member
    DW_AT_name    ("a1")                                  |   DW_AT_name    ("a1")
    DW_AT_type    (0x000000c2 "int")                      |   DW_AT_type    (0x0000000000000056 "int")
                                                          |
  0x000000de:     DW_TAG_member                           | 0x000000d7:     DW_TAG_member
    DW_AT_name    ("a2")                                  |   DW_AT_name    ("a2")
    DW_AT_type    (0x00000101 "float")                    |   DW_AT_type    (0x00000000000000fa "float")
                                                          |
  0x000000ea:     DW_TAG_structure_type                   | 0x000000e3:     DW_TAG_structure_type
    DW_AT_name    ("inner_SS")                            |   DW_AT_name    ("inner_SS")
                                                          |
  0x000000f3:       DW_TAG_member                         | 0x000000ec:       DW_TAG_member
    DW_AT_name  ("a1")                                    |   DW_AT_name  ("a1")
    DW_AT_type  (0x000000c2 "int")                        |   DW_AT_type  (0x0000000000000056 "int")
                                                          |
  0x000000ff:       NULL                                  | 0x000000f8:       NULL
                                                          |
  0x00000100:     NULL                                    | 0x000000f9:     NULL
                                                          |
  0x00000101:   DW_TAG_base_type                          | 0x000000fa:   DW_TAG_base_type
    DW_AT_name      ("float")                             |   DW_AT_name      ("float")
                                                          |
  0x00000108:   NULL                                      | 0x00000101:   NULL
                                                          |
  0x00000109: Compile Unit: length = 0x000000a3           | 0x00000102: Compile Unit: length = 0x0000005d
                                                          |
  0x00000114: DW_TAG_compile_unit                         | 0x0000010d: DW_TAG_compile_unit
    DW_AT_name        ("main.cpp")                        |   DW_AT_name        ("main.cpp")
                                                          |
  0x00000133:   DW_TAG_subprogram                         | 0x0000012c:   DW_TAG_subprogram
    DW_AT_name      ("main")                              |   DW_AT_name      ("main")
    DW_AT_type      (0x00000169 "int")                    |   DW_AT_type      (0x0000000000000056 "int")
                                                          |
  0x0000014c:     DW_TAG_variable                         | 0x00000145:     DW_TAG_variable
    DW_AT_name    ("s")                                   |   DW_AT_name    ("s")
    DW_AT_type    (0x00000170 "SS")                       |   DW_AT_type    (0x00000000000000c2 "SS")
                                                          |
  0x0000015a:     DW_TAG_variable                         | 0x00000153:     DW_TAG_variable
    DW_AT_name    ("is")                                  |   DW_AT_name    ("is")
    DW_AT_type    (0x00000191 "inner_SS")                 |   DW_AT_type    (0x00000000000000e3 "inner_SS")
                                                          |
  0x00000168:     NULL                                    | 0x00000161:     NULL
                                                          |
  0x00000169:   DW_TAG_base_type                          | 0x00000162:   NULL
    DW_AT_name      ("int")                               |
                                                          |
  0x00000170:   DW_TAG_structure_type                     |
    DW_AT_name      ("SS")                                |
                                                          |
  0x00000179:     DW_TAG_member                           |
    DW_AT_name    ("a1")                                  |
    DW_AT_type    (0x00000169 "int")                      |
                                                          |
  0x00000185:     DW_TAG_member                           |
    DW_AT_name    ("a2")                                  |
    DW_AT_type    (0x000001a8 "float")                    |
                                                          |
  0x00000191:     DW_TAG_structure_type                   |
    DW_AT_name    ("inner_SS")                            |
                                                          |
  0x0000019a:       DW_TAG_member                         |
    DW_AT_name  ("a1")                                    |
    DW_AT_type  (0x00000169 "int")                        |
                                                          |
  0x000001a6:       NULL                                  |
                                                          |
  0x000001a7:     NULL                                    |
                                                          |
  0x000001a8:   DW_TAG_base_type                          |
    DW_AT_name      ("float")                             |
                                                          |
  0x000001af:   NULL                                      |

Description of files:

lld/ELF/MarkDebuginfo.h (20 lines)
lld/ELF/MarkDebuginfo.cpp (98 lines)

  Main routine: markUsedDebuginfo().

lld/ELF/MarkDebugSectionInfo.h (197 lines)
lld/ELF/MarkDebugSectionInfo.cpp (639 lines)

  Parses .debug_info.

lld/ELF/MarkDebugSectionLines.h (142 lines)

  Parses .debug_line.

lld/ELF/MarkDebugSectionRanges.h (212 lines)

  Parses .debug_ranges and .debug_rnglists.

lld/ELF/MarkDebuginfoTypeHash.h (95 lines)
lld/ELF/MarkDebuginfoTypeHash.cpp (404 lines)

  Implements the type hash used for comparing types.

lld/ELF/InputSection.h

  Implements DebugInputSection.


https://reviews.llvm.org/D67469

Files:
  lld/ELF/CMakeLists.txt
  lld/ELF/Config.h
  lld/ELF/Driver.cpp
  lld/ELF/InputFiles.cpp
  lld/ELF/InputSection.cpp
  lld/ELF/InputSection.h
  lld/ELF/MarkDebugSectionInfo.cpp
  lld/ELF/MarkDebugSectionInfo.h
  lld/ELF/MarkDebugSectionLines.h
  lld/ELF/MarkDebugSectionRanges.h
  lld/ELF/MarkDebuginfo.cpp
  lld/ELF/MarkDebuginfo.h
  lld/ELF/MarkDebuginfoCommon.h
  lld/ELF/MarkDebuginfoTypeHash.cpp
  lld/ELF/MarkDebuginfoTypeHash.h
  lld/ELF/Options.td
  lld/test/ELF/Inputs/gc-debuginfo-main.ll
  lld/test/ELF/Inputs/gc-debuginfo-mod1.ll
  lld/test/ELF/Inputs/gc-debuginfo-mod2.ll
  lld/test/ELF/gc-debuginfo-with-types.s
  lld/test/ELF/gc-debuginfo-wo-types.s
  llvm/include/llvm/DebugInfo/DWARF/DWARFAbbreviationDeclaration.h
  llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLine.h
  llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h
  llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
  llvm/lib/DebugInfo/DWARF/DWARFAbbreviationDeclaration.cpp
  llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
  llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
  llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
  llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67469.219808.patch
Type: text/x-patch
Size: 122931 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190911/e03a587e/attachment.bin>

