[PATCH] D115325: [DWARF] Fix PR51087 Extraneous enum record in DWARF with type units

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 17 13:16:40 PST 2022


dblaikie added a comment.

In D115325#3248802 <https://reviews.llvm.org/D115325#3248802>, @Orlando wrote:

> Sorry it's taken me a while to get back to you on this - thanks for all that info. Here is a summary of the situation as far as I understand it:
>
> (without any version of these patches applied)
>
> 1. When type units are enabled, types that get a TU also get a skeleton CU DIE.

Not entirely true - if a type is referenced from another type unit, and not from a retained type, there's no skeleton DIE in the CU.

e.g.:

  $ clang++-tot test.cpp -g -c -fdebug-types-section && llvm-dwarfdump-tot test.o -debug-info -debug-types 
  test.o: file format elf64-x86-64
  
  .debug_info contents:
  0x00000000: Compile Unit: length = 0x00000039, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x0000003d)
  
  0x0000000b: DW_TAG_compile_unit
                DW_AT_producer    ("clang version 14.0.0 (git at github.com:llvm/llvm-project.git 499703e9c08a055a9007278e5222b0550aba89bf)")
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_name        ("test.cpp")
                DW_AT_stmt_list   (0x00000000)
                DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
  
  0x0000001e:   DW_TAG_variable
                  DW_AT_name      ("v1")
                  DW_AT_type      (0x00000033 "t2")
                  DW_AT_external  (true)
                  DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
                  DW_AT_decl_line (3)
                  DW_AT_location  (DW_OP_addr 0x0)
  
  0x00000033:   DW_TAG_structure_type
                  DW_AT_declaration       (true)
                  DW_AT_signature (0x4e6669a661523280)
  
  0x0000003c:   NULL
  
  .debug_types contents:
  0x00000000: Type Unit: length = 0x0000003a, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 't2', type_signature = 0x4e6669a661523280, type_offset = 0x001e (next unit at 0x0000003e)
  
  0x00000017: DW_TAG_type_unit
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_stmt_list   (0x00000000)
  
  0x0000001e:   DW_TAG_structure_type
                  DW_AT_calling_convention        (DW_CC_pass_by_value)
                  DW_AT_name      ("t2")
                  DW_AT_byte_size (0x01)
                  DW_AT_decl_file ("test.cpp")
                  DW_AT_decl_line (2)
  
  0x00000027:     DW_TAG_member
                    DW_AT_name    ("v1")
                    DW_AT_type    (0x00000034 "t1")
                    DW_AT_decl_file       ("test.cpp")
                    DW_AT_decl_line       (2)
                    DW_AT_data_member_location    (0x00)
  
  0x00000033:     NULL
  
  0x00000034:   DW_TAG_structure_type
                  DW_AT_declaration       (true)
                  DW_AT_signature (0xc6694e51369161f2)
  
  0x0000003d:   NULL
  0x00000000: Type Unit: length = 0x00000024, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 't1', type_signature = 0xc6694e51369161f2, type_offset = 0x001e (next unit at 0x00000028)
  
  0x00000017: DW_TAG_type_unit
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_stmt_list   (0x00000000)
  
  0x0000001e:   DW_TAG_structure_type
                  DW_AT_calling_convention        (DW_CC_pass_by_value)
                  DW_AT_name      ("t1")
                  DW_AT_byte_size (0x01)
                  DW_AT_decl_file ("test.cpp")
                  DW_AT_decl_line (1)
  
  0x00000027:   NULL
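
The test.cpp source isn't included in this message. Going by the names and the DW_AT_decl_line values in the dump above, it was presumably something along these lines (a reconstruction inferred from the DWARF, not the original file):

```cpp
// Hypothetical reconstruction of test.cpp, inferred from the dump above.
struct t1 {};          // line 1: gets its own type unit, referenced only from t2's type unit
struct t2 { t1 v1; };  // line 2: member "v1" of type t1 (DW_AT_byte_size 0x01)
t2 v1;                 // line 3: global variable, so the CU gets a skeleton DIE for t2 only
```

Nothing in the CU refers to t1 directly, which is why only t2 has a skeleton (DW_AT_declaration + DW_AT_signature) DIE in .debug_info.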



> 2. Some unused types are retained on purpose.

Yep.

> 3. Types that are unreferenced within a CU still get (skeleton) DIEs in the CU, even if type units enabled (what this patch is trying to "fix").

Right.

> (using pubnames to refer to both pubnames and pubtypes)
>
> 4. -gpubnames / .debug_pubnames is not compatible with type units.

Doesn't seem to be correct:

  $ clang++-tot test.cpp -g -c -fdebug-types-section -gpubnames && llvm-dwarfdump-tot test.o -debug-info -debug-types -debug-pubtypes
  test.o: file format elf64-x86-64
  
  .debug_info contents:
  0x00000000: Compile Unit: length = 0x00000039, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x0000003d)
  
  0x0000000b: DW_TAG_compile_unit
                DW_AT_producer    ("clang version 14.0.0 (git at github.com:llvm/llvm-project.git 499703e9c08a055a9007278e5222b0550aba89bf)")
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_name        ("test.cpp")
                DW_AT_stmt_list   (0x00000000)
                DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
                DW_AT_GNU_pubnames        (true)
  
  0x0000001e:   DW_TAG_variable
                  DW_AT_name      ("v1")
                  DW_AT_type      (0x00000033 "t2")
                  DW_AT_external  (true)
                  DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
                  DW_AT_decl_line (3)
                  DW_AT_location  (DW_OP_addr 0x0)
  
  0x00000033:   DW_TAG_structure_type
                  DW_AT_declaration       (true)
                  DW_AT_signature (0x4e6669a661523280)
  
  0x0000003c:   NULL
  
  .debug_types contents:
  0x00000000: Type Unit: length = 0x0000003a, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 't2', type_signature = 0x4e6669a661523280, type_offset = 0x001e (next unit at 0x0000003e)
  
  0x00000017: DW_TAG_type_unit
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_stmt_list   (0x00000000)
  
  0x0000001e:   DW_TAG_structure_type
                  DW_AT_calling_convention        (DW_CC_pass_by_value)
                  DW_AT_name      ("t2")
                  DW_AT_byte_size (0x01)
                  DW_AT_decl_file ("test.cpp")
                  DW_AT_decl_line (2)
  
  0x00000027:     DW_TAG_member
                    DW_AT_name    ("v1")
                    DW_AT_type    (0x00000034 "t1")
                    DW_AT_decl_file       ("test.cpp")
                    DW_AT_decl_line       (2)
                    DW_AT_data_member_location    (0x00)
  
  0x00000033:     NULL
  
  0x00000034:   DW_TAG_structure_type
                  DW_AT_declaration       (true)
                  DW_AT_signature (0xc6694e51369161f2)
  
  0x0000003d:   NULL
  0x00000000: Type Unit: length = 0x00000024, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 't1', type_signature = 0xc6694e51369161f2, type_offset = 0x001e (next unit at 0x00000028)
  
  0x00000017: DW_TAG_type_unit
                DW_AT_language    (DW_LANG_C_plus_plus_14)
                DW_AT_stmt_list   (0x00000000)
  
  0x0000001e:   DW_TAG_structure_type
                  DW_AT_calling_convention        (DW_CC_pass_by_value)
                  DW_AT_name      ("t1")
                  DW_AT_byte_size (0x01)
                  DW_AT_decl_file ("test.cpp")
                  DW_AT_decl_line (1)
  
  0x00000027:   NULL
  
  .debug_pubtypes contents:
  length = 0x0000001c, format = DWARF32, version = 0x0002, unit_offset = 0x00000000, unit_size = 0x0000003d
  Offset     Name
  0x0000000b "t1"
  0x00000033 "t2"

So the "t2" pubtype entry references the offset of the skeleton DIE in the CU (0x33). "t1" references the CU DIE (0x0b) because there is no skeleton to reference. But "t1" is still reachable from that CU (via "t2").

GCC's debug info uses the CU DIE offset for both "t1" and "t2", which is a promising sign that CU references are adequate.
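
As a rough illustration of the offsets observed in the dump above, here is a toy model of which CU-relative offset a pubtypes entry ends up with. This is not LLVM's implementation; TypeInCU, pubtypeOffset and the constants are made-up names for the sketch, and it only mirrors the Clang output shown here (GCC, as noted, uses the CU DIE offset in both cases).

```cpp
#include <cstdint>

// Toy model only: names and layout are hypothetical, not LLVM's data structures.
struct TypeInCU {
  const char *Name;
  bool HasSkeletonDie;             // does the CU contain a skeleton DIE for this type?
  std::uint32_t SkeletonDieOffset; // meaningful only if HasSkeletonDie is true
};

// Offset of DW_TAG_compile_unit in the dump above.
constexpr std::uint32_t CuDieOffset = 0x0b;

// Point at the skeleton DIE when there is one; otherwise fall back to the
// CU DIE, since the type is still reachable from that CU.
constexpr std::uint32_t pubtypeOffset(const TypeInCU &T) {
  return T.HasSkeletonDie ? T.SkeletonDieOffset : CuDieOffset;
}

constexpr TypeInCU T1 = {"t1", false, 0};   // referenced only from t2's type unit
constexpr TypeInCU T2 = {"t2", true, 0x33}; // referenced by the variable v1
```

Either way the entry only ever names something inside a CU, which is the limitation that matters for a type unit no CU references.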

> 5. -ggnu-pubnames / .debug_gnu_pubnames is compatible with type units.

Yep, same as non-gnu pubnames/pubtypes here. (generally they have the same content - gnu_pub* came into existence because GCC couldn't rely on the contents of the standard pubnames/pubtypes, so they invented a separate name that could contain a more reliable list/agreement between GCC and GDB, and Clang got in on that at some point - but there's no need for Clang or GCC to produce anything different into non-gnu pubnames/pubtypes)

> 6. .debug_names (DWARFv5) is compatible with type units.

In the spec yes, but not in LLVM yet. https://github.com/llvm/llvm-project/blob/964dc368e7c72ad5b7ab995c97920c4deb624631/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L335

> Do you know what the deciding factor is in whether or not one of these kinds of tables is compatible with type units (4, 5, 6)? Naively, I assume that it is whether there is provision for referencing a TU directly from the name table? i.e., .debug_gnu_pubnames and .debug_names can both directly reference a TU, but .debug_pubnames cannot.

Nah, pubnames/pubtypes (gnu and non-gnu) don't have any way to actually reference a type unit - they "support" type units by describing the CU that references the type. This is why it's not obvious how to fix the bug you're dealing with - emitting a type in a type unit that would be unreferenced (even indirectly) from any CU. There'd be no place to put an index entry that'd allow the DWARF Consumer to find that entity.

For accelerator tables (Apple's, and the standardized form in DWARFv5): the Apple version was designed for Apple's situation, where they don't use type units (because they use dsymutil, a DWARF-aware linker, so there's no need to make types in linker-agnostic units, etc). But when it was standardized into debug_names it got support for type units in some form. That support looks more like what you're describing: the ability to reference different units, specifically to reference CUs and TUs by offset, but also to reference "foreign" TUs by hash. The latter works for referencing type units in Split DWARF, since they have no skeleton unit in the .o file (not talking about skeleton type DIEs here, but the skeleton units as part of Split DWARF) and so can only be referenced from the .o/exe by type hash, not by unit offset.
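
To make the three kinds of unit reference concrete, here is a small sketch of how a DWARFv5 name index distinguishes them. It assumes the v5 layout in which the type unit list holds local TUs first and foreign TUs after them, so an index at or past the local count names a foreign TU. UnitKind, identifiedBy and classifyTypeUnitIndex are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical names; a sketch of the .debug_names unit kinds, not LLVM code.
enum class UnitKind { CompileUnit, LocalTypeUnit, ForeignTypeUnit };

// How each kind of unit is identified in the index.
constexpr const char *identifiedBy(UnitKind K) {
  return K == UnitKind::ForeignTypeUnit
             ? "8-byte type signature (hash)" // e.g. type units in Split DWARF
             : "unit offset";                 // CUs and local TUs
}

// Classify a type unit index, assuming local TUs are numbered first and
// foreign TUs follow them.
constexpr UnitKind classifyTypeUnitIndex(std::uint32_t Index,
                                         std::uint32_t LocalTypeUnitCount) {
  return Index < LocalTypeUnitCount ? UnitKind::LocalTypeUnit
                                    : UnitKind::ForeignTypeUnit;
}
```

With two local type units, for instance, index 1 is a local TU found by offset, while index 2 is the first foreign TU and can only be matched by its signature.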

> If my assumption is true, then I would think the answer to:
>
>> But doesn't explain how we should create pubnames for types that don't have skeleton references from a CU.
>
> is that we only have to do something special for DWARF v4 name tables (.debug_pubnames). But this seems to contradict "No accelerator tables if type units are enabled, as DWARF v4 type units are not compatible with accelerator tables.", which seems to suggest that we don't emit DWARF v4 accelerator tables at all when type units are enabled (name tables == accelerator tables?). I think I am still missing part of this picture. On that note, I found .debug_pubnames and .debug_names documentation easily (v4 and v5 spec), but I'm struggling to find any for .debug_gnu_pubnames.
>
> In any case, the final point in summary:
>
> 7. The complexity we'd introduce in order to adhere to the various constraints makes fixing the issue probably not worth it.
>
> So this should probably be abandoned, at least in its current form.

Possibly. If you have a really pathological case it may still be worth further discussion, but yeah, I don't know of a particularly easy path forward.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115325/new/

https://reviews.llvm.org/D115325


