[PATCH] D123534: [dwarf] Emit a DIGlobalVariable for constant strings.

Mitch Phillips via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 26 16:08:11 PDT 2022


hctim added a comment.

> summary of DWARF:
> & how many of these descriptions get added to the debug info?

afaict, there is now:
 1x .debug_addr entry for each string
 1x .debug_info DW_TAG_variable for each string
 1x DW_TAG_array_type + DW_TAG_subrange_type for each unique sizeof(string)

i tried to measure whether there are other bits lying around that could be optimised. i thought briefly about diffing the llvm-dwarfdump output before/after for clang, but as the dump files reached 20GB, i rethought that decision. the dwarfdump for the clang/test/CodeGen/debug-info-variables.c dwo is below.

> Numbers for Split DWARF may be helpful too - given this'll add an extra address/relocation for every string literal, it might make object size (specifically unlinked object size where relocations are expensive/plentiful) significantly larger in problematic ways.

sorry, i don't understand why split-dwarf means this requires an additional relocation (i'm not really sure what split-dwarf is outside of just putting the dwarf in a separate file, but don't see why that would change relocations). i made a quick dwarfdump diff on clang/test/CodeGen/debug-info-variables.c (with split-dwarf):

sections old:

  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 2] .debug_str.dwo    PROGBITS        0000000000000000 000040 0000eb 01 MSE  0   0  1
  [ 3] .debug_str_offsets.dwo PROGBITS   0000000000000000 00012b 00002c 00   E  0   0  1
  [ 4] .debug_info.dwo   PROGBITS        0000000000000000 000157 000077 00   E  0   0  1
  [ 5] .debug_abbrev.dwo PROGBITS        0000000000000000 0001ce 000091 00   E  0   0  1

sections new:

  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 3] .debug_str.dwo    PROGBITS        0000000000000000 000078 0000ff 01 MSE  0   0  1
  [ 2] .debug_str_offsets.dwo PROGBITS   0000000000000000 000040 000038 00   E  0   0  1
  [ 4] .debug_info.dwo   PROGBITS        0000000000000000 000177 000092 00   E  0   0  1
  [ 5] .debug_abbrev.dwo PROGBITS        0000000000000000 000209 0000aa 00   E  0   0  1

so `.debug_str += 0x14`, `.debug_str_offsets += 0xc`, `.debug_info += 0x1b` and `.debug_abbrev += 0x19`.
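
a quick sanity check of those deltas, as a sketch only: the sizes are copied from the Size column of the two section listings above, and `section_growth` is a hypothetical helper name, not part of any llvm tool.

```python
# sizes (hex) copied from the "Size" column of the old/new section listings
old_sizes = {
    ".debug_str.dwo": 0xEB,
    ".debug_str_offsets.dwo": 0x2C,
    ".debug_info.dwo": 0x77,
    ".debug_abbrev.dwo": 0x91,
}
new_sizes = {
    ".debug_str.dwo": 0xFF,
    ".debug_str_offsets.dwo": 0x38,
    ".debug_info.dwo": 0x92,
    ".debug_abbrev.dwo": 0xAA,
}


def section_growth(old, new):
    """return {section name: new size - old size} for every section in `old`."""
    return {name: new[name] - old[name] for name in old}


growth = section_growth(old_sizes, new_sizes)
total_growth = sum(growth.values())

for name, delta in growth.items():
    print(f"{name} += {delta:#x}")
print(f"total += {total_growth:#x}")
```

note that the old dump is DWARF version 4 and the new one is version 5 (see the two Compile Unit header lines in the diff below), so some of this growth may come from the version change rather than from the patch itself.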

as before, the DW_TAG_array_type + DW_TAG_subrange_type would be amortised across strings with the same size.
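
to make the amortisation concrete, here is a rough estimate, assuming the per-DIE byte counts that can be read off the DIE offsets in the new .debug_info.dwo dump in this comment. `estimate_string_overhead` is a hypothetical helper; it ignores .debug_abbrev and .debug_str growth, and it assumes 1-byte forms throughout (decl_line and count as data1, a 1-byte addrx index), which will not hold for large files.

```python
# byte counts derived from DIE offsets in the new .debug_info.dwo dump
VARIABLE_DIE = 0x33 - 0x29     # DW_TAG_variable for one string literal
ARRAY_TYPE_DIES = 0x3F - 0x33  # DW_TAG_array_type + DW_TAG_subrange_type + NULL
SIZE_TYPE_DIE = 0x47 - 0x43    # __ARRAY_SIZE_TYPE__ base type, assumed once per CU
ADDR_ENTRY = 8                 # addr_size = 0x08: one .debug_addr slot per string


def estimate_string_overhead(literal_sizes):
    """rough extra bytes for one CU, given sizeof() of each string literal."""
    count = len(literal_sizes)
    if count == 0:
        return {"debug_info": 0, "debug_addr": 0}
    unique_sizes = len(set(literal_sizes))  # array types are shared per size
    return {
        "debug_info": count * VARIABLE_DIE
        + unique_sizes * ARRAY_TYPE_DIES
        + SIZE_TYPE_DIE,
        "debug_addr": count * ADDR_ENTRY,
    }


# the test file has a single char[11] literal
print(estimate_string_overhead([11]))
# three literals, two of which share a size and therefore an array type
print(estimate_string_overhead([11, 11, 5]))
```

for the single `char [11]` literal this gives 26 bytes of .debug_info, which is within one byte of the 0x1b measured above.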

unfortunately, i don't see any further places to optimise (except for the full `const char*` amortization, which, as noted in a previous comment, didn't make much of an improvement for the entire clang binary).

  diff --git a/tmp/dwo/dwo-dump b/dwo-dump
  index e0fdd77..6415086 100644
  --- a/tmp/dwo/dwo-dump
  +++ b/dwo-dump
  @@ -3,14 +3,13 @@ debug-info-variables.dwo:	file format elf64-x86-64
   .debug_abbrev.dwo contents:
   Abbrev table for offset: 0x00000000
   [1] DW_TAG_compile_unit	DW_CHILDREN_yes
  -	DW_AT_producer	DW_FORM_GNU_str_index
  +	DW_AT_producer	DW_FORM_strx1
   	DW_AT_language	DW_FORM_data2
  -	DW_AT_name	DW_FORM_GNU_str_index
  -	DW_AT_GNU_dwo_name	DW_FORM_GNU_str_index
  -	DW_AT_GNU_dwo_id	DW_FORM_data8
  +	DW_AT_name	DW_FORM_strx1
  +	DW_AT_dwo_name	DW_FORM_strx1
   
   [2] DW_TAG_variable	DW_CHILDREN_no
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_type	DW_FORM_ref4
   	DW_AT_external	DW_FORM_flag_present
   	DW_AT_decl_file	DW_FORM_data1
  @@ -18,155 +17,194 @@ Abbrev table for offset: 0x00000000
   	DW_AT_location	DW_FORM_exprloc
   
   [3] DW_TAG_base_type	DW_CHILDREN_no
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_encoding	DW_FORM_data1
   	DW_AT_byte_size	DW_FORM_data1
   
  -[4] DW_TAG_subprogram	DW_CHILDREN_no
  -	DW_AT_low_pc	DW_FORM_GNU_addr_index
  +[4] DW_TAG_variable	DW_CHILDREN_no
  +	DW_AT_type	DW_FORM_ref4
  +	DW_AT_decl_file	DW_FORM_data1
  +	DW_AT_decl_line	DW_FORM_data1
  +	DW_AT_location	DW_FORM_exprloc
  +
  +[5] DW_TAG_array_type	DW_CHILDREN_yes
  +	DW_AT_type	DW_FORM_ref4
  +
  +[6] DW_TAG_subrange_type	DW_CHILDREN_no
  +	DW_AT_type	DW_FORM_ref4
  +	DW_AT_count	DW_FORM_data1
  +
  +[7] DW_TAG_base_type	DW_CHILDREN_no
  +	DW_AT_name	DW_FORM_strx1
  +	DW_AT_byte_size	DW_FORM_data1
  +	DW_AT_encoding	DW_FORM_data1
  +
  +[8] DW_TAG_subprogram	DW_CHILDREN_no
  +	DW_AT_low_pc	DW_FORM_addrx
   	DW_AT_high_pc	DW_FORM_data4
   	DW_AT_frame_base	DW_FORM_exprloc
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_decl_file	DW_FORM_data1
   	DW_AT_decl_line	DW_FORM_data1
   	DW_AT_type	DW_FORM_ref4
   	DW_AT_external	DW_FORM_flag_present
   
  -[5] DW_TAG_subprogram	DW_CHILDREN_yes
  -	DW_AT_low_pc	DW_FORM_GNU_addr_index
  +[9] DW_TAG_subprogram	DW_CHILDREN_yes
  +	DW_AT_low_pc	DW_FORM_addrx
   	DW_AT_high_pc	DW_FORM_data4
   	DW_AT_frame_base	DW_FORM_exprloc
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_decl_file	DW_FORM_data1
   	DW_AT_decl_line	DW_FORM_data1
   	DW_AT_prototyped	DW_FORM_flag_present
   	DW_AT_type	DW_FORM_ref4
   	DW_AT_external	DW_FORM_flag_present
   
  -[6] DW_TAG_formal_parameter	DW_CHILDREN_no
  +[10] DW_TAG_formal_parameter	DW_CHILDREN_no
   	DW_AT_location	DW_FORM_exprloc
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_decl_file	DW_FORM_data1
   	DW_AT_decl_line	DW_FORM_data1
   	DW_AT_type	DW_FORM_ref4
   
  -[7] DW_TAG_variable	DW_CHILDREN_no
  +[11] DW_TAG_variable	DW_CHILDREN_no
   	DW_AT_location	DW_FORM_exprloc
  -	DW_AT_name	DW_FORM_GNU_str_index
  +	DW_AT_name	DW_FORM_strx1
   	DW_AT_decl_file	DW_FORM_data1
   	DW_AT_decl_line	DW_FORM_data1
   	DW_AT_type	DW_FORM_ref4
   
  -[8] DW_TAG_pointer_type	DW_CHILDREN_no
  +[12] DW_TAG_pointer_type	DW_CHILDREN_no
   	DW_AT_type	DW_FORM_ref4
   
  -[9] DW_TAG_const_type	DW_CHILDREN_no
  +[13] DW_TAG_const_type	DW_CHILDREN_no
   	DW_AT_type	DW_FORM_ref4
   
   
   .debug_info.dwo contents:
  -0x00000000: Compile Unit: length = 0x00000073, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000077)
  +0x00000000: Compile Unit: length = 0x0000008e, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0xa7a0aceb112fa998 (next unit at 0x00000092)
   
  -0x0000000b: DW_TAG_compile_unit
  -              DW_AT_producer	("clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)")
  +0x00000014: DW_TAG_compile_unit
  +              DW_AT_producer	("clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)")
                 DW_AT_language	(DW_LANG_C99)
                 DW_AT_name	("/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c")
  -              DW_AT_GNU_dwo_name	("debug-info-variables.dwo")
  -              DW_AT_GNU_dwo_id	(0x02255219cf8b78f3)
  +              DW_AT_dwo_name	("debug-info-variables.dwo")
   
  -0x00000019:   DW_TAG_variable
  +0x0000001a:   DW_TAG_variable
                   DW_AT_name	("global")
  -                DW_AT_type	(0x00000024 "int")
  +                DW_AT_type	(0x00000025 "int")
                   DW_AT_external	(true)
                   DW_AT_decl_file	(0x01)
                   DW_AT_decl_line	(4)
  -                DW_AT_location	(DW_OP_GNU_addr_index 0x0)
  +                DW_AT_location	(DW_OP_addrx 0x0)
   
  -0x00000024:   DW_TAG_base_type
  +0x00000025:   DW_TAG_base_type
                   DW_AT_name	("int")
                   DW_AT_encoding	(DW_ATE_signed)
                   DW_AT_byte_size	(0x04)
   
  -0x00000028:   DW_TAG_subprogram
  -                DW_AT_low_pc	(indexed (00000001) address = <unresolved>)
  +0x00000029:   DW_TAG_variable
  +                DW_AT_type	(0x00000033 "char [11]")
  +                DW_AT_decl_file	(0x01)
  +                DW_AT_decl_line	(8)
  +                DW_AT_location	(DW_OP_addrx 0x1)
  +
  +0x00000033:   DW_TAG_array_type
  +                DW_AT_type	(0x0000003f "char")
  +
  +0x00000038:     DW_TAG_subrange_type
  +                  DW_AT_type	(0x00000043 "__ARRAY_SIZE_TYPE__")
  +                  DW_AT_count	(0x0b)
  +
  +0x0000003e:     NULL
  +
  +0x0000003f:   DW_TAG_base_type
  +                DW_AT_name	("char")
  +                DW_AT_encoding	(DW_ATE_signed_char)
  +                DW_AT_byte_size	(0x01)
  +
  +0x00000043:   DW_TAG_base_type
  +                DW_AT_name	("__ARRAY_SIZE_TYPE__")
  +                DW_AT_byte_size	(0x08)
  +                DW_AT_encoding	(DW_ATE_unsigned)
  +
  +0x00000047:   DW_TAG_subprogram
  +                DW_AT_low_pc	(indexed (00000002) address = <unresolved>)
                   DW_AT_high_pc	(0x0000000d)
                   DW_AT_frame_base	(DW_OP_reg6 RBP)
                   DW_AT_name	("s")
                   DW_AT_decl_file	(0x01)
                   DW_AT_decl_line	(7)
  -                DW_AT_type	(0x00000068 "const char *")
  +                DW_AT_type	(0x00000087 "const char *")
                   DW_AT_external	(true)
   
  -0x00000037:   DW_TAG_subprogram
  -                DW_AT_low_pc	(indexed (00000002) address = <unresolved>)
  +0x00000056:   DW_TAG_subprogram
  +                DW_AT_low_pc	(indexed (00000003) address = <unresolved>)
                   DW_AT_high_pc	(0x00000018)
                   DW_AT_frame_base	(DW_OP_reg6 RBP)
                   DW_AT_name	("sum")
                   DW_AT_decl_file	(0x01)
                   DW_AT_decl_line	(14)
                   DW_AT_prototyped	(true)
  -                DW_AT_type	(0x00000024 "int")
  +                DW_AT_type	(0x00000025 "int")
                   DW_AT_external	(true)
   
  -0x00000046:     DW_TAG_formal_parameter
  +0x00000065:     DW_TAG_formal_parameter
                     DW_AT_location	(DW_OP_fbreg -4)
                     DW_AT_name	("p")
                     DW_AT_decl_file	(0x01)
                     DW_AT_decl_line	(14)
  -                  DW_AT_type	(0x00000024 "int")
  +                  DW_AT_type	(0x00000025 "int")
   
  -0x00000051:     DW_TAG_formal_parameter
  +0x00000070:     DW_TAG_formal_parameter
                     DW_AT_location	(DW_OP_fbreg -8)
                     DW_AT_name	("q")
                     DW_AT_decl_file	(0x01)
                     DW_AT_decl_line	(14)
  -                  DW_AT_type	(0x00000024 "int")
  +                  DW_AT_type	(0x00000025 "int")
   
  -0x0000005c:     DW_TAG_variable
  +0x0000007b:     DW_TAG_variable
                     DW_AT_location	(DW_OP_fbreg -12)
                     DW_AT_name	("r")
                     DW_AT_decl_file	(0x01)
                     DW_AT_decl_line	(15)
  -                  DW_AT_type	(0x00000024 "int")
  -
  -0x00000067:     NULL
  +                  DW_AT_type	(0x00000025 "int")
   
  -0x00000068:   DW_TAG_pointer_type
  -                DW_AT_type	(0x0000006d "const char")
  +0x00000086:     NULL
   
  -0x0000006d:   DW_TAG_const_type
  -                DW_AT_type	(0x00000072 "char")
  +0x00000087:   DW_TAG_pointer_type
  +                DW_AT_type	(0x0000008c "const char")
   
  -0x00000072:   DW_TAG_base_type
  -                DW_AT_name	("char")
  -                DW_AT_encoding	(DW_ATE_signed_char)
  -                DW_AT_byte_size	(0x01)
  +0x0000008c:   DW_TAG_const_type
  +                DW_AT_type	(0x0000003f "char")
   
  -0x00000076:   NULL
  +0x00000091:   NULL
   
   .debug_str.dwo contents:
   0x00000000: "global"
   0x00000007: "int"
  -0x0000000b: "s"
  -0x0000000d: "char"
  -0x00000012: "sum"
  -0x00000016: "p"
  -0x00000018: "q"
  -0x0000001a: "r"
  -0x0000001c: "clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)"
  -0x00000085: "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
  -0x000000d2: "debug-info-variables.dwo"
  +0x0000000b: "char"
  +0x00000010: "__ARRAY_SIZE_TYPE__"
  +0x00000024: "s"
  +0x00000026: "sum"
  +0x0000002a: "p"
  +0x0000002c: "q"
  +0x0000002e: "r"
  +0x00000030: "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)"
  +0x00000099: "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
  +0x000000e6: "debug-info-variables.dwo"
   
   .debug_str_offsets.dwo contents:
  -0x00000000: Contribution size = 44, Format = DWARF32, Version = 4
  -0x00000000: 00000000 "global"
  -0x00000004: 00000007 "int"
  -0x00000008: 0000000b "s"
  -0x0000000c: 0000000d "char"
  -0x00000010: 00000012 "sum"
  -0x00000014: 00000016 "p"
  -0x00000018: 00000018 "q"
  -0x0000001c: 0000001a "r"
  -0x00000020: 0000001c "clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)"
  -0x00000024: 00000085 "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
  -0x00000028: 000000d2 "debug-info-variables.dwo"
  +0x00000000: Contribution size = 52, Format = DWARF32, Version = 5
  +0x00000008: 00000000 "global"
  +0x0000000c: 00000007 "int"
  +0x00000010: 0000000b "char"
  +0x00000014: 00000010 "__ARRAY_SIZE_TYPE__"
  +0x00000018: 00000024 "s"
  +0x0000001c: 00000026 "sum"
  +0x00000020: 0000002a "p"
  +0x00000024: 0000002c "q"
  +0x00000028: 0000002e "r"
  +0x0000002c: 00000030 "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)"
  +0x00000030: 00000099 "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
  +0x00000034: 000000e6 "debug-info-variables.dwo"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123534/new/

https://reviews.llvm.org/D123534
