[llvm-dev] distinct DISubprograms hindering sharing inlined subprogram descriptions
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Thu Dec 15 13:44:07 PST 2016
Beefed up the prototype to use a unit if it's there, and manually modified
some IR to drop the unit and distinct on the linkonce_odr functions in the
example. Linked them (did the right thing, deduplicated the subprograms) &
codegen'd the result: got the right answer (like the original improved
output I proposed).
Just an FYI/idea/thing.
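For anyone wondering why dropping the unit and the distinct lets the linker
merge the two f2 descriptions, here's a toy model of the idea (not LLVM's
actual code): non-distinct nodes are uniqued by their contents, distinct nodes
never are, and a differing unit field keeps otherwise identical nodes apart.
The names below (Subprogram, MetadataContext, linkedF2Count) are made up for
illustration, and the field set is an assumption cut down to what the example
needs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <tuple>

// Toy stand-in for a subprogram description (hypothetical, not llvm::DISubprogram).
struct Subprogram {
    std::string linkageName;
    std::string declFile;
    unsigned declLine;
    std::string unit;  // empty string models "no CU link"
    bool distinct;     // distinct nodes are never merged
};

// Toy stand-in for a metadata context shared by all modules being linked.
class MetadataContext {
public:
    // Returns the node to use for sp. Non-distinct nodes are looked up by
    // content (including the unit field); distinct nodes always get a new entry.
    const Subprogram *get(const Subprogram &sp) {
        if (sp.distinct) {
            nodes_.push_back(sp);
            return &nodes_.back();
        }
        Key key(sp.linkageName, sp.declFile, sp.declLine, sp.unit);
        auto it = uniqued_.find(key);
        if (it != uniqued_.end())
            return it->second;
        nodes_.push_back(sp);  // std::deque keeps earlier element addresses stable
        uniqued_[key] = &nodes_.back();
        return &nodes_.back();
    }

    std::size_t nodeCount() const { return nodes_.size(); }

private:
    using Key = std::tuple<std::string, std::string, unsigned, std::string>;
    std::deque<Subprogram> nodes_;
    std::map<Key, const Subprogram *> uniqued_;
};

// "Links" the f2 description from inl1.cpp and inl2.cpp into one context and
// returns how many f2 nodes survive.
std::size_t linkedF2Count(bool distinct, bool withUnit) {
    MetadataContext ctx;
    ctx.get({"_Z2f2v", "inl.h", 2, withUnit ? "inl1.cpp" : "", distinct});
    ctx.get({"_Z2f2v", "inl.h", 2, withUnit ? "inl2.cpp" : "", distinct});
    return ctx.nodeCount();
}

// Usage: the three configurations discussed in the thread.
void checkLinkedF2Counts() {
    assert(linkedF2Count(true, true) == 2);    // distinct + unit: stay separate
    assert(linkedF2Count(false, true) == 2);   // unit alone still keeps them apart
    assert(linkedF2Count(false, false) == 1);  // neither: one shared f2
}
```

In this model only the last configuration (no unit, not distinct) ends up with
a single f2, which matches what the hand-modified IR did when linked.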
On Thu, Dec 15, 2016 at 11:53 AM David Blaikie <dblaikie at gmail.com> wrote:
> On Thu, Dec 15, 2016 at 11:35 AM Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>
> > On Dec 15, 2016, at 10:54 AM, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Branching off from a discussion of improvements to DIGlobalVariable
> representations that Adrian's working on - got me thinking about related
> changes that have already been made to DISubprogram.
> >
> > To reduce duplicate debug info when things like linkonce_odr functions
> were deduplicated in LTO linking, the relationship between a CU and
> DISubprogram was inverted (instead of a CU maintaining a list of
> subprograms, subprograms specify which CU they come from - and the
> llvm::Function references the DISubprogram, so if the llvm::Function goes
> away, so does the associated DISubprogram)
> >
> > I'm not sure if this caused a regression, but it at least seems to miss a
> possible improvement:
> >
> > During IR linking (for LTO, ThinLTO, etc) these distinct DISubprogram
> definitions (& their CU link, even if they weren't marked 'distinct', the
> CU link would cause them to effectively be so) remain separate - this means
> that inlined versions in one CU don't refer to an existing subprogram
> definition in another CU.
> >
> > To demonstrate:
> > inl.h:
> > void f1();
> > inline __attribute__((always_inline)) void f2() {
> >   f1();
> > }
> > inl1.cpp:
> > #include "inl.h"
> > void c1() {
> >   f2();
> > }
> > inl2.cpp:
> > #include "inl.h"
> > void c2() {
> >   f2();
> > }
> >
> > Compile to IR, llvm-link the result. The DWARF you get is basically the
> same as the DWARF you'd get without linking:
> >
> > DW_TAG_compile_unit
> >   DW_AT_name "inl1.cpp"
> >   DW_TAG_subprogram #0
> >     DW_AT_name "f2"
> >   DW_TAG_subprogram
> >     DW_AT_name "c1"
> >     DW_TAG_inlined_subroutine
> >       DW_AT_abstract_origin #0 "f2"
> > DW_TAG_compile_unit
> >   DW_AT_name "inl2.cpp"
> >   DW_TAG_subprogram #1
> >     DW_AT_name "f2"
> >   DW_TAG_subprogram
> >     DW_AT_name "c2"
> >     DW_TAG_inlined_subroutine
> >       DW_AT_abstract_origin #1 "f2"
> >
> > Instead of something more like this:
> >
> > DW_TAG_compile_unit
> >   DW_AT_name "inl1.cpp"
> >   DW_TAG_subprogram #0
> >     DW_AT_name "f2"
> >   DW_TAG_subprogram
> >     DW_AT_name "c1"
> >     DW_TAG_inlined_subroutine
> >       DW_AT_abstract_origin #0 "f2"
> > DW_TAG_compile_unit
> >   DW_AT_name "inl2.cpp"
> >   DW_TAG_subprogram
> >     DW_AT_name "c2"
> >     DW_TAG_inlined_subroutine
> >       DW_AT_abstract_origin #0 "f2"
> >
> > (note that only one abstract definition of f2 is produced here)
> >
> > Any thoughts? I imagine this is probably worth a reasonable amount of
> savings in an optimized build. Not huge, but not nothing. (probably not the
> top of anyone's list though, I realize)
>
> I’d need to see the IR metadata in both cases to really know how it worked
> before, but it seems that to be able to merge the two subprogram
> definitions when linking, we’d need a list of CUs per subprogram instead
> of a single one, right?
>
> > Should we remove the CU link from a non-internal linkage subprogram (&
> this may have an effect on the GV representation issue originally being
> discussed) and just emit it into whichever CU happens to need it first?
>
> Looks like this should work as well, but I don’t know enough about the way we
> emit DWARF...
>
>
> Trying a simple prototype, I think the only thing that went really screwy
> was: once we don't have that link, how do we pick which CU to put it in?
> Does it matter?
>
> So my really rough prototype (that just used the first CU in the CUMap for
> all subprograms) of course ended up with this:
>
> DW_TAG_compile_unit
>
>   DW_TAG_subprogram #0
>     DW_AT_name "f2"
>   DW_TAG_subprogram
>     DW_AT_name "c1"
>     DW_TAG_inlined_subroutine
>       DW_AT_abstract_origin #0
>   DW_TAG_subprogram
>     DW_AT_name "c2"
>     DW_TAG_inlined_subroutine
>       DW_AT_abstract_origin #0
> DW_TAG_compile_unit
>
> So I suppose we could use the unit when present - and make it present (&
> use distinct) when the function definition has external linkage. There's
> nothing we intend to deduplicate there & that means external linkage
> definitions will go in the right CU, and inline definitions just go
> wherever their first use is.
>
> Hmm - I guess that doesn't quite work for ThinLTO? Oh, maybe it does -
> it'd mean we'd have to drop the CU reference and un-distinct when we import.
>
> Hmm - that might simplify importing greatly - then we wouldn't actually
> need to import CUs at all?
>
> Don't know if that's a good thing.
>
> - Dave
>
>
>
> —
> Mehdi
>
>
-------------- next part --------------
inl-link.o: file format ELF64-x86-64
.debug_info contents:
0x00000000: Compile Unit: length = 0x00000060 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x00000064)
0x0000000b: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp] ( .debug_str[0x00000000] = "clang version 4.0.0 (trunk 289850) (llvm/trunk 289854)")
              DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000037] = "inl1.cpp")
              DW_AT_stmt_list [DW_FORM_sec_offset] (0x00000000)
              DW_AT_comp_dir [DW_FORM_strp] ( .debug_str[0x00000040] = "/tmp/dbginfo")
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
              DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
0x0000002a:   DW_TAG_subprogram [2]
                DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x00000056] = "_Z2f2v")
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000005d] = "f2")
                DW_AT_decl_file [DW_FORM_data1] ("/tmp/dbginfo/./inl.h")
                DW_AT_decl_line [DW_FORM_data1] (2)
                DW_AT_external [DW_FORM_flag_present] (true)
                DW_AT_inline [DW_FORM_data1] (DW_INL_inlined)
0x00000036:   DW_TAG_subprogram [3] *
                DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
                DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
                DW_AT_frame_base [DW_FORM_exprloc] (<0x1> 56 )
                DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x00000060] = "_Z2c1v")
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000067] = "c1")
                DW_AT_decl_file [DW_FORM_data1] ("/tmp/dbginfo/inl1.cpp")
                DW_AT_decl_line [DW_FORM_data1] (2)
                DW_AT_external [DW_FORM_flag_present] (true)
0x0000004f:     DW_TAG_inlined_subroutine [4]
                  DW_AT_abstract_origin [DW_FORM_ref4] (cu + 0x002a => {0x0000002a} "_Z2f2v")
                  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000004)
                  DW_AT_high_pc [DW_FORM_data4] (0x00000005)
                  DW_AT_call_file [DW_FORM_data1] ("/tmp/dbginfo/inl1.cpp")
                  DW_AT_call_line [DW_FORM_data1] (3)
0x00000062:     NULL
0x00000063:   NULL
0x00000064: Compile Unit: length = 0x00000054 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x000000bc)
0x0000006f: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp] ( .debug_str[0x00000000] = "clang version 4.0.0 (trunk 289850) (llvm/trunk 289854)")
              DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000004d] = "inl2.cpp")
              DW_AT_stmt_list [DW_FORM_sec_offset] (0x00000050)
              DW_AT_comp_dir [DW_FORM_strp] ( .debug_str[0x00000040] = "/tmp/dbginfo")
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010)
              DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
0x0000008e:   DW_TAG_subprogram [3] *
                DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010)
                DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
                DW_AT_frame_base [DW_FORM_exprloc] (<0x1> 56 )
                DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000006a] = "_Z2c2v")
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000071] = "c2")
                DW_AT_decl_file [DW_FORM_data1] ("/tmp/dbginfo/inl2.cpp")
                DW_AT_decl_line [DW_FORM_data1] (2)
                DW_AT_external [DW_FORM_flag_present] (true)
0x000000a7:     DW_TAG_inlined_subroutine [5]
                  DW_AT_abstract_origin [DW_FORM_ref_addr] (0x000000000000002a "_Z2f2v")
                  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000014)
                  DW_AT_high_pc [DW_FORM_data4] (0x00000005)
                  DW_AT_call_file [DW_FORM_data1] ("/tmp/dbginfo/inl2.cpp")
                  DW_AT_call_line [DW_FORM_data1] (3)
0x000000ba:     NULL
0x000000bb:   NULL
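A quick way to read the two DW_AT_abstract_origin entries in the dump above:
DW_FORM_ref4 is an offset relative to the start of the containing unit, while
DW_FORM_ref_addr is an offset from the start of .debug_info. The sketch below
just does that arithmetic with the offsets from this dump; RefForm,
resolveDieRef and bothCallSitesShareF2 are hypothetical names for
illustration, not part of any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// The two reference forms that appear in the dump (illustrative enum).
enum class RefForm { Ref4, RefAddr };

// Returns the .debug_info offset of the DIE a reference points at.
// Ref4 values are relative to the unit header offset; RefAddr values are
// already section-relative.
std::uint64_t resolveDieRef(RefForm form, std::uint64_t cuOffset,
                            std::uint64_t value) {
    return form == RefForm::Ref4 ? cuOffset + value : value;
}

// Checks that the inlined f2 in c1 (unit at 0x0, ref4 0x2a) and the inlined
// f2 in c2 (unit at 0x64, ref_addr 0x2a) name the same abstract subprogram.
bool bothCallSitesShareF2() {
    const std::uint64_t fromC1 = resolveDieRef(RefForm::Ref4, 0x00, 0x2a);
    const std::uint64_t fromC2 = resolveDieRef(RefForm::RefAddr, 0x64, 0x2a);
    return fromC1 == fromC2 && fromC1 == 0x2a;
}
```

Both references resolve to 0x0000002a, the single f2 DW_TAG_subprogram in the
inl1.cpp unit, so the second unit carries no f2 copy of its own.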
-------------- next part --------------
A non-text attachment was scrubbed...
Name: optional_cu_distinct_subprograms.diff
Type: text/x-patch
Size: 4704 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161215/2e605043/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inl1.ll
Type: application/octet-stream
Size: 2470 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161215/2e605043/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inl-link.ll
Type: application/octet-stream
Size: 3283 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161215/2e605043/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inl2.ll
Type: application/octet-stream
Size: 2470 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161215/2e605043/attachment-0005.obj>