[llvm-dev] distinct DISubprograms hindering sharing inlined subprogram descriptions

Thu Dec 15 11:26:15 PST 2016

Trying to wrap my brain around this, so a few questions below. =)

On Thu, Dec 15, 2016 at 10:54 AM, David Blaikie <dblaikie at gmail.com> wrote:

> Branching off from a discussion of improvements to DIGlobalVariable
> representations that Adrian's working on - got me thinking about related
> changes that have already been made to DISubprogram.
>
> To reduce duplicate debug info when things like linkonce_odr functions
> were deduplicated in LTO linking, the relationship between a CU and
> DISubprogram was inverted (instead of a CU maintaining a list of
> subprograms, subprograms specify which CU they come from - and the
> llvm::Function references the DISubprogram, so if the llvm::Function goes
> away, so does the associated DISubprogram)
>
> I'm not sure if this caused a regression, but at least seems to miss a
> possible improvement:
>
> During IR linking (for LTO, ThinLTO, etc) these distinct DISubprogram
> definitions (& their CU link, even if they weren't marked 'distinct', the
> CU link would cause them to effectively be so) remain separate - this means
> that inlined versions in one CU don't refer to an existing subprogram
> definition in another CU.
>
> To demonstrate:
> inl.h:
> void f1();
> inline __attribute__((always_inline)) void f2() {
>   f1();
> }
> inl1.cpp:
> #include "inl.h"
> void c1() {
>   f2();
> }
> inl2.cpp:
> #include "inl.h"
> void c2() {
>   f2();
> }
>
> Compile to IR, llvm-link the result. The DWARF you get is basically the
> same as the DWARF you'd get without linking:
>
> DW_TAG_compile_unit
>   DW_AT_name "inl1.cpp"
>   DW_TAG_subprogram #0
>     DW_AT_name "f2"
>   DW_TAG_subprogram
>     DW_AT_name "c1"
>     DW_TAG_inlined_subroutine
>       DW_TAG_abstract_origin #0 "f2"
> DW_TAG_compile_unit
>   DW_AT_name "inl2.cpp"
>   DW_TAG_subprogram #1
>     DW_AT_name "f2"
>   DW_TAG_subprogram
>     DW_AT_name "c2"
>     DW_TAG_inlined_subroutine
>       DW_TAG_abstract_origin #1 "f2"
>
> Instead of something more like this:
>
> DW_TAG_compile_unit
>   DW_AT_name "inl1.cpp"
>   DW_TAG_subprogram #0
>     DW_AT_name "f2"
>   DW_TAG_subprogram
>     DW_AT_name "c1"
>     DW_TAG_inlined_subroutine
>       DW_TAG_abstract_origin #0 "f2"
> DW_TAG_compile_unit
>   DW_AT_name "inl2.cpp"
>   DW_TAG_subprogram
>     DW_AT_name "c2"
>     DW_TAG_inlined_subroutine
>       DW_TAG_abstract_origin #0 "f2"
>
> (note that only one abstract definition of f2 is produced here)
>

I think I understand what you are saying. Essentially, having the SP->CU
link allows the SP to be deduplicated when multiple *outline* copies of the
corresponding function are deduplicated. But not when the multiple copies
are inlined, as it looks like we need all the copies, right?

>
> Any thoughts? I imagine this is probably worth a reasonable amount of
> savings in an optimized build. Not huge, but not nothing. (probably not the
> top of anyone's list though, I realize)
>
> Should we remove the CU link from a non-internal linkage subprogram (&
> this may have an effect on the GV representation issue originally being
> discussed) and just emit it into whichever CU happens to need it first?
>

I can see how this would be done in LTO where the compiler has full
visibility. For ThinLTO presumably we would need to do some index-based
marking? Can we at least do something when we import an inlined SP and drop
it since we know it is defined elsewhere?

Thanks,
Teresa

>
> This might be slightly sub-optimal, due to, say, the namespace being
> foreign to that CU. But it's how we do types currently, I think? So at
> least it'd be consistent and probably cheap enough/better.
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161215/c0ff6ab6/attachment.html>