[llvm-dev] distinct DISubprograms hindering sharing inlined subprogram descriptions

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 15 12:06:48 PDT 2021


(came across this while looking for other things)

This is still an issue:

$ cat a.cpp

#include "a.h"

void f3() { f2(); }

$ cat b.cpp

#include "a.h"

void f3();

void f1() { }

int main() {

  f3();

  f2();

}

$ cat a.h

void f1();

struct t1 { int x; };

inline __attribute__((always_inline)) void f2(t1 = t1()) { f1(); }
$ $ clang++-tot -fuse-ld=lld -flto a.cpp b.cpp -g && llvm-dwarfdump-tot
a.out -debug-info | grep "DW_AT_name\|DW_TAG_\|DW_AT_type" | sed -e
"s/^............//"

DW_TAG_compile_unit

  DW_AT_name    ("a.cpp")

  DW_TAG_subprogram

    DW_AT_name  ("f2")

    DW_TAG_formal_parameter

      DW_AT_type        (0x0000003e "t1")

  DW_TAG_structure_type

    DW_AT_name  ("t1")

    DW_TAG_member

      DW_AT_name        ("x")

      DW_AT_type        (0x00000054 "int")
  ...

DW_TAG_compile_unit

  DW_AT_name    ("b.cpp")

  DW_TAG_subprogram

    DW_AT_name  ("f2")

    DW_TAG_formal_parameter

      DW_AT_type        (0x000000000000003e "t1")

  ...

The type (t1) gets deduplicated/shared across CUs (the DW_AT_type in b.cpp
uses FORM_ref_addr) but the f2 subprogram is duplicated - though b.cpp's
use of it could use ref_addr, and would if the function hadn't been inlined
away.

eg: if we move 'f2' to be defined in a.cpp but attributed always_inline, so
it does get cross-unit inlined due to LTO:

DW_TAG_compile_unit

  DW_AT_name    ("a.cpp")

  DW_TAG_subprogram

    DW_AT_name  ("f2")

    DW_TAG_formal_parameter

      DW_AT_type        (0x0000003e "t1")

  DW_TAG_structure_type

    DW_AT_name  ("t1")

    DW_TAG_member

      DW_AT_name        ("x")

      DW_AT_type        (0x00000054 "int")

  DW_TAG_base_type

    DW_AT_name  ("int")

  DW_TAG_subprogram

    DW_AT_name  ("f3")

    DW_TAG_inlined_subroutine

      DW_AT_abstract_origin     (0x0000002a "_Z2f22t1")

      DW_TAG_formal_parameter

        DW_AT_abstract_origin   (0x00000036)

DW_TAG_compile_unit

  DW_AT_name    ("b.cpp")

  DW_TAG_subprogram

    DW_AT_name  ("f1")

  DW_TAG_subprogram

    DW_AT_name  ("main")

    DW_AT_type  (0x0000000000000054 "int")

    DW_TAG_inlined_subroutine

      DW_AT_abstract_origin     (0x000000000000002a "_Z2f22t1")

So now because the cross-CU inlining happened during LTO after module
merging - the merging happens when the function's not inlined, module merge
picks one llvm::Function, dropping the other one and dropping the other
DISubprogram - and then the inlining kicks in, referencing just that single
DIsubprogram.

Not sure if it's of interest to anyone/a priority.

(side note: Anyone interested in making ThinLTO home type definitions?
That'd be great to reduce type duplication - would mean ThinLTO could turn
off type units and/or that .o/.dwo/etc files would just generally be
smaller anyway)

On Sat, Dec 24, 2016 at 10:18 AM David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Fri, Dec 23, 2016 at 7:25 PM Duncan Exon Smith <dexonsmith at apple.com>
> wrote:
>
>>
>>
>> On Dec 23, 2016, at 18:36, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Fri, Dec 23, 2016 at 11:47 AM Duncan P. N. Exon Smith <
>> dexonsmith at apple.com> wrote:
>>
>>> A few disjoint thoughts; sorry they're so delayed (I skimmed the
>>> responses below, and I think these are still relevant/not covered
>>> elsewhere).
>>>
>>> Firstly, why *should* DISubprogram definitions be distinct?  There were
>>> two reasons this was valuable (this was from before there was a cu: link).
>>>
>>> - It helped to fix long-standing bugs in the IRLinker, where
>>> uniqued-DISubprograms in different compile units would coalesce.
>>>
>>
>> Any idea why that ^ was a problem/bugs?
>>
>>
>>>  Now that the Function <-> DISubprogram connection has been reversed,
>>> I'm not entirely sure it's still a necessary precaution.
>>>
>>> - It broke a several uniquing cycles involving DISubprogram.  The
>>> existence of the uniquing cycles meant that affected DISubprograms weren't
>>> really uniqued, anyway (since we don't bother to solve graph isomorphism);
>>> and breaking the cycles sped up the IRLinker and reduced memory waste.
>>>
>>
>> Ah, fair point - my trivial example didn't have any local variables, so
>> didn't hit this problem. They do indeed form cycles and I haven't tested,
>> but seems totally reasonable/likely that that would break my simple example
>> because cycles break uniquing and all that.
>>
>>
>>>  Making DISubprogram uniqued again would either have no real effect
>>> (because distinct nodes they reference aren't uniqued anymore either) or
>>> bring back uniquing cycles (which would have effects... but not the desired
>>> one).
>>>
>>> But there ought to be a way to get coalescing in the right places.  We
>>> just need to ensure that the uniqued/coalesced parts of the graph:
>>> - are acyclic, and
>>> - reference only coalesced nodes (note that leaves can happily include
>>> the strangely-coalesced distinct DICompositeType beast I created).
>>>
>>
>> Perhaps before I dive into your suggestion (which, it seems, is to
>> separate out the definition but let the declaration/locals form cycles
>>
>>
>> No (new) cycles; in fact this breaks some.
>> - the Function itself references the defn and the locals;
>> - the locals and the defn reference (transitively) the decl; and
>> - the decl doesn't reference any of those.
>>
>> The only cycle is the decl <-> composite type one that already exists
>> (and that odr magic (mostly) fixes).  Maybe that one could be removed
>> somehow too, but it isn't directly relevant to the problem you're looking
>> at.
>>
>
> Oh, yeah - I've an aside there too, maybe for another thread. (punchy
> summary: anonymous enums in headers used for constants are technically ODR
> violations (well, if they're ever used in an ODR entity - like an inline
> function) - they're not public types, so we don't put the mangled name on
> them & don't deduplicate them in DWARF type units - and probably similarly
> don't deduplicate them in the IR... - not sure what the right answer is
> there)
>
>
>> (Note that this is somewhat inspired by Reid's point in his recent debug
>> info talk, that CodeView avoids type cycles by breaking types into two
>> records.)
>>
>
> Ah, thanks, that helps with me picture it!
>
>
>>
>> - that would at least be dropped because the definition would only be
>> kept due to it being referenced from the llvm::Function), what about
>> something simpler:
>>
>> What if scope chains didn't go all the way to the top (the subprogram)
>> but stopped short of that?
>>
>> Now, granted - this was part of the great assertion I added several years
>> back that caught all kinds of silliness - mostly around inlined call sites
>> not having debugloc and leading to scope chains that lead to distinct
>> roots. But the alternative to having and maintaining an invariant - is just
>> to make things true by construction. All debug info scope chains within a
>> function with an associated DISubprogram have an implicit last element of
>> that DISubprogram?
>>
>>
>> That seems reasonable, I think, now that the function points directly at
>> its subprogram.
>>
>> I'm not sure it's simpler to get right though.  Dropping the scope runs
>> the risk of coalescing things incorrectly.  I imagine the backend relies on
>> pointer distinctness of two local variables called "a" that are in (or are
>> parameters of) unrelated subprograms.  I think in some situations they
>> could be coalesced incorrectly if they had no scope.
>>
>
> Just no top level scope - so it'd only happen if you had two variables
> with the same name in the same scope. It might break in debug-having inline
> functions inlined into non-debug-having functions (which then might in turn
> be inlined into a debug-having function) - would have to check.
>
> Also, not sure if inlined situations might introduce circularity. Yep. The
> scope for inlinedAt DILocations point to the DISubprogram - so that'd
> create a cycle. Ah well.
>
>
>>
>> That would mean it'd be harder to catch the bugs I found
>>
>>
>> A shame if so, but if we can find another way to catch it or design it
>> away, as you mention, it doesn't much matter.
>>
>> - but I think most of them came from that specific inlining situation (I
>> can go back and do some archaeology and see if there's other exposure where
>> we could sure up some defenses as well) which can be caught more directly
>> (I forget if the inliner now catches it itself, or if we enhanced the debug
>> info verifier to catch it after the inliner runs - if it's the verifier,
>> then that wouldn't suffice if we made the change I'm suggesting - and we'd
>> have to hook it up in the inliner itself)
>>
>> Thoughts?
>>
>>
>> I've been thinking idly whether we can/should break DISubprogram apart
>>> into uniqued (coalesce-able) and distinct parts.  Still being someone
>>> ignorant of DWARF itself (and CodeView, for that matter), I'm not entirely
>>> sure.  But...
>>>
>>> Here's my understanding of the current uses of DISubprogram (where I'm
>>> using DISubprogramDecl for the usually uniqued declarations, and
>>> DISubprogramDefn for the always distinct definitions):
>>>
>>> // !dbg
>>> Function -> DISubprogram
>>>
>>> // Declaration (optional, only for member functions).
>>> DISubprogramDefn -> DISubprogramDecl
>>>
>>> // Variables:
>>> DISubprogramDefn -> MDNode -> DILocalVariable
>>>
>>> // Terminator for scope chains:
>>> DILocalVariable [-> DILocalScope [-> ...]] -> DISubprogramDefn
>>> DILocation [-> DILocalScope [-> ...]] -> DISubprogramDefn
>>> DICompositeType -> DISubprogramDefn // function-local type
>>>
>>> // Elements (member functions):
>>> DICompositeType -> MDNode -> DISubprogramDecl
>>> DISubprogramDecl -> DICompositeType
>>> DISubprogramDefn -> DICompositeType
>>>
>>>
>>> And here are the changes I would consider:
>>>
>>> 1. Split DISubprogramDecl and DISubprogramDefn into different types.
>>>
>>> 2. Remove definition-specific fields from DISubprogramDecl.
>>> - variables, a "dangerous" source of uniquing cycles.
>>> - declaration, which it will never have.
>>> - scopeLine, which it will never have.
>>> - unit, which the Defn will have when it's relevant.
>>>
>>> 3. Make a DISubprogramDecl mandatory for DISubprogramDefn.
>>>
>>> 4. Remove redundant fields from DISubprogramDefn.
>>> - name and linkageName are on the declaration.
>>> - type is redundant.
>>> - flags is likely redundant?
>>>
>>> Nothing so far has made any real changes, I think.  DISubprogramDecl
>>> should coalesce nicely between modules though.
>>>
>>> 5. Migrate links to point at DISubprogramDecl instead of
>>> DISubprogramDefn.
>>> - DILocalVariable scope chain.
>>> - DILocation scope chain.
>>> - DICompositeType scope.
>>>
>>> I don't think this will create any uniquing cycles.  DISubprogramDecl
>>> won't point at any of these.  Also, DILocalVariable should start coalescing
>>> with each other.
>>>
>>> Since DISubprogramDefn is only referenced from the Function !dbg
>>> attachment, only one should survive function-coalescing.
>>>
>>> One open question, since I don't know the backend code, is: can you emit
>>> the "correct" DWARF after step (5)?  I'm hoping you only need the Defn of
>>> the Function that has instructions referencing the
>>> DILocalVariable/DILocation.  In which case, we suddenly get a lot of
>>> coalescing.
>>>
>>> On 2016-Dec-15, at 10:54, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>> Branching off from a discussion of improvements to DIGlobalVariable
>>> representations that Adrian's working on - got me thinking about related
>>> changes that have already been made to DISubprogram.
>>>
>>> To reduce duplicate debug info when things like linkonce_odr functions
>>> were deduplicated in LTO linking, the relationship between a CU and
>>> DISubprogram was inverted (instead of a CU maintaining a list
>>> of subprograms, subprograms specify which CU they come from - and the
>>> llvm::Function references the DISubprogram, so if the llvm::Function goes
>>> away, so does the associated DISubprogram)
>>>
>>> I'm not sure if this caused a regression, but at least seems to miss a
>>> possible improvement:
>>>
>>> During IR linking (for LTO, ThinLTO, etc) these distinct DISubprogram
>>> definitions (& their CU link, even if they weren't marked 'distinct', the
>>> CU link would cause them to effectively be so) remain separate - this means
>>> that inlined versions in one CU don't refer to an existing subprogram
>>> definition in another CU.
>>>
>>> To demonstrate:
>>> inl.h:
>>> void f1();
>>> inline __attribute__((always_inline)) void f2() {
>>>   f1();
>>> }
>>> inl1.cpp:
>>> #include "inl.h"
>>> void c1() {
>>>   f2();
>>> }
>>> inl2.cpp:
>>> #include "inl.h"
>>> void c2() {
>>>   f2();
>>> }
>>>
>>> Compile to IR, llvm-link the result. The DWARF you get is basically the
>>> same as the DWARF you'd get without linking:
>>>
>>> DW_TAG_compile_unit
>>>   DW_AT_name "inl1.cpp"
>>>   DW_TAG_subprogram #0
>>>     DW_AT_name "f2"
>>>   DW_TAG_subprogram
>>>     DW_AT_name "c1"
>>>     DW_TAG_inlined_subroutine
>>>       DW_TAG_abstract_origin #0 "f2"
>>> DW_TAG_compile_unit
>>>   DW_AT_name "inl2.cpp"
>>>   DW_TAG_subprogram #1
>>>     DW_AT_name "f2"
>>>   DW_TAG_subprogram
>>>     DW_AT_name "c2"
>>>     DW_TAG_inlined_subroutine
>>>       DW_TAG_abstract_origin #1 "f2"
>>>
>>> Instead of something more like this:
>>>
>>> DW_TAG_compile_unit
>>>   DW_AT_name "inl1.cpp"
>>>   DW_TAG_subprogram #0
>>>     DW_AT_name "f2"
>>>   DW_TAG_subprogram
>>>     DW_AT_name "c1"
>>>     DW_TAG_inlined_subroutine
>>>       DW_TAG_abstract_origin #0 "f2"
>>> DW_TAG_compile_unit
>>>   DW_AT_name "inl2.cpp"
>>>   DW_TAG_subprogram
>>>     DW_AT_name "c2"
>>>     DW_TAG_inlined_subroutine
>>>       DW_TAG_abstract_origin #0 "f2"
>>>
>>> (note that only one abstract definition of f2 is produced here)
>>>
>>> Any thoughts? I imagine this is probably worth a reasonable amount of
>>> savings in an optimized build. Not huge, but not nothing. (probably not the
>>> top of anyone's list though, I realize)
>>>
>>> Should we remove the CU link from a non-internal linkage subprogram (&
>>> this may have an effect on the GV representation issue originally being
>>> discussed) and just emit it into whichever CU happens to need it first?
>>>
>>> This might be slightly sub-optimal, due to, say, the namespace being
>>> foreign to that CU. But it's how we do types currently, I think? So at
>>> least it'd be consistent and probably cheap enough/better.
>>>
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211015/97d747cc/attachment.html>


More information about the llvm-dev mailing list