[llvm-dev] [DebugInfo] Bugs when emitting debug info with inlined functions

Thu Oct 14 14:13:44 PDT 2021

Could be worth a shot - I think the attribute removal would be more
difficult due to abbreviations, etc (maybe the same is true of children,
when setting the CHILDREN property of the abbrev, I'm not sure) - worth a
go, at least. Can write down why it's not feasible if you/we discover some
reason that's infeasible.

On Thu, Oct 14, 2021 at 2:12 PM Ellis Hoag <ellis.sparky.hoag at gmail.com>
wrote:

> /maybe/ though I'm not sure the DIE APIs support removing children, do
>> they?
>>
>
> No the API doesn't support removing children or changing the parent node.
> I don't see anything obvious reason why we can't add this API, but it might
> be too disruptive for what we are trying to accomplish.
>
> Ellis
>
> On Thu, Oct 14, 2021 at 11:49 AM David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Thu, Oct 14, 2021 at 10:30 AM Ellis Hoag <ellis.sparky.hoag at gmail.com>
>> wrote:
>>
>>> For these other cases (variables, imported declarations, maybe function
>>>> local types too?) - I think, at least consistent with the current design,
>>>> we have to defer producing them until we get to the end - the same as for
>>>> the name/decl file/line attributes, etc.
>>>
>>>
>>> Instead of deferring those DIEs, I wonder if we can do some kind of
>>> fixup. At the end of the module, we can search through the SPs with
>>> abstract origins to see if they have a concrete DIE with children
>>> (variables, types, imported declarations). Then move those DIEs to the
>>> abstract origin. This would be a simple solution because we can just assume
>>> the concrete function will exist and never be inlined, but at the end we
>>> fix the inlined cases. I suspect this solution won't work for the same
>>> reason we don't want to create DIEs without parents, but let me know what
>>> you think of this.
>>>
>>
>> /maybe/ though I'm not sure the DIE APIs support removing children, do
>> they?
>>
>>
>>> Function local types /probably/ have a similar problem. Maybe call_site
>>>> tags too? (since they reference function declarations to say which function
>>>> the call_site is calling) Perhaps some other stuff - but those are ones
>>>> that come to my mind at least.
>>>>
>>>
>>> I just tested function local types and they do have the same problem.
>>>
>>> __attribute__((always_inline)) inline
>>> int foo() { struct S {int a;}; S s; return s.a; }
>>> int bar() { return 1 + foo(); }
>>>
>>
>> Thanks, good to confirm!
>>
>>
>>>
>>> I don't think call site tags have the problem because if they exist then
>>> the function was not inlined in that location so there must be a concrete
>>> definition.
>>>
>>
>> Fair point!
>>
>> - Dave
>>
>>
>>>
>>>
>>> Thanks, Ellis
>>>
>>> On Wed, Oct 13, 2021 at 8:29 PM David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> On Wed, Oct 13, 2021 at 4:20 PM Ellis Hoag via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> In recent weeks I've been looking into fixing a few bugs in the Dwarf
>>>>> emitted by LLVM related to when functions are inlined.
>>>>> * https://bugs.llvm.org/show_bug.cgi?id=30637 (static variables)
>>>>>   * Fixed in https://reviews.llvm.org/D108492 (in review)
>>>>> * https://bugs.llvm.org/show_bug.cgi?id=52159 (imported declaration)
>>>>>   * Fixed in https://reviews.llvm.org/D110294 (in review)
>>>>> In these bugs `getOrCreateSubprogramDIE()` is used to find the SP DIE
>>>>> that will become the parent of a GV DIE or the reference of an imported
>>>>> declaration DIE. The problem is if the function is inlined and removed, the
>>>>> concrete SP DIE will not be created and `getOrCreateSubprogramDIE()` will
>>>>> return an empty DIE. Instead, I think we should be using the abstract
>>>>> origin DIE of the SP which is created after processing an inlined scope.
>>>>>
>>>>> Here is a concrete example from
>>>>> https://bugs.llvm.org/show_bug.cgi?id=52159
>>>>> ```
>>>>> namespace ns {
>>>>> inline __attribute__((always_inline))
>>>>> void foo() { int a = 4; }
>>>>> }
>>>>>
>>>>> void goo() {
>>>>>   using ns::foo;
>>>>>   foo();
>>>>> }
>>>>> ```
>>>>>
>>>>> This produces an imported declaration DIE that references an empty SP
>>>>> DIE even though there already exists one for foo.
>>>>> ```
>>>>> // Abstract origin
>>>>> 0x0000002f:     DW_TAG_subprogram
>>>>>                   DW_AT_linkage_name ("_ZN2ns3fooEv")
>>>>>                   DW_AT_name ("foo")
>>>>> // Empty concrete
>>>>> 0x00000047:     DW_TAG_subprogram
>>>>> 0x00000048:     NULL
>>>>> // Import declaration
>>>>> 0x00000069:     DW_TAG_imported_declaration
>>>>>                   DW_AT_import (0x00000047)
>>>>> ```
>>>>>
>>>>> One fix is to reference the abstract origin DIE.
>>>>> ```
>>>>> 0x00000069:     DW_TAG_imported_declaration
>>>>>                   DW_AT_import (0x0000002f)
>>>>> ```
>>>>>
>>>>> Another fix is to do what gcc does and fill out a specification DIE
>>>>> that the import references.
>>>>> ```
>>>>> // Import declaration
>>>>> 0x0000004f:     DW_TAG_imported_declaration
>>>>>                   DW_AT_import (0x00000063)
>>>>> // Specification
>>>>> 0x00000063:     DW_TAG_subprogram
>>>>>                   DW_AT_name ("foo")
>>>>>                   DW_AT_declaration (true)
>>>>> // Abstract origin
>>>>> 0x00000070:   DW_TAG_subprogram
>>>>>                 DW_AT_specification (0x00000063 "_ZN2ns3fooEv")
>>>>> ```
>>>>>
>>>>> Since I'm relatively new to debug info, I thought I'd ask some
>>>>> questions on the mailing list.
>>>>> 1. A simple solution to this class of bugs is to reference the
>>>>> abstract origin SP DIE where appropriate. Do we have any rules for when to
>>>>> reference the abstract origin if it exists?
>>>>>
>>>>
>>>> That's the difficult problem here - the abstract origin only exists if
>>>> there's at least one instance of the function being inlined - and we don't
>>>> know that until we've processed all the functions. The way this was/is
>>>> currently implemented is to create the concrete definition subprogram when
>>>> we see the concrete function definition, but not fill out the attributes
>>>> that would be inherited from an abstract origin if there was one - then
>>>> wait until we get to the end of the module and if we've created an abstract
>>>> origin (because an inlined instance was found/produced), then we add the
>>>> abstract_origin attribute to the concrete definition subprogram, and if we
>>>> haven't created an abstract origin (because there were no inlined
>>>> instances) then we put the attributes (name, decl file/line/etc) on the
>>>> concrete definition without creating an abstract origin.
>>>>
>>>> For these other cases (variables, imported declarations, maybe function
>>>> local types too?) - I think, at least consistent with the current design,
>>>> we have to defer producing them until we get to the end - the same as for
>>>> the name/decl file/line attributes, etc.
>>>>
>>>> Though deferring them presents other challenges - we can't reference
>>>> them if they're deferred, so if some other debug info like the type of a
>>>> function parameter has to be produced, how do we produce that if we have
>>>> deferred the type production?
>>>>
>>>> We can't/shouldn't/it's difficult to create the DIEs but to defer
>>>> attaching those DIEs to the DIE tree - because there's some logic that uses
>>>> the DIE parent chain/which CU a DIE is in to determine certain issues of
>>>> encoding (whether CU-local references can be used or the like). Maybe we
>>>> can get rid of that constraint (at which point we could create DIEs but
>>>> defer adding them to the DIE tree) but if I recall correctly it's pretty
>>>> deeply embedded/important.
>>>>
>>>>
>>>>> 2. Would it be better to follow gcc's solution and fill out
>>>>> specifications in these cases?
>>>>>
>>>>
>>>> I guess that doesn't address the function-local static variable case,
>>>> or the case of an imported declaration being inside a subprogram? (we could
>>>> in theory put those in a declaration DIE, but it'd probably be extra
>>>> confusing to consumers)
>>>>
>>>>
>>>>> 3. Are they more known cases/bugs to consider?
>>>>>
>>>>
>>>> Function local types /probably/ have a similar problem. Maybe call_site
>>>> tags too? (since they reference function declarations to say which function
>>>> the call_site is calling) Perhaps some other stuff - but those are ones
>>>> that come to my mind at least.
>>>>
>>>> Thanks for looking into this - sorry it's all a bit thorny, though.
>>>>
>>>> - Dave
>>>>
>>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211014/fcd769aa/attachment.html>