[cfe-dev] Design discussion re: DW_TAG_call_site support in clang

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Wed May 20 15:37:23 PDT 2020


On Wed, May 20, 2020 at 3:01 PM Adrian Prantl <aprantl at apple.com> wrote:

>
>
> On May 20, 2020, at 1:57 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Wed, May 20, 2020 at 9:17 AM Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>>
>> On May 19, 2020, at 3:03 PM, Vedant Kumar <vedant_kumar at apple.com> wrote:
>>
>> Hi,
>>
>> In a recent review (https://reviews.llvm.org/D79967), David Blaikie
>> suggested that we have a broader design discussion about how support for
>> DW_TAG_call_site is supported in clang, so I’ll kick off the discussion.
>>
>> Some topics to discuss:
>>
>> 1) Under LTO, if we emit a declaration DISubprogram for a function in one
>> TU, and another TU defines the function with __attribute__((nodebug)),
>> would the declaration DISubprogram get attached to the definition?
>>
>> My own thoughts on this: firstly, as posed in D79967
>> <https://reviews.llvm.org/D79967>, I think David was asking about the
>> case where the declaration subprogram does not have an attached unit. I’m
>> not sure why it wouldn’t, though, so I’d appreciate some clarification on
>> that. Second, as far as I know, there isn’t a mechanism for attaching a
>> declaration subprogram to a Function which doesn’t have debug info.
>>
>> 2) Should declaration subprograms emitted in support of DW_TAG_call_site
>> be kept in the CU’s retainedTypes field?
>>
>> David pointed out that these declarations a) persist through optimization
>> and b) don't get deduplicated against the definitions, which means there’s
>> potential to bloat debug info. As an alternative, we might attach a
>> declaration subprogram to the corresponding Function (which we do already),
>> but leave it out of the CU’s retainedTypes field.
>>
>>
>> I'm having trouble visualizing this example: We are talking about forward
>> declaration DISubprograms created for call sites of functions that are not
>> defined in the CU where the call site is, correct?
>>
>
> Right
>
>
>> What does "attaching a declaration to the corresponding function" mean?
>>
>
> I think "attaching the declaration DISubprogram to the corresponding
> llvm::Function" would be a more precise statement. The same way definition
> DISubprograms are attached to llvm::Functions, using the Function's
> !dbg/getSubprogram field.
>
>
>> Why are we currently entering those declarations into retainedTypes?
>>
>
> Because we're not currently attaching those DISubprogram declarations to
> the llvm::Function declarations - or any other part of the IR.
>
>
> I see.
>

Just to correct this (for other readers) - as stated later in my email,
this statement is incorrect - declaration DISubprograms are attached to
declaration llvm::Functions in addition to being added to retainedTypes.
And because of this, I have no answer to "why are we currently entering
those declarations into retainedTypes" - I don't see any reason to do so.


> That was the piece I was missing — it wasn't clear to me that there were
> llvm::Function forward declarations in addition to the DISubprograms, but
> of course, that is how external calls are represented in IR. Makes perfect
> sense now.
>
>
>
>> I would have expected the reference from the call site to hold on to them.
>>
>>
>
> That would be a 3rd option - think attaching them to the llvm::Function
> would /probably/ be better. (since the IR already has the infrastructure
> for that attachment, whereas the call instruction doesn't - it just has the
> DebugLoc attachment)
>
>
>> I guess it's unclear to me how call sites are represented at the moment.
>>
>
> Right, probably a good place to start.
>
> If the call is to a function defined in the same module - there's no IR
> representation for the call site - the CallInst refers the llvm::Function
> that in turn refers to the definition DISubprogram. So the DWARF emission
> code looks that up to create the call site description.
>
> But if it's a function that /isn't/ defined in the same module, the
> declaration DISubprogram is created and attached to the DICompileUnit's
> retainedTypes list to make sure it doesn't get dropped (because it's not
> referenced from anywhere else in the IR/metadata/debug info) and... it
> looks like I misunderstood.
>
> No, the declaration DISubprogram /is/ attached to the llvm::Function via
> its !dbg attachment, it's just the IR syntax wasn't what I was expecting:
>
> For a definition llvm::Function, the !dbg is rendered at the end of the
> line:
> define dso_local void @_Z2f2v() local_unnamed_addr #0 !dbg !11 {
>
> But for a declaration llvm::Function, the !dbg is rendered nearer the
> beginning:
> declare !dbg !4 dso_local void @_Z2f1v() local_unnamed_addr #1
>
> Quirky, but OK.
>
>
> What happens in an llvm-link when a forward declaration and a definition
> for the same function are imported from different llvm::Modules? Are the
> Function forward decl and its DISubprogram RAUWed with the respective
> definitions?
>

I believe so.

lto1.ll:

declare !dbg !4 dso_local void @_Z2f1v() local_unnamed_addr #1

...

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1,
producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0,
emissionKind: FullDebug, enums: !2, retainedTypes: !3, splitDebugInlining:
false, nameTableKind: None)

!1 = !DIFile(filename: "lto1.cpp", ...)

!3 = !{!4}

!4 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !1, file: !1,
line: 1, type: !5, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized,
retainedNodes: !2)

lto2.ll:

define dso_local void @_Z2f1v() local_unnamed_addr #0 !dbg !7 {
...

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1,
producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0,
emissionKind: FullDebug, enums: !2, splitDebugInlining: false,
nameTableKind: None)

!1 = !DIFile(filename: "lto2.cpp", ...)

!7 = distinct !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !1,
file: !1, line: 1, type: !8, scopeLine: 1, flags: DIFlagPrototyped |
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized,
unit: !0, retainedNodes: !2)

lto1.ll + lto2.ll:

define dso_local void @_Z2f1v() local_unnamed_addr #1 !dbg !18 {
...

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1,
producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0,
emissionKind: FullDebug, enums: !2, retainedTypes: !3, splitDebugInlining:
false, nameTableKind: None)

!1 = !DIFile(filename: "lto1.cpp", ...)

!3 = !{!4}

!4 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !1, file: !1,
line: 1, type: !5, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized,
retainedNodes: !2)

!7 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !8,
producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0,
emissionKind: FullDebug, enums: !2, splitDebugInlining: false,
nameTableKind: None)

!18 = distinct !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !8,
file: !8, line: 1, type: !5, scopeLine: 1, flags: DIFlagPrototyped |
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized,
unit: !7, retainedNodes: !2)


In the linked IR there's no other referenec to !4 other than the one via
the retainedTypes list - and so far as I can see, the only place that
iterates retainedTypes
seems to be here:
https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L882
- so I'm /pretty/ sure that's dead IR that nothing's looking at anymore.

>
>
> OK, so now my question becomes "why are these in the retainedTypes list at
> all"? Because it looks like they're attached in entirely the
> good/proper/appropriate way that will keep them when we need them, drop
> them when we don't, merge them in LTO as desired, etc. (except for them
> being in the retainedTypes list - which thwarts a bunch of that and doesn't
> seem to be providing any value that I can see - nothing iterates the
> retainedTypes list expecting to find them there/look for them there, so
> they're just kept alive for no use I can see)
>
>
>
>
>
>
>> -- adrian
>>
>>
>> My own thoughts: as I understand it, if a function ends up dead (all call
>> sites to it are optimized out), we won’t emit a DW_TAG_call_site that
>> references the function, and so we won't emit unnecessary declaration
>> subprograms. This is demonstrated in the LTO case in
>> llvm/test/DebugInfo/X86/lto-cross-cu-call-origin-ref.ll (added in
>> https://reviews.llvm.org/D70350), where the declaration subprograms
>> for func_from_b, noinline_func_in_a,
>> and always_inline_helper_in_a_that_calls_foo are elided. The same test
>> demonstrates that llvm can emit cross-CU references to definition
>> subprograms within DW_TAG_call_site. However, all of this just pertains to
>> the final DWARF. There may still be some cost (in terms of metadata) to
>> preserving a declaration subprogram throughout optimization when all
>> references to the declaration are removed. Before changing the
>> representation, it’d be helpful to measure how much it costs (either in
>> terms of compile-time, or metadata size) to keep unreferenced declaration
>> subprograms around (say, on a stage2 clang build).
>>
>> I'm not too interested in the cost, just the principle really - why are
> they tehre? (I'm ever more confused why they're there now that I see they
> are attached to the llvm::Function's !dbg, and that I can't find any code
> using their presence in RetainedTypes)
>
>>
>> As for why declaration subprograms are inserted into a CU’s retainedTypes
>> field in the first place: I don’t think this was intentionally changed just
>> for DW_TAG_call_site support. Adrian (cc’d) changed EmitFunctionDecl to
>> retain declaration subprograms in r266445, and the new code path which
>> calls into EmitFunctionDecl adopted the same behavior. (Hopefully that
>> clears up some of the history, that’s not to say that it’s not worth
>> revisiting.)
>>
>> Ah, OK, so thanks for pointing to the revision that introduced this.
>
> "commit e76bda544bbf52d9ff3b55e6018b494a1e6bbc00
> Author: Adrian Prantl <aprantl at apple.com>
> Date:   Fri Apr 15 15:55:45 2016 +0000
>
>     Update to match LLVM changes for PR27284.
>     (Reverse the ownership between DICompileUnit and DISubprogram.)
>
>     http://reviews.llvm.org/D19034
>     <rdar://problem/25256815>
>
>
> For reference, the title of that ticket was "[ThinLTO] Remove list of
> subprograms from DICompileUnit"
>
>
>     llvm-svn: 266445"
>
> Adrian - do you have any context on why this change ended up adding
> declaration DISubprogram's to the retained types list? (perhaps something
> discussed in the rdar?) It doesn't look like the clang part of this was
> reviewed.
>
> I think it's probably best to just remove it & see if anything breaks - I
> don't see anything that depends on it, at least at a cursory glance.
>
>
> I wouldn't be opposed to trying this. It looks like this may have been a
> transitional necessity and is no longer needed.
>
> -- adrian
>
>
> best,
>> vedant
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200520/35fffe23/attachment-0001.html>


More information about the cfe-dev mailing list