[llvm-dev] [DebugInfo] Different representations of optimised-out variables in DWARF

via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 27 07:57:48 PST 2021



> -----Original Message-----
> From: Jeremy Morse <jeremy.morse.llvm at gmail.com>
> Sent: Wednesday, January 27, 2021 10:26 AM
> To: David Blaikie <dblaikie at gmail.com>; Robinson, Paul
> <paul.robinson at sony.com>; Tozer, Stephen <stephen.tozer at sony.com>;
> llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [DebugInfo] Different representations of optimised-out variables
> in DWARF
> 
> Hi,
> 
> This was "[llvm-dev] [DebugInfo] The current status of debug values
> using multiple machine locations" but I don't want to derail Stephen's
> thread.
> 
> Paul wrote:

Actually that was Stephen.  I don't know why Sony insists on dropping the
personal names from outgoing email (or maybe it's just Outlook365's fault).

> > I'm not actually sure what causes variables to be dropped from the DWARF
> > entirely, as opposed to them existing but having an unknown location for
> > their entire scope; however, outside of our desire to use dwarfdump to
> > analyze our debug info it's simply more efficient to omit variables with
> > no location, since they inflate the debug info size and I don't believe
> > there's any practical value in having them.
> 
> David wrote:
> > When does this ^ happen? In optimized builds we include all local
> > variables in a "variables" attachment to the DISubprogram, so we
> > shouldn't be losing variables entirely.
> > [...]
> > I think it's pretty important that we keep them. It helps a user
> > understand that they've not mistyped the name of a variable, etc [...]

I'm with David here; we shouldn't be dropping declared variables from
a scope just because they get optimized away.

> This is something that's bothered me for a while, as it messes with
> our statistics when changing how variable locations are tracked. Take
> this completely contrived C file:
> 
>     int foo(int bar) {
>       int baz = 12 + bar;
>       return baz;
>     }
> 
>     int qux(int quux) {
>       int xyzzy = foo(quux);
>       return xyzzy;
>     }
> 
> Using clang ef0dcb50630 and options "-O3 -g -c", llvm-locstats reports
> the object file has five variables in it. If you emit LLVM-IR, and
> replace the first operand of all "llvm.dbg.value" intrinsic
> invocations with "undef" and compile the IR with llc, then
> llvm-locstats still reports five variables. However: if you instead
> /delete/ all the invocations of "llvm.dbg.value", four variables are
> reported by llvm-locstats. This indicates there's an observable
> difference in the way we represent optimised-out variables in DWARF.
> 
> The difference between the object files is the way they represent the
> inlined copy of "foo". Here's the output with undef dbg.values,
> followed by the output when I delete all the dbg.value intrinsics:
> 
>      DW_TAG_inlined_subroutine
>        DW_AT_abstract_origin (0x0000004e "foo")
>        DW_AT_low_pc  (0x0000000000000010)
>        DW_AT_high_pc (0x0000000000000013)
>        DW_AT_call_file       ("/tmp/test.c")
>        DW_AT_call_line       (7)
>        DW_AT_call_column     (0x0f)
> 
>        DW_TAG_formal_parameter
>              DW_AT_abstract_origin       (0x0000005a "bar")
> 
>       NULL
> 
> and:
> 
>      DW_TAG_inlined_subroutine
>        DW_AT_abstract_origin (0x00000048 "foo")
>        DW_AT_low_pc  (0x0000000000000010)
>        DW_AT_high_pc (0x0000000000000013)
>        DW_AT_call_file       ("/tmp/test.c")
>        DW_AT_call_line       (7)
>        DW_AT_call_column     (0x0f)
> 
>      NULL
> 
> When there are dbg.value intrinsics present, then the inlined
> subroutine gets an empty DW_TAG_formal_parameter that links back to
> the abstract origin. If there are no dbg.value intrinsics present, it
> does not. As far as I understand it, consumers can still determine
> that "bar" exists in the inlined subroutine by looking at the inlined
> subroutine's abstract origin. This is what the "retained nodes"
> collection preserves.
> 
> llvm-locstats / llvm-dwarfdump --statistics should probably be taught
> to look at the inlined subroutine's abstract origin to find all
> variables; however, it seems unwise to have internal compiler state
> reflected in the output file in the way it is above. The cause of the
> empty DW_TAG_formal_parameter being created is in
> DwarfDebug::collectEntityInfo [0] -- it distinguishes between a
> variable that has no location intrinsics, and a variable that has only
> empty location intrinsics. Putting a filter in to skip variables with
> only empty locations avoids the difference in output, and reduces the
> size of .debug_info on a stage2reldeb clang build by about 20MB, or
> ~1%.
> 
> To ensure this email contains a question: would there be any
> objections to adding that filter, and obliging consumers to look in
> the inlined subroutine's abstract origin to find optimised-out
> variables, instead of giving them a list per-inlined-instance?

This is specifically about concrete (inlined) instances, it seems.
Rereading the description of concrete instance trees (DWARF 5, 
section 3.3.8.2) it explicitly permits omitting a useless entry 
(has only abstract_origin as an attribute, and no children) (p.85 
item 1).  I think it is legal to omit these, and I would go so far
as to say it's specifically legal to omit formal_parameter DIEs
with no attributes (other than abstract_origin).
--paulr
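
For readers who want the proposed filter spelled out, here is a minimal
sketch in C. It is a toy model, not LLVM code: the DebugValue and
InlinedVariable structs and all three functions are hypothetical names
invented for illustration, and the "current" predicate only mirrors the
behaviour Jeremy describes above (a concrete DIE appears whenever any
dbg.value record exists, even if every record is undef).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one dbg.value record.
   has_location == false models a dbg.value whose operand is undef. */
typedef struct {
    bool has_location;
} DebugValue;

/* Hypothetical stand-in for a variable inside one inlined instance. */
typedef struct {
    const char *name;
    const DebugValue *values; /* location records seen for the variable */
    size_t num_values;        /* 0 models "all dbg.values were deleted" */
} InlinedVariable;

/* True when at least one record carries a real (non-undef) location. */
bool has_any_location(const InlinedVariable *var) {
    assert(var != NULL);
    for (size_t i = 0; i < var->num_values; ++i) {
        if (var->values[i].has_location)
            return true;
    }
    return false;
}

/* Behaviour as described in the thread: any record at all, even an
   undef one, yields a concrete DIE with only DW_AT_abstract_origin. */
bool emits_concrete_die_current(const InlinedVariable *var) {
    assert(var != NULL);
    return var->num_values > 0;
}

/* Proposed filter: skip variables that have only empty locations, so
   "undef everywhere" and "no records" produce the same output. */
bool emits_concrete_die_filtered(const InlinedVariable *var) {
    return has_any_location(var);
}
```

With the filter, a variable with only undef records and a variable with
no records both return false, so the consumer would find "bar" only via
the abstract origin in either case.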

> 
> [0] https://github.com/llvm/llvm-project/blob/70e251497c4e26f8cfd85e745459afff97c909ce/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1779
> 
> --
> Thanks,
> Jeremy

