[llvm-dev] [DebugInfo] Different representations of optimised-out variables in DWARF

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 27 12:40:04 PST 2021


I'm a bit confused by some of the stuff in this thread, but rather than
trying to puzzle all of that out, it might be simpler/more agreeable for me
to say this:

We should never produce DWARF like this:

      DW_TAG_formal_parameter
             DW_AT_abstract_origin       (0x0000005a "bar")

(as Paul quoted from the DWARF spec, this should not be necessary and just
seems like wasted bytes)

And locstats/dwarfdump statistics should not produce different results if
they read DWARF like that compared to DWARF missing this inlined instance
entirely.
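
To make the second point concrete, here is a minimal sketch in C of a
counting routine that gives the same answer for both representations. It is
not the llvm-dwarfdump implementation: the struct die model, the tag enum and
the function names are all hypothetical, and it assumes the abstract "foo"
lists both "bar" and "baz" as children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified DIE model for illustration only. */
enum die_tag {
    TAG_SUBPROGRAM,
    TAG_INLINED_SUBROUTINE,
    TAG_FORMAL_PARAMETER,
    TAG_VARIABLE
};

struct die {
    enum die_tag tag;
    const struct die *abstract_origin; /* NULL when the attribute is absent */
    const struct die *children;        /* array of child DIEs, may be NULL */
    size_t num_children;
};

/* Count the variables of an inlined instance from its abstract origin,
 * ignoring which concrete children happen to be present. */
size_t count_inlined_variables(const struct die *inlined)
{
    assert(inlined != NULL && inlined->tag == TAG_INLINED_SUBROUTINE);
    const struct die *origin = inlined->abstract_origin;
    size_t count = 0;
    if (origin == NULL)
        return 0;
    for (size_t i = 0; i < origin->num_children; i++) {
        enum die_tag t = origin->children[i].tag;
        if (t == TAG_FORMAL_PARAMETER || t == TAG_VARIABLE)
            count++;
    }
    return count;
}

/* Build both representations of an inlined "foo" and report whether
 * they yield the same variable count (assumed layout, see lead-in). */
int both_representations_agree(void)
{
    const struct die foo_vars[] = {
        { TAG_FORMAL_PARAMETER, NULL, NULL, 0 }, /* "bar" */
        { TAG_VARIABLE, NULL, NULL, 0 },         /* "baz" */
    };
    const struct die foo_abstract = { TAG_SUBPROGRAM, NULL, foo_vars, 2 };

    /* Concrete "bar" carrying only DW_AT_abstract_origin, as in the
     * snippet above. */
    const struct die empty_bar =
        { TAG_FORMAL_PARAMETER, &foo_vars[0], NULL, 0 };
    const struct die with_empty_param =
        { TAG_INLINED_SUBROUTINE, &foo_abstract, &empty_bar, 1 };
    const struct die without_param =
        { TAG_INLINED_SUBROUTINE, &foo_abstract, NULL, 0 };

    return count_inlined_variables(&with_empty_param) ==
           count_inlined_variables(&without_param);
}
```

Because the count is taken from the abstract origin, the empty concrete
DW_TAG_formal_parameter has no effect on the result in this toy model.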

On Wed, Jan 27, 2021 at 7:26 AM Jeremy Morse <jeremy.morse.llvm at gmail.com>
wrote:

> Hi,
>
> This was "[llvm-dev] [DebugInfo] The current status of debug values
> using multiple machine locations" but I don't want to de-rail Stephen's
> thread.
>
> Paul wrote:
> > I'm not actually sure what causes variables to be dropped from the DWARF
> > entirely, as opposed to them existing but having an unknown location for
> > their entire scope; however, outside of our desire to use dwarfdump to
> > analyze our debug info it's simply more efficient to omit variables with
> > no location, since they inflate the debug info size and I don't believe
> > there's any practical value in having them.
>
> David wrote:
> > When does this ^ happen? In optimized builds we include all local
> > variables in a "variables" attachment to the DISubprogram, so we
> > shouldn't be losing variables entirely.
> > [...]
> > I think it's pretty important that we keep them. It helps a user
> > understand that they've not mistyped the name of a variable, etc [...]
>
> This is something that's bothered me for a while, as it messes with
> our statistics when changing how variable locations are tracked. Take
> this completely contrived C file:
>
>     int foo(int bar) {
>       int baz = 12 + bar;
>       return baz;
>     }
>
>     int qux(int quux) {
>       int xyzzy = foo(quux);
>       return xyzzy;
>     }
>
> Using clang ef0dcb50630 and options "-O3 -g -c", llvm-locstats reports
> the object file has five variables in it. If you emit LLVM-IR, and
> replace the first operand of all "llvm.dbg.value" intrinsic
> invocations with "undef" and compile the IR with llc, then
> llvm-locstats still reports five variables. However: if you instead
> /delete/ all the invocations of "llvm.dbg.value", four variables are
> reported by llvm-locstats. This indicates there's an observable
> difference in the way we represent optimised-out variables in DWARF.
>
> The difference between the object files is the way they represent the
> inlined copy of "foo". Here's the output with undef dbg.values,
> followed by the output when I delete all the dbg.value intrinsics:
>
>      DW_TAG_inlined_subroutine
>        DW_AT_abstract_origin (0x0000004e "foo")
>        DW_AT_low_pc  (0x0000000000000010)
>        DW_AT_high_pc (0x0000000000000013)
>        DW_AT_call_file       ("/tmp/test.c")
>        DW_AT_call_line       (7)
>        DW_AT_call_column     (0x0f)
>
>        DW_TAG_formal_parameter
>              DW_AT_abstract_origin       (0x0000005a "bar")
>
>       NULL
>
> and:
>
>      DW_TAG_inlined_subroutine
>        DW_AT_abstract_origin (0x00000048 "foo")
>        DW_AT_low_pc  (0x0000000000000010)
>        DW_AT_high_pc (0x0000000000000013)
>        DW_AT_call_file       ("/tmp/test.c")
>        DW_AT_call_line       (7)
>        DW_AT_call_column     (0x0f)
>
>      NULL
>
> When there are dbg.value intrinsics present, then the inlined
> subroutine gets an empty DW_TAG_formal_parameter that links back to
> the abstract origin. If there are no dbg.value intrinsics present, it
> does not. As far as I understand it, consumers can still determine
> that "bar" exists in the inlined subroutine by looking at the inlined
> subroutine's abstract origin. This is what the "retained nodes"
> collection preserves.
>
> llvm-locstats / llvm-dwarfdump --statistics should probably be taught
> to look at the inlined subroutine's abstract origin to find all
> variables; however, it seems unwise to have internal compiler state
> reflected in the output file in the way it is above. The empty
> DW_TAG_formal_parameter is created in
> DwarfDebug::collectEntityInfo [0], which distinguishes between a
> variable that has no location intrinsics and a variable that has only
> empty location intrinsics. Putting a filter in to skip variables with
> only empty locations avoids the difference in output, and reduces the
> size of .debug_info on a stage2reldeb clang build by about 20Mb, or
> ~1%.
>
> To ensure this email contains a question: would there be any
> objections to adding that filter, and obliging consumers to look in
> the inlined subroutine's abstract origin to find optimised-out
> variables, instead of giving them a list per-inlined-instance?
>
> [0]
> https://github.com/llvm/llvm-project/blob/70e251497c4e26f8cfd85e745459afff97c909ce/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1779
>
> --
> Thanks,
> Jeremy
>
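
For reference, the filter proposed in the quoted message can be sketched as a
small C model. This is not the code in DwarfDebug::collectEntityInfo; the
struct dbg_value and struct inlined_var types and both function names are
hypothetical, and the sketch only captures the three cases discussed: no
dbg.value at all, only undef dbg.values, and at least one real location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one dbg.value record attached to a variable. */
struct dbg_value {
    bool is_undef; /* true when the location operand is "undef" */
};

struct inlined_var {
    const char *name;
    const struct dbg_value *values; /* may be NULL when num_values == 0 */
    size_t num_values;
};

/* Behaviour described in the thread: any dbg.value, even an undef one,
 * leads to a concrete DIE in the inlined instance. */
bool emits_die_unfiltered(const struct inlined_var *var)
{
    assert(var != NULL);
    return var->num_values > 0;
}

/* Proposed filter: require at least one non-empty location, so a variable
 * with only undef dbg.values is treated like one with none. */
bool emits_die_filtered(const struct inlined_var *var)
{
    assert(var != NULL);
    for (size_t i = 0; i < var->num_values; i++) {
        if (!var->values[i].is_undef)
            return true;
    }
    return false;
}
```

Under the filtered predicate, the "no dbg.value" and "only undef dbg.value"
cases give the same answer, which is the property the proposal is after.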