[llvm-dev] RFC: [DebugInfo] Improving Debug Information in LLVM to Recover Optimized-out Function Parameters

Fri Feb 8 07:14:28 PST 2019

It's fantastic to see this happening! It'll be really nice to start
seeing some DWARF 5 feature work (as opposed to the infrastructure
stuff, which has its own benefits but doesn't help the end-user's
debugging experience).  A couple of questions below.
--paulr

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Ananthakrishna Sowda (asowda) via llvm-dev
> Sent: Thursday, February 07, 2019 9:32 PM
> To: Adrian Prantl
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] RFC: [DebugInfo] Improving Debug Information in
> LLVM to Recover Optimized-out Function Parameters
> 
> Thank you for your interest and comments! Please see my responses inline.
> 
> On 2/7/19, 3:17 PM, "aprantl at apple.com on behalf of Adrian Prantl"
> <aprantl at apple.com> wrote:
> 
> 
> 
>     > On Feb 7, 2019, at 1:49 PM, Ananthakrishna Sowda (asowda) via llvm-
> dev <llvm-dev at lists.llvm.org> wrote:
>     >
>     > Hi,
>     > Following is a proposal to improve location coverage for Function
> parameters in LLVM. The patches for review will be posted soon.
>     >
>     > RFC: [DebugInfo] Improving Debug Information in LLVM to Recover
> Optimized-out Function Parameters
>     >
>     > Ananthakrishna Sowda(Cisco), asowda at cisco.com
>     > Nikola Prica (RT-RK/Cisco), nprica at rtrk.com
>     > Djordje Todorovic(RT-RK/Cisco), djtodorovic at rtrk.com
>     > Ivan Baev (Cisco), ibaev at cisco.com
>     >
>     >
>     > Overview of the problem
>     > Software release products are compiled with optimization level –O2
> and higher. Such products might produce a core-file in case of a failure.
> Support engineers usually begin debug analysis by looking at the backtrace
> from a core-file. Unfortunately, many parameters in backtraces are
> reported as optimized out due to variety of reasons. This makes triaging
> the issue and assigning ownership harder due to missing information. It is
> harder for the product team to understand the cause of the failure. In
> summary, we are describing a well-known serviceability problem for
> optimized production code.
>     >
>     > Proposal for solution
>     > Function parameters have a natural fall-back location which is
> parent frame. Debuggers can easily go up a frame in call-chain and
> evaluate any expression. Expert developers can find what values parameters
> had at function entry point by examining disassembly of caller frame at
> that particular function call. With additional call-site information
> produced by compiler, debugger can fully automate this technique. DWARF 5
> specification has new tags and attributes to describe call-site parameter
> information [1][2]. it is already implemented in GCC and GDB since
> 2011[3]. We propose implementing this feature in LLVM to enhance the
> debugging of optimized code experience of LLVM users.
>     >
>     > Prior mention
>     > An initial version of our work was presented as a poster during LLVM
> Developer Meeting, in San Jose, 2018. The feature is now fully implemented
> in internal Clang/LLVM 4.0 version.
>     > We presented a talk on our work at FOSDEM 2019[4].
>     >
> 
>     Thank you for posting this. This looks very interesting! Since your
> proposal has a lot of different components, Sema support,
> DW_AT_call_site_parameter support, DW_OP_entry_value support, it will
> probably be best to split them out into separate reviews, but it's also
> good to discuss the proposal in its entirety first. I have a bunch of
> questions to make sure I fully understand what you are doing.
> 
> Sure, we will post several patches, each as a logical unit.
> 
>     > Implementation notes in Clang and LLVM
>     > On the callee side the only information that we need is whether a
> parameter is never modified in the function. If true then we can use
> parameter’s entry value when we lose track of parameter’s location. As a
> natural way of handling this problem we used Clang’s Sema and its
> constness check to embed this information in variable’s declaration which
> is later used for DILocalVariable construction.

Do you think we could build on this to offer entry_value location list
entries, if the value is lost (e.g., register is reused) before the parameter 
is modified within the callee?

> 
>     By looking at whether an argument is modified in the function, you can
> identify variables that can be described with an entry value location and
> that entry value would be valid throughout the function. Are you using
> this information in the function body to identify whether to emit an entry
> value location, or are you using this information at the call site to
> identify call sites for which call site parameters would be beneficial (or
> both)?
> 
>     Is emitting an entry value location in the function body an either-or
> thing or do you also emit plain old locations if you have them available
> in the location list together with the entry values?
> 
>     In the function, I assume you don't know whether all call sites will
> have call site parameters. How do you decide whether to emit entry value
> locations?
> 
> Entry value location is added to same conventional location list. We emit
> them when we see holes in the coverage, looking at the whole function
> scope. It is used by the debugger when there is no conventional location
> for a program range.  It is reported by the debugger as   “<optimized-
> out>, @entry  = <value at call site>”.
> Unmodified argument is a sub-set for which debugger can report  @entry
> value is same as actual value. So, it will not be reported as optimized-
> out.
> We generate entry value location whenever  conventional does not cover one
> hundred percent. We generate call-site information when the call-site
> parameter values can be evaluated by unwinding to the parent frame. Only
> debugger can tell if look-up of an entry value finds matching call site
> and call site parameter.

Is the idea to use DW_OP_entry_value in a default-location list entry?
I think we do not emit those currently, but it seems like an ideal match
for an unmodified formal parameter.

> 
> 
>     > For call-site information, new DINode metadata DICallSite and
> DICallSiteParam are defined and these are emitted by the Clang frontend.
> The metadata is associated to the call or invoke IR instruction. Here is
> an example:
>     >
>     > %call5 = call i32 @fed_em_strncmp(i8* %arraydecay, i8* %arraydecay1,
> i64 5), !dbg !114, !call_site !101
>     > …
>     > !99 = !DICallSiteParam(argno: 1, variable: !91, expr:
> !DIExpression())
>     > !100 = !DICallSiteParam(argno: 2, variable: !95, expr:
> !DIExpression())
>     > !101 = !DICallSite(scope: !87, file: !3, parameters: !102, line: 40,
> calledSubprogram: !13)
>     > !102 = !{!99, !100, !103}
>     > !103 = !DICallSiteParam(argno: 3, expr: !DIExpression(DW_OP_lit5)
>     > For tracking call sites and call site parameters in backend two new
> pseudo instructions, DBG_CALLSITE and DBG_CALLSITEPARAM, are introduced.
> See the MIR code bellow:
>     >
>     > DBG_CALLSITE 0, %noreg, <!19>; dbg:strncmp.c:40:47
>     >          * DBG_CALLSITEPARAM %RDX,  <0x727fee0> =
> !DIExpression(DW_OP_lit5), 5, %noreg ; dbg:strncmp.c:40:47
>     >          * DBG_CALLSITEPARAM %RSI, "str2" <0x71a6dd0> =
> !DIExpression(), %RBX, %noreg ; dbg:strncmp.c:40:47
>     >          * DBG_CALLSITEPARAM %RDI, "str1" <0x71a6dd0> =
> !DIExpression(), %, %RSP, 4 ; dbg:strncmp.c:40:47
> 
>     I'll refrain from bike-shedding the actual implementation; let's save
> this for phabricator, but conceptually, this makes sense to me. If I
> understand correctly, you are identifying at the call site parameters that
> are in locations that can be restored by unwinding the function call, such
> as constants, stack slots, and callee/r-saved registers.
>     Can you explain why you need to identify them at the IR level? Could
> you do it just in MIR, too, or is there some information missing in MIR?
> 
>     What happens to your DICallSiteParam when a function call gets
> inlined?
> 
> Your understanding about call site parameter is right!
> When a function call gets inlined,   call site information is eliminated.
> We kept to the LLVM guidelines of introducing DI metadata in the front end
> and carrying it through IR and MIR. I don’t think there is missing
> information in MIR, though.
> 
> >
>     > There is a challenge in ISel phase to produce them. Algorithm that
> collects information about call site parameters iterates over call
> sequence chained nodes returned from target specific call lowering
> interface. Goal of the algorithm is to recognize SDNodes that represent
> instructions which will load function arguments into registers that
> transfer them into another function call frame. There is a question
> whether this is effectively implemented as a general matching algorithm or
> it should be lowered to target specific level. DBG_CALLSITE pseudo
> instruction will need to be revisited since information whether a call is
> tail call or not could be extracted differently but for sake of simplicity
> we chose this.
>     > Most of passes handle DBG_CALLSITE and DBG_CALLSITEPARAM through
> target instruction info interface method isDebugInstr(). This method is
> used to skip processing of pseudo debug instructions. Since these new
> pseudo debug instructions relay on virtual registers and frame objects we
> need to follow up their substitution through the compilation phases. There
> were several backend passes that needed special attention: Register
> Coalesce, Inline Spiller, Prologue Epilog Inserter, Split Kit and Virtual
> Register Rewriter. Virtual Register Rewriter required implementation of
> target specific salvaging interface for “call site parameter identities” –
> situation following identity copy instructions that leads to overlapping
> of parameter transferring register location and location that is loaded
> into that register.
>     > The last challenge is to extend LiveDebugValues pass to generate
> additional DBG_VALUE instructions with new kind of debug expression (with
> ‘DW_OP_entry_value’) for parameters that meet the requirements described
> in [1].
>     > Finally, emitting call-site debug information can be controlled by
> ‘-debugger-tune’ and ‘-dwarf-version’ LLVM code-generation flags, since
> not all debuggers used in the community consume this DWARF information.
> 
>     Have you considered to instead insert a very late MIR pass that does
> some backwards analysis on the machine code to yield potential call site
> parameters instead of threading it all the way through the compiler? If
> yes, why did you choose this implementation?
> 
> If I have not convinced you on this, we are open for suggestion. I will
> ask my colleagues on this project to respond too.
> 
> 
>     >
>     > Location coverage improvement
>     > The important criteria in debugging-optimized-code is whether the
> compiler has location information for variables and parameters. We use
> ‘locstats’ utility from elfutils [5] package to guide us in improving
> overall location coverage in final executable. For each non-artificial
> variable or formal parameter - or Debugging Information Entry in DWARF -
> ‘locstats’ computes what percentage from the code section bytes where the
> variable is in scope, the variable has a non-empty location description.
> 100% coverage is not expected for non-global variables and function
> parameters, since value may not be ‘live’ through the entire scope. On the
> other end, 0% coverage for variables which are used in the code is
> indicative of compiler losing track of values.
>     > The second column in Table 1 shows ‘locstats’ report for gdb-7.11
> compiled for x86-64 with “-g –O2” by Clang 4.0. For example, there are
> 29476 parameters whose coverage is in 91..100% range. The third column
> shows locstats’ report with “-g –O2” and the parameter-entry-value
> feature. There are now 37671 parameters whose coverage is in 91..100%
> range – for a 28% improvement.
>     > Because our implementation computes an additional location list
> entry to parameters whenever possible, and DW_OP_entry_value is valid
> through the entire scope of the parameter, the numbers at 91..100 row are
> relevant indication of improvement with parameter-entry-value.
>     >
>     > Coverage	Parameters	Parameters with emit-param-entry-values
>     > (% range)	(number/%))	(number/%)
>     > 0..10		22682/30%	21342/28%
>     > 11..20		3498/4%	2337/3%
>     > 21..30		3083/4%	1986/2%
>     > 31..40		3050/4%	1862/2%
>     > 41..50		2534/3%	1574/2%
>     > 51..60		2349/3%	1571/2%
>     > 61..70		2184/2%	1649/2%
>     > 71..80		2620/3%	2069/2%
>     > 81..90		3432/4%	2847/3%
>     > 91..100		29476/39%	37671/50%
>     > Table 1 Location coverage statistics for function parasmeters
>     >
>     > Improved backtrace for optimized code in debugger
>     > Figure 1 below shows improved backtrace for optimized code when
> compiled with parameter entry value tracking feature. Please note the new
> @entry values reported for parameters in backtrace. These parameters will
> otherwise be reported as <optimized-out>.
>     >
>     > gdb) bt
>     > #0 get_next_move_from_list (list=list at entry=0x7fffffffbf88,
>     > color=color at entry=1, moves=moves at entry=0x7fffffffbfb0,
>     > cutoff=cutoff at entry=100) at engine/owl.c:3032
>     > #1 0x000000000042a957 in do_owl_attack (str=<optimized out>,
>     > move=<optimized out>, move at entry=0x7fffffffc334, wormid=<optimized
> out>, wormid at entry=0x7fffffffc33c, owl=<optimized out>, owl at entry=0x0,
>     > komaster=komaster at entry=0, kom_pos=kom_pos at entry=0,escape=<optimized
> out>)
>     > at engine/owl.c:1306
>     > #2 0x000000000042a0d0 in owl_attack (target=target at entry=148,
>     > attack_point=attack_point at entry=0x7fffffffc580, certain=<optimized
> out>, certain at entry=0xb63048 <dragon+11288>,
> kworm=kworm at entry=0x7fffffffc3c4) at engine/owl.c:1144
>     > #3 0x0000000000412c71 in make_dragons (color=<optimized out>,
> color at entry=1, stop_before_owl=<optimized out>, stop_before_owl at entry=0,
> save_verbose=<optimized out>, save_verbose at entry=0) at engine/dragon.c:346
>     > #4 0x0000000000417fdc in examine_position (color=color at entry=1,
> how_much=how_much at entry=99) at engine/genmove.c:152
>     > #5 0x00000000004183c6 in do_genmove (move=move at entry=0x7fffffffd344,
> color=1, color at entry=3, pure_threat_value=<optimized out>,
>     > allowed_moves=<optimized out>, allowed_moves at entry=0x0)
>     > at engine/genmove.c:334
>     > #6 0x000000000041926d in genmove_conservative
> (i=i at entry=0x7fffffffd36c, j=j at entry=0x7fffffffd368, color=3) at
> engine/genmove.c:255
>     > #7 0x00000000004618ae in gtp_gg_genmove (s=<optimized out>) at
> interface/play_gtp.c:2163
>     > #8 0x000000000045b0f8 in gtp_main_loop (commands=<optimized out>,
> gtp_input=0xb8b100) at interface/gtp.c:126
>     > Figure 1: Backtrace with @entry value parameters
> 
>     Very nice!
> 
>     >
>     > Cost in disk image size increase and compile-time
>     > The parameter-entry-value feature is enabled with -g compilation.
> Due to new DebugInfo metadata generation which adds entries to DWARF
> sections such as .debug_info and .debug_loc, there is expected size
> increase of disk image of the executable built with “-g –O”. For SPEC CPU
> 2006 benchmark, the average size increase is 15%. However, there is no
> change in sections loaded at runtime such as .text. .data, .bss.  Hence,
> there is no runtime size increase.
>     > Compile-time cost increase is 1-3% percent for SPEC CPU 2006.
> 
>     At this early stage I'm not yet worried by the size increase. We'll
> probably find some opportunities to fine-tune the heuristics that decide
> whether a call site parameter / entry value is profitable. We can also
> always provide a tuning option to turn the feature off.
> 
> 
>     > Community up-streaming
>     > Since we have implemented this for LLVM-4.0 we are currently in
> process of porting this implementation on LLVM trunk. We are planning to
> share this set of patches with LLVM community and seek feedback in
> improving certain parts of our implementation.
> 
>     Sounds great!
>     thanks for sharing this,
>     adrian
> 
> 
> Thanks,
> Ananth
> 
>     >
>     >
>     > References
>     > [1]Jakub Jelínek and Roland McGrath. DWARF DW_OP_entry_value
> extension proposal. http://dwarfstd.org/ShowIssue.php? issue=100909.1.
>     > [2 Jakub Jelínek, Roland McGrath, Jan Kratochvíl, and Alexandre
> Oliva. DWARF DW_TAG_call_site extension proposal. http://dwarfstd.org/
> ShowIssue.php?issue=100909.2
>     > [3] J. Jelinek “Improving debug info for optimized away parameters”
> https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=view&target=jelin
> ek.pdf
>     > [4] FOSDEM talk
> http://bofh.nikhef.nl/events/FOSDEM/2019/K.4.201/llvm_debug.webm
>     > [5] Elfutils https://sourceware.org/elfutils/
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     > _______________________________________________
>     > LLVM Developers mailing list
>     > llvm-dev at lists.llvm.org
>     > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev