[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

Tue Sep 1 06:48:37 PDT 2020

Hi David,

Thanks for your feedback.

My thinking would be that dbg.values for variable locations, and dbg.values for "backup" entry_value locations would be basically separate - as though there were two variables. How this would be reflected in the IR, I'm not sure - maybe it's similar to what you're suggesting (perhaps you could show a more fleshed out example? even for a simple function "void f1(int i) { f2(); f3(i); f2(); }" or something). I guess I would've imagined maybe a way for the dbg.value to include an extra bit saying "I'm an entry value expression" - oh, but I see, there's no way in the IR to have an entry value in the expression? Fair enough, yeah, either using just DW_OP_entry_value with a counted value being the function parameter, or some DW_OP_LLVM_* with some more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying "this is a backup/entry_value based location" is probably useful too - mostly ignored by optimizations, they would apply all the same transformations to it to create new locations from old ones, etc.
We have an LLVM-internal operation (DW_OP_LLVM_entry_value), but I think we might need something different/more complex (e.g. a flag that indicates it is an entry_value/backup location, since it needs to coexist with the real value). An alternative could be a separate intrinsic, llvm.dbg.entry_val(), but I think we all want to avoid extra intrinsics if possible.

I guess it would mean a frontend or early pass would create two locations for every parameter: a direct one and a backup/entry_value based one? (Though that would be tricky to do up-front, since frontends don't do the dataflow analysis, they just create an alloca and let it be read/written to as needed - so maybe entry_value based locations would be created on the fly more often somehow.)
I guess we'd need something like that, but the "on-the-fly" model would be more acceptable. Alternatively, a separate IR pass could do it, but that would introduce some extra overhead...

Best regards,
Djordje

________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: Tuesday, September 1, 2020 9:51 AM
To: Djordje Todorovic <Djordje.Todorovic at syrmia.com>
Cc: LLVM Dev <llvm-dev at lists.llvm.org>; vsk at apple.com <vsk at apple.com>; aprantl at apple.com <aprantl at apple.com>; david.stenberg at ericsson.com <david.stenberg at ericsson.com>; paul.robinson at sony.com <paul.robinson at sony.com>; Jeremy Morse <jeremy.morse at sony.com>; asowda at cisco.com <asowda at cisco.com>; ibaev at cisco.com <ibaev at cisco.com>; Nikola Tesic <Nikola.Tesic at syrmia.com>; Petar Jovanovic <petar.jovanovic at syrmia.com>; Caroline Tice <cmtice at google.com>; Tobias Bosch <tbosch at google.com>; Fangrui Song <maskray at google.com>
Subject: Re: [llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

(+ a few other folks from Google interested in increased optimized debug info location info)

I don't have much context for the variable location part of LLVM's DWARF handling - I've mostly been leaving that to other folks, so take anything I say here with a grain of salt.

My thinking would be that dbg.values for variable locations, and dbg.values for "backup" entry_value locations would be basically separate - as though there were two variables. How this would be reflected in the IR, I'm not sure - maybe it's similar to what you're suggesting (perhaps you could show a more fleshed out example? even for a simple function "void f1(int i) { f2(); f3(i); f2(); }" or something). I guess I would've imagined maybe a way for the dbg.value to include an extra bit saying "I'm an entry value expression" - oh, but I see, there's no way in the IR to have an entry value in the expression? Fair enough, yeah, either using just DW_OP_entry_value with a counted value being the function parameter, or some DW_OP_LLVM_* with some more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying "this is a backup/entry_value based location" is probably useful too - mostly ignored by optimizations, they would apply all the same transformations to it to create new locations from old ones, etc.

I guess it would mean a frontend or early pass would create two locations for every parameter: a direct one and a backup/entry_value based one? (Though that would be tricky to do up-front, since frontends don't do the dataflow analysis, they just create an alloca and let it be read/written to as needed - so maybe entry_value based locations would be created on the fly more often somehow.)

On Tue, Sep 1, 2020 at 12:35 AM Djordje Todorovic <Djordje.Todorovic at syrmia.com> wrote:

Hi all,

The debug entry values feature introduces new DWARF symbols (tags, attributes, operations) on the caller (call site) side as well as on the callee side. The intention is to improve the debugging experience, especially in “optimized” code, by turning “<optimized_out>” values into real values. The call-site information describes the call itself (DW_TAG_call_site), with children representing the function arguments at the call site (DW_TAG_call_site_parameter). The most interesting DWARF attribute for us here is DW_AT_call_value, which contains a DWARF expression representing the value of the parameter at the time of the call. For this RFC, the more relevant part of the feature is the callee side, which uses a new DWARF operation, DW_OP_entry_value, to indicate that in some situations we can use a parameter’s entry value as a real value in the current frame. It relies on the call-site info provided by the caller: the more DW_AT_call_value attributes are generated, the more debug locations using DW_OP_entry_value will be turned into real values.

Current implementation in LLVM

Currently in LLVM, we generate DW_OP_entry_value operations *only* for unmodified parameters, during the LiveDebugValues pass, in the places where code generation truncated the live range of a parameter. The potential of the functionality goes beyond this: we should be able to use entry values even for modified parameters, provided the modification can be expressed in terms of the parameter’s entry value. In addition, there are cases where we can express the values of local variables in terms of some parameter’s entry value (e.g. int local = param + 2;).
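
To make the three cases above concrete, here is a minimal C sketch. The function names, the helper use(), and the DWARF expressions in the comments are illustrative assumptions, not output from an actual build; in particular the register depends on the target ABI (DW_OP_reg5 is RDI, the first integer argument register, on x86-64 SysV).

```c
/* Hypothetical helper: stands in for any call that may clobber
 * the register holding the parameter. */
static volatile int sink;
static void use(int v) { sink = v; }

/* Case 1: unmodified parameter whose live range ends early.
 * After the first call, param is dead and its register may be reused.
 * This is the case handled today; a location such as
 *   DW_OP_entry_value(DW_OP_reg5), DW_OP_stack_value
 * could describe it for the rest of the function. */
int unmodified(int param) {
    use(param);
    use(0);
    return sink;
}

/* Case 2: modified parameter, where the modification is expressible
 * in terms of the entry value. A possible location:
 *   DW_OP_entry_value(DW_OP_reg5), DW_OP_plus_uconst 5, DW_OP_stack_value */
int modified(int param) {
    param += 5;
    use(0);
    return param;
}

/* Case 3: local variable derived from a parameter. A possible location
 * for 'local':
 *   DW_OP_entry_value(DW_OP_reg5), DW_OP_plus_uconst 2, DW_OP_stack_value */
int derived_local(int param) {
    int local = param + 2;
    use(0);
    return local;
}
```

In all three cases the debugger can only evaluate the expression if the caller provided a matching DW_AT_call_value for the argument.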

Proposal

The idea of this RFC is to start a discussion about using DW_OP_entry_value not only at the end of the LLVM pipeline (within LiveDebugValues). There are cases where it could be useful at the IR level, e.g. for unused arguments (please take a look at https://reviews.llvm.org/D85012); I believe there are a lot of cases where an IR pass drops/cuts a variable’s debug value info and an entry value could serve as a backup location. There could be multiple ways to implement this, but in general we need to extend the metadata describing the debug value so that it can also refer to an entry value/backup value (and when the primary location is lost, the value with DW_OP_entry_value becomes the primary one). One way could be to extend llvm.dbg.value with an additional operand, as follows:

               llvm.dbg.value(…, DIEntryValExpression(DW_OP_uconst, 5)) // DIEntryValExpression implicitly contains DW_OP_entry_value operation
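
The fallback behaviour described above (the backup location becomes the primary one when the primary is lost) can be modelled with a small C sketch. This is only a toy model under stated assumptions: DebugValue, effective_location and drop_primary are hypothetical names, the strings stand in for real location descriptions, and none of this is LLVM's API.

```c
#include <stddef.h>

/* Toy model of a debug value that carries two locations. */
typedef struct {
    const char *primary; /* direct location; NULL once a pass drops it */
    const char *backup;  /* entry-value based location; NULL if none   */
} DebugValue;

/* Models an optimization pass losing the direct location. */
void drop_primary(DebugValue *dv) { dv->primary = NULL; }

/* Picks the location a consumer would see: the primary one if it
 * survived, otherwise the entry-value backup, otherwise nothing. */
const char *effective_location(const DebugValue *dv) {
    if (dv->primary) return dv->primary;
    if (dv->backup)  return dv->backup;
    return "<optimized_out>";
}
```

With only a primary location, dropping it yields "<optimized_out>"; with a backup attached, the same drop leaves the variable describable via its entry value.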

The bottom line is that the production of the call-site side of the feature stays the same, but LLVM will have more freedom to generate DW_OP_entry_value operations on the callee side.

Any thoughts on this?

Best regards,

Djordje