[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values

Djordje Todorovic via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 24 01:14:33 PST 2020


Hi Stephen,

This makes sense to me.

The Salvage Debug Info functionality currently misses many cases where the optimized-out instruction operates on multiple SSA values; it supports only expressions with a single SSA value and a constant.
This could have a good impact on debugging optimized code and on debug location coverage.

DW_OP_pick should be something like DW_OP_LLVM_pick, but I think it is an implementation detail, so it might be too early for such a comment.

Best,
Djordje

On 21.2.20. 14:24, Tozer, Stephen via llvm-dev wrote:
>     What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)?
> 
>     If it's decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support
> 
> Putting the register values initially on the stack reduces the verbosity, though it could complicate successive salvages of variadic DIExpressions - if any value other than the last needs salvaging, then you have to use DWARF stack operations to move it to the top of the stack. For example, if the elements are pushed in order so that the last element is on the top of the stack:
> 
> %c = mul 3, %a
> %d = add 5, %b
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus), %c, %d)
> ; Salvage %d
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_plus), %c, %b)
> ; Salvage %c needs to use DW_OP_swap
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b)
> 
> Or, written out as the stack state:
> [%a, %b]               ; Initial state
> [%a, %b + 5]           ; DW_OP_plus_constu, 5
> [%b + 5, %a]           ; DW_OP_swap
> [%b + 5, %a, 3]        ; DW_OP_constu, 3
> [%b + 5, %a * 3]       ; DW_OP_mul
> [(%b + 5) + (%a * 3)]  ; DW_OP_plus
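
The stack trace above can be checked mechanically. The following is a minimal Python sketch, assuming the register values are pushed in declared order before the expression runs; the `evaluate` helper and the sample values for %a and %b are illustrative only and not part of LLVM.

```python
def evaluate(expr, registers):
    """Evaluate a flat op list; registers start on the stack, last one on top."""
    stack = list(registers)
    it = iter(expr)
    for op in it:
        if op == "DW_OP_constu":
            stack.append(next(it))                 # push a constant
        elif op == "DW_OP_plus_constu":
            stack.append(stack.pop() + next(it))   # add a constant to the top entry
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in ("DW_OP_plus", "DW_OP_mul"):
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs if op == "DW_OP_plus" else lhs * rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


a, b = 2, 7              # sample values for %a and %b (assumed)
c, d = 3 * a, 5 + b      # %c = mul 3, %a ; %d = add 5, %b

original = evaluate(["DW_OP_plus"], [c, d])
salvaged_d = evaluate(["DW_OP_plus_constu", 5, "DW_OP_plus"], [c, b])
salvaged_both = evaluate(
    ["DW_OP_plus_constu", 5, "DW_OP_swap",
     "DW_OP_constu", 3, "DW_OP_mul", "DW_OP_plus"],
    [a, b],
)
print(original, salvaged_d, salvaged_both)
```

All three expressions should yield the same value for "x", which is the property each salvage step has to preserve.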
> 
> The simplest and most general solution would be to use DW_OP_pick, which duplicates the stack element at a given index to the top of the stack. This is more or less the same as using DW_OP_LLVM_register - both of them take an index corresponding to a register value - with a few differences: we get to maintain the default concise dbg.value with a single argument and empty DIExpression, but salvaging becomes more brittle. If we rely on elements being on the stack in their declared order, then we need to guarantee that nothing modifies those stack elements. The cleanest implementation of this would be for SalvageDebugInfo to salvage expressions normally when the last register is salvaged, and switch to using DW_OP_pick for every register whenever any other register is salvaged:
> 
> %c = add %a, 5
> %e = div %c, %d
> ; x = %b * ((%a + 5) / %d)
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul), %b, %e)
> ; Salvage %e; last register so salvage normally
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_div, DW_OP_mul), %b, %c, %d)
> ; Salvage %c; not the last register and expression isn't already using pick, so add DW_OP_pick for registers
> dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_pick, 0, DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), %b, %a, %d)
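
The DW_OP_pick form can be evaluated the same way. This is a sketch assuming, as the example does, that the pick index names a register by its position in the argument list (counting from the bottom of the initial stack); standard DWARF DW_OP_pick counts from the top of the stack instead. The helper name and the sample values are hypothetical.

```python
def evaluate_with_pick(expr, registers):
    """Evaluate ops where DW_OP_pick N copies register N to the top of the stack."""
    stack = list(registers)  # registers sit at the bottom and are never modified
    it = iter(expr)
    for op in it:
        if op == "DW_OP_pick":
            # Assumption: the index counts from the bottom (register order).
            stack.append(stack[next(it)])
        elif op == "DW_OP_plus_constu":
            stack.append(stack.pop() + next(it))
        elif op in ("DW_OP_mul", "DW_OP_div"):
            rhs, lhs = stack.pop(), stack.pop()
            # Integer division stands in for DW_OP_div on positive sample values.
            stack.append(lhs * rhs if op == "DW_OP_mul" else lhs // rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


b, a, d = 3, 7, 4  # sample values in the argument order %b, %a, %d (assumed)
expr = ["DW_OP_pick", 0, "DW_OP_pick", 1, "DW_OP_plus_constu", 5,
        "DW_OP_pick", 2, "DW_OP_div", "DW_OP_mul"]
result = evaluate_with_pick(expr, [b, a, d])
expected = b * ((a + 5) // d)  # x = %b * ((%a + 5) / %d)
print(result, expected)
```

Because every operand is copied before use, the original register entries stay in place, which is what makes later salvages of any register straightforward in this scheme.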
> 
> The major advantage of putting all registers on the stack is that it reduces verbosity and also doesn't require us to implement the new operator or update existing DIExpressions to contain it, which is useful. Personally I lean towards keeping things simple and consistent, and so using the pick/register operator in every expression rather than only the ones that need it, but there's a good case for either I think.
> 
> ------------------------------------------------------------------------
> *From:* David Blaikie <dblaikie at gmail.com>
> *Sent:* 20 February 2020 20:21
> *To:* Tozer, Stephen <stephen.tozer at sony.com>; Adrian Prantl <aprantl at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric Christopher <echristo at gmail.com>
> *Cc:* llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
>  
> (+usual debug info folks)
> 
> I'm mostly staying out of discussions around optimized debug info/variable locations - other folks have more state on this than I do. But some casual thoughts from the peanut gallery... 
> 
> On Thu, Feb 20, 2020 at 8:54 AM Tozer, Stephen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
>     Currently, the debug intrinsic functions each have 3 arguments: an SSA value representing either the address or Value of a local variable, a DILocalVariable, and a complex expression. If the SSA value is an Instruction, and that Instruction is at some point deleted, we attempt to salvage the SSA value by recreating the instruction within the complex expression. If the instruction cannot be replicated by a complex expression, then the variable location is dropped causing a reduction in coverage. One of the key restrictions on this process at the moment is that each intrinsic can only reference a single SSA value; a numeric constant may be represented within the expression itself, allowing for binary operators with a constant operand to be salvaged:
> 
>     %c = add i32 %a, 4
>     llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())
>     ; Salvage...
>     llvm.dbg.value(metadata i32 %a, DILocalVariable("x"), DIExpression(DW_OP_constu, 4, DW_OP_plus))
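
The salvage step above amounts to prepending the deleted instruction's operation to the expression, so that it runs before any operators already present. A rough Python sketch, in which `salvage_add_const` is a hypothetical helper and not LLVM's actual salvaging code:

```python
def salvage_add_const(expr, constant):
    """Return a new op list that recomputes `operand + constant` first.

    Hypothetical helper: models rewriting dbg.value(%c, expr) into
    dbg.value(%a, new_expr) when `%c = add %a, constant` is deleted.
    """
    return ["DW_OP_constu", constant, "DW_OP_plus"] + list(expr)


new_expr = salvage_add_const([], 4)
print(new_expr)
```

Only one SSA value is involved here; the second operand has to be a constant because it is encoded directly in the expression.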
> 
>     This proposal is to allow multiple SSA value references within a debug intrinsic, allowing binary operators with non-constant operands to be salvaged. This is a long-awaited feature, with an open bugzilla[0] and support from members of the community[1][2]. To implement this change, each debug intrinsic will contain a list of SSA values instead of just one, and a new operator will be added: DW_OP_LLVM_register, which takes as its only argument the index of an SSA value within the intrinsic’s list, and pushes that SSA value onto the expression stack.
> 
> 
> What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)?
> 
> If it's decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support.
>  
> 
>     Two proposed syntaxes for the list of SSA values - though suitable alternatives may be worth considering - are to either replace the first argument of the intrinsic function with an MDNode containing the SSA values as operands, or to remove the first argument and make the intrinsic function variadic, passing the SSA value list as varargs:
> 
>     %c = add i32 %a, %b
>     llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())
>     ; Salvage...
>     llvm.dbg.value(!{metadata i32 %a, metadata i32 %b}, DILocalVariable("x"), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus))
>     ; Alternatively, the intrinsic function could be made variadic...
>     llvm.dbg.value(DILocalVariable("x"), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus), metadata i32 %a, metadata i32 %b)
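
To make the proposed DW_OP_LLVM_register semantics concrete, here is a minimal Python sketch of an evaluator in which the stack starts empty and each DW_OP_LLVM_register pushes the SSA value at the given index. The operator is only a proposal in this thread, so the evaluator and the sample values are purely illustrative.

```python
def evaluate_registers(expr, ssa_values):
    """Evaluate an op list where DW_OP_LLVM_register N pushes ssa_values[N]."""
    stack = []  # nothing is implicitly on the stack in this model
    it = iter(expr)
    for op in it:
        if op == "DW_OP_LLVM_register":
            stack.append(ssa_values[next(it)])
        elif op == "DW_OP_constu":
            stack.append(next(it))
        elif op == "DW_OP_plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


a, b = 10, 32  # sample values for %a and %b (assumed)
value = evaluate_registers(
    ["DW_OP_LLVM_register", 0, "DW_OP_LLVM_register", 1, "DW_OP_plus"],
    [a, b],
)
print(value)
```

The same evaluator serves both proposed syntaxes, since they differ only in how the list of SSA values is attached to the intrinsic.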
> 
>     The new operator DW_OP_LLVM_register would exist in the IR and MIR, and further down the pipeline be replaced by the appropriate operator for the target debug output. For example, when producing DWARF this would be replaced by DW_OP_regval_type, which pushes the contents of a given register interpreted as a value of a given type onto the DWARF expression stack.
> 
>     This has the potential to allow salvaging in a much greater number of cases than is currently possible. There are also potential follow-up tasks, such as allowing the salvaging of conditional values, that would further improve debug variable availability. The following table gives, for several of the multi-source application projects in the test suite, the number of successful invocations of SalvageDebugInfo, and the number of failed salvages for each type of unsalvageable instruction:
> 
>     Project  Success  Variadic  Variadic  Cmp    Select  Load   Phi    Alloca  Call
>                       Binops    GEPs      Insts  Insts   Insts  Nodes  Insts   Insts
>     ALAC     261      29        61        0      0       1      12     0       0
>     Burg     50       1         9         0      1       95     6      0       0
>     hbd      514      16        1         0      3       45     10     0       4
>     Lua      270      12        54        0      12      46     32     1       0
>     minisat  458      10        10        3      1       35     4      0       0
>     sgefa    439      1         121       0      20      14     55     0       0
>     SIBsim4  153      15        6         0      3       40     3      0       0
>     siod     112      2         1         0      0       11     5      2       1
>     SPASS    1241     70        27        27     27      2114   156    0       7
>     spiff    39       0         15        0      0       7      2      1       0
>     sqlite3  2322     94        167       6      37      1143   136    4       10
>     treecc   127      1         0         0      1       350    37     0       1
>     viterbi  7        0         1         0      1       1      0      0       0
> 
>     Of these categories, the first 3 will become salvageable after this work is implemented, and Select Insts will also be salvageable with some follow-up work to enable conditional branching in complex expressions. The remainder are not salvageable in general, although it's possible that the specific passes that delete those instructions may be able to preserve the debug info as long as the code isn't totally dead.
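
As a rough way to read the table, the sketch below totals, per project, the failed salvages in the first three failure columns (Variadic Binops, Variadic GEPs, Cmp Insts), assuming those are the "first 3" categories meant above. The figures are copied from the table; the variable names are illustrative.

```python
# project: (success, variadic binops, variadic GEPs, cmp insts), from the table
table = {
    "ALAC": (261, 29, 61, 0),
    "Burg": (50, 1, 9, 0),
    "hbd": (514, 16, 1, 0),
    "Lua": (270, 12, 54, 0),
    "minisat": (458, 10, 10, 3),
    "sgefa": (439, 1, 121, 0),
    "SIBsim4": (153, 15, 6, 0),
    "siod": (112, 2, 1, 0),
    "SPASS": (1241, 70, 27, 27),
    "spiff": (39, 0, 15, 0),
    "sqlite3": (2322, 94, 167, 6),
    "treecc": (127, 1, 0, 0),
    "viterbi": (7, 0, 1, 0),
}

# Failed salvages that fall into the three categories expected to become salvageable.
newly_salvageable = {
    project: binops + geps + cmps
    for project, (_success, binops, geps, cmps) in table.items()
}

for project, extra in newly_salvageable.items():
    success = table[project][0]
    print(f"{project:8} current successes: {success:5}  additional: {extra:4}")
```

This only counts salvage attempts; it says nothing about how many variable locations a user would actually see improve.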
> 
>     [0] https://bugs.llvm.org/show_bug.cgi?id=39141
>     [1] https://reviews.llvm.org/D51976#1237060
>     [2] http://lists.llvm.org/pipermail/llvm-dev/2019-November/137021.html
> 
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org
>     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 