[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands

Tue Sep 15 17:39:24 PDT 2020

> On Sep 15, 2020, at 3:07 PM, Robinson, Paul <paul.robinson at sony.com> wrote:
> 
> Hi Adrian & Stephen,
>  
> One thought here:
>  
> But — not all memory locations are l-values. If we have a DWARF location list for variable "x" which points to a memory address for the first n instructions and the switches to a constant for the remainder of the scope, the memory address is not guaranteed to be an l-value, because writing the the memory address cannot affect the later part of the function where the variable is a constant.
>  
> A very interesting point.  I believe this is a concept that LLVM does not yet understand?  that it needs to consider the l/r-value-ness of the variable throughout the function, before deciding on the form that the location-expressions will take.  I would not want to spend a lot of time parsing expressions in order to make this kind of decision.

That's right. Stephen correctly identified that we need some new concept at the LLVM IR level. My hypothesis is that DW_OP_stack_value — if applied correctly — is sufficient to represent this. Stephen is making a case that we could get out more by having a separate flag but I'm not (yet?) convinced that that is in the end-user's best interest. So if we can avoid having two flags for confusingly similar semantics I would like to go with the simpler IR.

-- adrian

>  
> And a second thought here:
>  
> I suspect there may be disagreements over whether c should share an l-value location with a, since this means that a user could write to either c or a, and that doing so would assign to both of them. My personal belief is that even if it seems confusing, we shouldn't arbitrarily restrict write-access to variables on criteria that will not always be clear to a debug user; whether or not to apply such a restriction should be left to the debugger, rather than being baked into the information we produce for it.
>  
> Ah — you've thought about this already :-)
>  
> This is actually really important to me: My take is that we must always err on the safe side. If the compiler cannot prove that it is safe to write to an l-value (ie., that the semantics are the same as if the write were inserted into the source code at that source location) it must downgrade the l-value to an r-value. That's one thing I'm feeling really strongly about.
> The reason behind this is that if we allow debug info that is only maybe correct (or correct on some paths, or under some other circumstances that are opaque to the user) — if we allow potentially incorrect debug info to leak into the program, the user can not distinguish between information they can trust and information that is bogus. That means all information that they see in the debugger is potentially bogus. If the debugger is not trustworthy it's worthless. That is feedback I keep getting particularly from kernel developers, and I must say it resonates with me.
>  
> I should say that Stephen and I had a discussion a while back, where I said IMO the compiler should emit debug info that correctly reflects the code as-it-is, and not arbitrarily decide that ‘c’ and/or ‘a’ should not be writeable.  I believe I also said this was arguable, but that I felt pretty strongly about it.
>  
> I have to say, it did not occur to me to consider emitting info that showed *neither* ‘c’ nor ‘a’ to be writeable!  This is a kind of protect-the-debugger-from-the-user perspective that I’ve not encountered before.  I would actually argue that showing both variables as writeable is in fact correct (truthful), because it reflects the instructions as-produced; if you modify one, you necessarily modify the other.  That’s what the code as-produced does.
> I accept that is not what the code as-originally-written does… and that it is unlikely that the debugger would go to the trouble to notice that two variables share a storage location and caution the user about that unexpected interaction.  (It’s hard enough to persuade the compiler to notice…)  Given that engineering practicality, and that producing debug info that helps the user without confusing/dismaying the user is a valuable goal, then I can agree that describing both variables as r-values is more helpful.
>  
> In short, I agree with your conclusion, although not for your reasons. 😊
> --paulr

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200915/8da5a5d9/attachment.html>