[Lldb-commits] [PATCH] D119963: [LLDB] Dump valid ranges of variables

Zequan Wu via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Wed Feb 23 11:25:24 PST 2022


zequanwu added inline comments.


================
Comment at: lldb/test/Shell/SymbolFile/DWARF/x86/debug_loc.s:28-29
 # CHECK:     Variable{{.*}}, name = "x0", {{.*}}, scope = parameter, location =
 # CHECK-NEXT:  [0x0000000000000000, 0x0000000000000001): DW_OP_reg5 RDI
 # CHECK-NEXT:  [0x0000000000000001, 0x0000000000000006): DW_OP_reg0 RAX
 # CHECK:     Variable{{.*}}, name = "x1", {{.*}}, scope = parameter
----------------
labath wrote:
> zequanwu wrote:
> > labath wrote:
> > > zequanwu wrote:
> > > > labath wrote:
> > > > > zequanwu wrote:
> > > > > > labath wrote:
> > > > > > > zequanwu wrote:
> > > > > > > > `image dump symfile` already prints valid ranges for variables along with where the value is at each range.
> > > > > > > Are you sure it does?
> > > > > > > 
> > > > > > > I was under the impression that there are two distinct range concepts being combined here. One is the range list member of the Variable object (as given by `GetScopeRange` -- that's the one you're printing now), and the other is the list of ranges hidden in the DWARFExpression object, which come from the debug_loc(lists) section (that's the one we've been printing so far). And that the root cause of the confusion is the very existence of these two concepts.
> > > > > > > 
> > > > > > > If I got it wrong, then do let me know, cause it would make things a lot simpler if there is only one validity concept to think about.
> > > > > > Dwarf plugin is supposed to construct the `m_scope_range` member of an Variable, but it doesn't. `scope_ranges` is empty at https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L3468. 
> > > > > > `image dump symfile` dumps the dwarf location list in `m_location` in `Variable`. 
> > > > > > The dwarf location list has more information than `m_scope_range` as it contains info about where the value is during each range. (e.g. which register the variable lives in). 
> > > > > > 
> > > > > > So, I think we need to use similar logic to construct `m_scope_range` when creating `Variable` in dwarf plugin like this https://github.com/llvm/llvm-project/blob/main/lldb/source/Expression/DWARFExpression.cpp#L145.
> > > > > Ok, I see where you're coming from. You're essentially saying that the fact that the dwarf plugin does not fill this out is a bug.
> > > > > 
> > > > > I don't think that's the case. My interpretation was (and [[ https://github.com/llvm/llvm-project/blob/main/lldb/source/Symbol/Variable.cpp#L313 | this comment]] confirms it) that an empty range here means the entire enclosing block. (Also, DWARF was for a long time the only symbol file plugin, so what it does is kinda "correct by definition").
> > > > > 
> > > > > I don't think we want to change that interpretation, as forcing a copy of the range in the location list would be wasteful (it would be different if this was an interface that one could query, and that the dwarf plugin could implement by consulting the location list). However, since the dwarf class does not actually make use of this functionality (it was [[ https://reviews.llvm.org/D17449 | added ]] to support DW_AT_start_scope, then broken at some point, and eventually [[ https://reviews.llvm.org/D62302 | removed ]]), we do have some freedom in defining the interactions of the two fields (if you still want to pursue this, that is).
> > > > > 
> > > > > So how about this: if the user passes the extra flag, then we print both the range field (if it exists), and the *full* location list (in that order, ideally). That way the output will be either `range = [a, b), [c, d), location = DW_OP_reg47` or `location = [a,b) -> DW_OP_reg4, [c,d) -> DW_OP_reg7`. If the dwarf plugin starts using the range field again then the output will contain both fields, which will be slightly confusing, but at least not misleading (and we can also change the format then).
> > > > Oh, I think I misunderstood `m_scope_range`. It's the range list where the variable is valid regardless whether its value is accessible or not (valid range). As for `m_location` in `Variable`, it's describing the ranges where the value is (value range). They are not the same. 
> > > > 
> > > > So, currently how NativePDB creates local Variable's range is not correct. That only works when it's not optimized build such that the valid range is the same as the value range. It still need to create dwarf location lists to correctly represent the value range, but as mentioned [[ https://reviews.llvm.org/D119508#3319113 | here ]], we need to choose a generic "variable location provider" interface for that.
> > > > 
> > > > Oh, I think I misunderstood m_scope_range. It's the range list where the variable is valid regardless whether its value is accessible or not (valid range). As for m_location in Variable, it's describing the ranges where the value is (value range).
> > > 
> > > Yes, that was my initial assumption as well, and I think that is the only interpretation in which it makes sense to have two sources of range information for a variable. However, I've done some research since then, and I haven't found any compiler or debugger which would model the program sufficiently precisely to be able to make that distinction.
> > > 
> > > There are definite limits as to how far you can go with pdb using these abstractions, but given they (the m_scope_range) exist, I think you could make use of them (as you've done now), if they are sufficient for your current use case. That said, I would definitely encourage you to create a better abstraction for providing the location information for a variable.
> > I think you meant to replace `DWARFExpression` with a more generic interface which has the same functionalities as `DWARFExpression`. That seems a lot work, especially on `DWARFExpression::Evaluate`. 
> Well.. not exactly "replace". You know how they say there's no software engineering problem that can't be solved by adding a layer of indirection. So, I thought we could create a new `VariableLocationProvider` (for lack of a better name) interface, and one of the implementations of that interface would be backed by a DWARFExpression class. Theoretically you may not need to touch the DWARFExpression class at all -- just wrap it so that it conforms to the new interface.
> 
> This is still pretty hand-wavy, so I don't know how much work would it be, but it does not seem like it should be _that_ hard..
Thanks, I see what you mean. I thought it would be each symbol file plugin needs to convert their debug formats to a universal one and the actual evaluation will happens in that one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119963/new/

https://reviews.llvm.org/D119963



More information about the lldb-commits mailing list