[llvm-dev] Seeking clarification and way forward on limited scope variables.
Vedant Kumar via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 15 15:27:48 PDT 2020
Thanks David and Sourabh for your comments. I filed https://bugs.llvm.org/show_bug.cgi?id=45564 to track the lldb bug. I see now that `DW_AT_start_scope` provides some information not captured by location lists, and I withdraw my objection.
vedant
> On Apr 15, 2020, at 1:39 PM, Sourabh Singh Tomar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Thanks David for your example demonstrating why location lists won't be the best fit here. Surprisingly, your response came just after (or before) I sent mine!
>
> Just wanted to drop an additional note here WRT the lldb log shared previously --
> that behavior occurs when Local has a *location list*. In the normal case (no location list for Local inside the lexical block), LLDB also shows two variables at line 6, but the second one contains (as expected) a garbage value.
> I think it's a bug: since we're inside the lexical block, it should not show the variable from the outer scope (or block)?
> ...
> (lldb) frame variable
> (int) Argc = 1
> (char **) Argv = 0x00007fffffffe4a8
> (int) Local = 6
> (int) Local = 2102704
> ....
>
> Thank You,
> Sourabh
>
> On Thu, Apr 16, 2020 at 1:50 AM Sourabh Singh Tomar <sourav0311 at gmail.com <mailto:sourav0311 at gmail.com>> wrote:
> Hi Vedant,
>
> Thanks for quick response.
>
> > If at all possible, I’d /much/ rather we use the existing location list machinery to solve this problem. Fundamentally, we’re looking for a way to express when a location for a variable becomes available, and location lists give us that already.
>
> We also thought of location lists as an immediate solution here, but AFAIK location lists are not emitted (or constructed) at -O0 (no optimizations) with clang and gcc.
>
> > To test this out, I took the IR for your test case, replaced all calls to “dbg.declare(…)” with “dbg.value(…, DW_OP_deref)”, and compiled with `-g -O0 -mllvm -fast-isel=false` (apparently FastISel doesn’t know what to do with frame-index dbg.values, but this is simple to fix). This seemed to work pretty well, and the DWARF looked legit: ...
>
> In our case, we also experimented with this, but as in the `DW_AT_start_scope` case, the debugger (GDB in our case) also has to be modified to look for the variable in the outer scope if the PC is not in the range of the list. Currently GDB works as follows: if the PC is in range, it fetches that variable; otherwise it reports it as `optimized out`.
>
> > I did find one lldb bug where it didn’t know to look up a variable in the parent scope when stopped at your second printf() call (line 6). But that's a general bug: we’d have to fix it even if we used DW_AT_start_scope.
>
> Yeah, I mean it shows two variables named Local at line 6.
> ...
> Process 16841 launched: '/home/sourabh/work/C++/a.out' (x86_64)
> 6
> Process 16841 stopped
> * thread #1, name = 'a.out', stop reason = breakpoint 1.1
> frame #0: 0x00000000002016dd a.out`main(Argc=1, Argv=0x00007fffffffe4a8) at MainScope.c:6:16
> 3 printf("%d\n",Local);
> 4
> 5 {
> -> 6 printf("%d\n",Local);
> 7 int Local = 7;
> 8 printf("%d\n",Local);
> 9 }
> (lldb) frame variable
> (int) Argc = 1
> (char **) Argv = 0x00007fffffffe4a8
> (int) Local = 6
> (int) Local = <variable not available>
>
> Thank You,
> Sourabh Singh Tomar.
>
>
>
> On Thu, Apr 16, 2020 at 12:22 AM Vedant Kumar <vedant_kumar at apple.com <mailto:vedant_kumar at apple.com>> wrote:
> Hi Sourabh,
>
> Thanks for raising this issue. To answer your question, (afaik) there isn’t anyone working on DW_AT_start_scope support in tree. We’re looking for a solution to this problem for Swift debugging, where it's important not to make a debug location for a variable available until its (guaranteed) initialization is complete.
>
> If at all possible, I’d /much/ rather we use the existing location list machinery to solve this problem. Fundamentally, we’re looking for a way to express when a location for a variable becomes available, and location lists give us that already.
>
> To test this out, I took the IR for your test case, replaced all calls to “dbg.declare(…)” with “dbg.value(…, DW_OP_deref)”, and compiled with `-g -O0 -mllvm -fast-isel=false` (apparently FastISel doesn’t know what to do with frame-index dbg.values, but this is simple to fix). This seemed to work pretty well, and the DWARF looked legit:
>
> ```
> 0x0000005f: DW_TAG_variable
> DW_AT_location (DW_OP_fbreg -20)
> DW_AT_name ("Local")
>
> 0x0000006d: DW_TAG_lexical_block
> DW_AT_low_pc (0x0000000100000f4f)
> DW_AT_high_pc (0x0000000100000f84)
>
> 0x0000007a: DW_TAG_variable
> DW_AT_location (0x00000000
> [0x0000000100000f63, 0x0000000100000f8c): DW_OP_breg6 RBP-24)
> DW_AT_name ("Local")
> ```
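
To make the location-list approach concrete, here is a minimal sketch of how a consumer might decide whether a variable has a usable location at a given PC. The `loc_entry` struct and `loclist_find` function are hypothetical names made up for illustration, not LLDB or GDB APIs, and the DWARF expression is kept as a plain string.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one bounded location-list entry:
 * the expression applies for low <= pc < high. */
struct loc_entry {
    uint64_t low;
    uint64_t high;
    const char *expr; /* textual stand-in for the DWARF expression */
};

/* Return the expression valid at pc, or NULL when no entry covers pc
 * (which a debugger would typically report as "not available"). */
const char *loclist_find(const struct loc_entry *list, size_t n, uint64_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= list[i].low && pc < list[i].high)
            return list[i].expr;
    }
    return NULL;
}
```

With the single entry from the dump above, a PC at the lexical block's low_pc (0x100000f4f) is not covered, so the inner Local has no location there, while a PC of 0x100000f63 is covered. Nothing in the list itself tells the debugger to fall back to the outer Local, which is the gap discussed in this thread.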
>
> I did find one lldb bug where it didn’t know to look up a variable in the parent scope when stopped at your second printf() call (line 6). But that's a general bug: we’d have to fix it even if we used DW_AT_start_scope.
>
> The upshot of sticking to location lists is that fixing any bugs we find improves optimized debugging workflows. And Swift -Onone debugging workflows as well, since Swift runs certain mandatory optimizations which can make the DWARF at -Onone fairly complex.
>
> best,
> vedant
>
>> On Apr 15, 2020, at 9:53 AM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> As always, concerned about the size growth in object files this might produce - though looks like the DWARF spec avoids the worst of this in unoptimized code by using an offset relative to the start of the enclosing scope, so it doesn't require a relocation in that case.
>>
>> I have no idea what the LLVM DWARF representation for this would look like - short of making even more fine-grained scopes in the DILexicalScope hierarchy, which sounds really expensive from a memory perspective. That's really where I worry that the cost to this feature might outweigh the benefit (& might be why no one's done this in the past) - but data should tell us that. As much as in-tree development is preferred, this might be the sort of thing worth prototyping out of tree first to see if it can be made viable before adding the complexity to the LLVM project proper - but I'm open to ideas/suggestions.
>>
>> On Wed, Apr 15, 2020 at 3:15 AM Sourabh Singh Tomar <sourav0311 at gmail.com <mailto:sourav0311 at gmail.com>> wrote:
>> Hello Everyone,
>>
>> I need to have your thoughts on this.
>>
>> Consider the following test case --
>> -------------------------------------------
>> 1 int main(int Argc, char **Argv) {
>> 2 int Local = 6;
>> 3 printf("%d\n",Local);
>> 4
>> 5 {
>> 6 printf("%d\n",Local);
>> 7 int Local = 7;
>> 8 printf("%d\n",Local);
>> 9 }
>> 10
>> 11 return 0;
>> 12 }
>> --------------------------------------------
>> When compiled in debug mode (with compilers including trunk gcc and trunk clang) and debugged with GDB at line 6, the following behavior is observed:
>> Breakpoint 1, main (Argc=1, Argv=0x7fffffffe458) at MainScope.c:6
>> 6 printf("%d\n",Local);
>> (gdb) print Local
>> $1 = 2102704 -- some garbage value
>> (gdb) info addr Local
>> Symbol "Local" is a variable at frame base reg $rbp offset 0+-24. -- This is the location of the *Local* declared inside the inner scope, but as you may notice, the variable being referred to at line 6 is the one from the outer scope.
>>
>> This problem persists with both GDB and LLDB. Once we have entered the lexical scope and try to print the value of *Local*, the debugger looks in the *current scope* and fetches the value if the variable exists there (if it doesn't, GDB searches for it in the outer scope).
>>
>> This happens regardless of whether the variable has actually come into scope (i.e. been defined) at line 7. The DWARF describes a location (on the stack) that is valid for the whole lifetime of the variable, not from the point where the variable is actually defined (or allocated), which in this case is line 7.
>> ---------------------------------------------
>> 0x0000006d: DW_TAG_lexical_block
>> DW_AT_low_pc (0x00000000002016d1)
>> DW_AT_high_pc (0x000000000020170b)
>> 0x0000007a: DW_TAG_variable
>> DW_AT_location (DW_OP_fbreg -24)
>> DW_AT_name ("Local")
>> DW_AT_decl_file ("MainScope.c")
>> DW_AT_decl_line (7)
>> DW_AT_type (0x0000008a "int")
>> ----------------------------------------------
>>
>> The DWARF specification provides the DW_AT_start_scope attribute to deal with this issue (DWARFv5, Sec 3.9, Declarations with Reduced Scope). This attribute limits the scope of a variable, within the lexical scope in which it is defined, to start from the point where it is declared/defined.
>>
>> In order to fix this issue, we want to modify llvm so that DW_AT_start_scope is emitted for the variable in the inner block (in the above example). This limits the scope of the inner block variable to start from the point of its declaration.
>>
>> For a POC, we inserted DW_AT_start_scope in the inner *Local* variable; the resulting DWARF is:
>> -----------------------------
>> 0x0000006d: DW_TAG_lexical_block
>> DW_AT_low_pc (0x00000000002016d1)
>> DW_AT_high_pc (0x000000000020170b)
>> 0x0000007a: DW_TAG_variable
>> DW_AT_start_scope (0x17) -- restricted to a subset of the ranges covered by the lexical block, starting from the point of definition (specified as an offset).
>> DW_AT_location (DW_OP_fbreg -24)
>> DW_AT_name ("Local")
>> DW_AT_decl_file ("MainScope.c")
>> DW_AT_decl_line (7)
>> DW_AT_type (0x00000092 "int")
>> ----------------------------
>>
>>
>> We also modified GDB to interpret DW_AT_start_scope so that the scope of the variable begins at the PC given by DW_AT_start_scope. If the debugger is stopped at a point within the same lexical block but at a PC before DW_AT_start_scope, then GDB follows the normal search mechanism of searching in consecutive super blocks until it gets a match or reaches the global block. After the modification, GDB is able to correctly show the value *6* in our example.
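
The lookup described above can be sketched roughly as follows. This is a simplified model, not GDB's actual implementation: the `var` and `block` structs and `lookup_var` are hypothetical, and DW_AT_start_scope is assumed to be a constant offset from the low_pc of the enclosing block, with 0 meaning the attribute is absent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified symbol-table model (not GDB's data structures). */
struct var {
    const char *name;
    int64_t fb_offset;    /* offset used with DW_OP_fbreg */
    uint64_t start_scope; /* offset from the block's low_pc; 0 if absent */
};

struct block {
    uint64_t low_pc;
    uint64_t high_pc;
    const struct block *parent; /* enclosing (super) block, NULL at the top */
    const struct var *vars;
    size_t nvars;
};

/* Search from the innermost block containing pc outwards. A variable whose
 * scope has not started yet at pc is skipped, so the search continues in
 * the super blocks. Returns NULL when no visible match exists. */
const struct var *lookup_var(const struct block *b, const char *name,
                             uint64_t pc)
{
    for (; b != NULL; b = b->parent) {
        for (size_t i = 0; i < b->nvars; i++) {
            const struct var *v = &b->vars[i];
            if (strcmp(v->name, name) != 0)
                continue;
            if (pc >= b->low_pc + v->start_scope)
                return v;
        }
    }
    return NULL;
}
```

With the POC values, the inner Local becomes visible at 0x2016d1 + 0x17 = 0x2016e8. A stop at 0x2016dd (the frame address in the lldb log earlier in this thread, assuming the same binary) is before that point, so the search falls through to the outer Local at fbreg -20.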
>>
>>
>> After incorporating changes --
>> Breakpoint 1, main (Argc=1, Argv=0x7fffffffe458) at MainScope.c:6
>> 6 printf("%d\n",Local);
>> (gdb) print Local
>> $1 = 6 --- Value retrieved from outer scope
>> (gdb) info addr Local
>> Symbol "Local" is a variable at frame base reg $rbp offset 0+-20.
>>
>> Could you please let us know your thoughts or suggestions on this? Is there an existing effort already under way to deal with this problem?
>>
>> Even though location lists can be used to deal with this scenario, in my experience, location lists are emitted at higher optimization levels, and with the usage of location lists in this example, gdb prints out <optimized out> (as expected) if it is stopped at a PC in the same lexical block but before the point of declaration of the local variable.
>>
>> Thank You,
>> Sourabh Singh Tomar.
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>