[llvm-dev] Seeking clarification and way forward on limited scope variables.

Sourabh Singh Tomar via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 15 13:39:56 PDT 2020


Thanks, David, for your example demonstrating why location lists won't be the
best fit here. Surprisingly, your response came just after (or before) I sent
mine!

Just wanted to add a note regarding the lldb log shared previously:
that behavior occurs when Local has a *location list*. In the
normal case (no location list for Local inside the lexical block), LLDB also
shows two variables at line 6, but the second one contains (as
expected) a garbage value.
I think it's a bug: since we're inside the lexical block, it should not show
the variable from the outer scope (or block)?
...
(lldb) frame variable
(int) Argc = 1
(char **) Argv = 0x00007fffffffe4a8
(int) Local = 6
*(int) Local = 2102704*
....
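
For reference, the lookup behavior we would expect at line 6 can be sketched
in C as below. This is only a minimal sketch under stated assumptions: the
`variable` and `block` structs and `lookup_variable` are hypothetical names
made up for illustration, not GDB's or LLDB's real data structures, and
`start_scope` is modeled as a byte offset from the enclosing block's low_pc
(0 when DW_AT_start_scope is absent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of the debug info discussed in this thread. */
struct variable {
    const char *name;
    int64_t     fbreg_offset; /* operand of DW_OP_fbreg */
    uint64_t    start_scope;  /* assumed: offset from block low_pc, 0 if absent */
};

struct block {
    uint64_t               low_pc;  /* block covers [low_pc, high_pc) */
    uint64_t               high_pc;
    const struct block    *parent;  /* enclosing block, NULL for the outermost */
    const struct variable *vars;
    size_t                 nvars;
};

/*
 * Search from the innermost block containing `pc` outward. A variable whose
 * scope has not started yet at `pc` is skipped, so the search falls through
 * to the enclosing block instead of returning an uninitialized slot.
 */
const struct variable *
lookup_variable(const struct block *innermost, const char *name, uint64_t pc)
{
    assert(name != NULL);
    for (const struct block *b = innermost; b != NULL; b = b->parent) {
        for (size_t i = 0; i < b->nvars; i++) {
            const struct variable *v = &b->vars[i];
            if (strcmp(v->name, name) != 0)
                continue;
            /* The variable's scope begins at low_pc + start_scope. */
            if (pc >= b->low_pc + v->start_scope)
                return v;
        }
    }
    return NULL; /* not found in any enclosing block */
}
```

With the POC values quoted below (lexical block low_pc 0x2016d1,
DW_AT_start_scope 0x17), the inner Local would become visible at
0x2016d1 + 0x17 = 0x2016e8; a lookup at any earlier PC inside the block would
return the outer Local at fbreg -20 instead.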

Thank You,
Sourabh

On Thu, Apr 16, 2020 at 1:50 AM Sourabh Singh Tomar <sourav0311 at gmail.com>
wrote:

> Hi Vedant,
>
> Thanks for the quick response.
>
> >  If at all possible, I’d /much/ rather we use the existing location list
> machinery to solve this problem. Fundamentally, we’re looking for a way to
> express when a location for a variable becomes available, and location
> lists give us that already.
>
> We also thought of location lists as an immediate solution here, but AFAIK
> location lists are not emitted (or constructed) at -O0 (no optimizations)
> with clang or gcc.
>
> >  To test this out, I took the IR for your test case, replaced all calls
> to “dbg.declare(…)” with “dbg.value(…, DW_OP_deref)”, and compiled with `-g
> -O0 -mllvm -fast-isel=false` (apparently FastISel doesn’t know what to do
> with frame-index dbg.values, but this is simple to fix). This seemed to
> work pretty well, and the DWARF looked legit:  ...
>
> In our case, we also experimented with this, but as in the case of
> `DW_AT_start_scope`, the debugger (GDB, in our case) also needs to be
> modified to look for the variable in the outer scope if the PC is not in
> range of the list. Currently GDB works as follows: if the PC is in range,
> fetch that variable; otherwise report it as `optimized out`.
>
> >  I did find one lldb bug where it didn’t know to look up a variable in
> the parent scope when stopped at your second printf() call (line 6). But
> that's a general bug: we’d have to fix it even if we
> used DW_AT_start_scope.
>
> Yeah, I mean it shows two variables named Local at line 6.
> ...
> Process 16841 launched: '/home/sourabh/work/C++/a.out' (x86_64)
> 6
> Process 16841 stopped
> * thread #1, name = 'a.out', stop reason = breakpoint 1.1
>     frame #0: 0x00000000002016dd a.out`main(Argc=1,
> Argv=0x00007fffffffe4a8) at MainScope.c:6:16
>    3            printf("%d\n",Local);
>    4
>    5            {
> -> 6            printf("%d\n",Local);
>    7            int Local = 7;
>    8            printf("%d\n",Local);
>    9            }
> (lldb) frame variable
> (int) Argc = 1
> (char **) Argv = 0x00007fffffffe4a8
> (int) Local = 6
> (int) Local = <variable not available>
>
> Thank You,
> Sourabh Singh Tomar.
>
>
>
> On Thu, Apr 16, 2020 at 12:22 AM Vedant Kumar <vedant_kumar at apple.com>
> wrote:
>
>> Hi Sourabh,
>>
>> Thanks for raising this issue. To answer your question, (afaik) there
>> isn’t anyone working on DW_AT_start_scope support in tree. We’re looking
>> for a solution to this problem for Swift debugging, where it's important
>> not to make a debug location for a variable available until its
>> (guaranteed) initialization is complete.
>>
>> If at all possible, I’d /much/ rather we use the existing location list
>> machinery to solve this problem. Fundamentally, we’re looking for a way to
>> express when a location for a variable becomes available, and location
>> lists give us that already.
>>
>> To test this out, I took the IR for your test case, replaced all calls to
>> “dbg.declare(…)” with “dbg.value(…, DW_OP_deref)”, and compiled with `-g
>> -O0 -mllvm -fast-isel=false` (apparently FastISel doesn’t know what to do
>> with frame-index dbg.values, but this is simple to fix). This seemed to
>> work pretty well, and the DWARF looked legit:
>>
>> ```
>> 0x0000005f:     DW_TAG_variable
>>                   DW_AT_location        (DW_OP_fbreg -20)
>>                   DW_AT_name    ("Local")
>>
>> 0x0000006d:     DW_TAG_lexical_block
>>                   DW_AT_low_pc  (0x0000000100000f4f)
>>                   DW_AT_high_pc (0x0000000100000f84)
>>
>> 0x0000007a:       DW_TAG_variable
>>                     DW_AT_location      (0x00000000
>>                        [0x0000000100000f63,  0x0000000100000f8c): DW_OP_breg6 RBP-24)
>>                     DW_AT_name  ("Local")
>> ```
>>
>> I did find one lldb bug where it didn’t know to look up a variable in the
>> parent scope when stopped at your second printf() call (line 6). But that's
>> a general bug: we’d have to fix it even if we used DW_AT_start_scope.
>>
>> The upshot of sticking to location lists is that fixing any bugs we find
>> improves optimized debugging workflows. And Swift -Onone debugging
>> workflows as well, since Swift runs certain mandatory optimizations which
>> can make the DWARF at -Onone fairly complex.
>>
>> best,
>> vedant
>>
>> On Apr 15, 2020, at 9:53 AM, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> As always, concerned about the size growth in object files this might
>> produce - though looks like the DWARF spec avoids the worst of this in
>> unoptimized code by using an offset relative to the start of the enclosing
>> scope, so it doesn't require a relocation in that case.
>>
>> I have no idea what the LLVM DWARF representation for this would look
>> like - short of making even more fine-grained scopes in the DILexicalScope
>> hierarchy, which sounds really expensive from a memory perspective. That's
>> really where I worry that the cost to this feature might outweigh the
>> benefit (& might be why no one's done this in the past) - but data should
>> tell us that. As much as in-tree development is preferred, this might be
>> the sort of thing worth prototyping out of tree first to see if it can be
>> made viable before adding the complexity to the LLVM project proper - but
>> I'm open to ideas/suggestions.
>>
>> On Wed, Apr 15, 2020 at 3:15 AM Sourabh Singh Tomar <sourav0311 at gmail.com>
>> wrote:
>>
>>> Hello Everyone,
>>>
>>>
>>> I need to have your thoughts on this.
>>>
>>>
>>> Consider the following test case --
>>> -------------------------------------------
>>>  1    int main(int Argc, char **Argv) {
>>>   2         int Local = 6;
>>>   3         printf("%d\n",Local);
>>>   4
>>>   5         {
>>>   6         printf("%d\n",Local);
>>>   7         int Local = 7;
>>>   8         printf("%d\n",Local);
>>>   9         }
>>>  10
>>>  11         return 0;
>>>  12  }
>>> --------------------------------------------
>>> When compiled in debug mode with trunk gcc or trunk clang (among other
>>> compilers) and debugged with GDB at line 6, the following behavior is
>>> observed:
>>> Breakpoint 1, main (Argc=1, Argv=0x7fffffffe458) at MainScope.c:6
>>> 6               printf("%d\n",Local);
>>> (gdb) print Local
>>> $1 = 2102704   -- some Garbage value,
>>> (gdb) info addr Local
>>> Symbol "Local" is a variable at frame base reg $rbp offset 0+-24.   -- *This
>>> is the location of *Local* declared inside the inner scope, but as you may
>>> notice, the variable referred to at this line is the one from the outer scope.*
>>>
>>>
>>> This problem persists with both GDB and LLDB. Since we have entered the
>>> lexical scope, when we try to print the value of *Local*, the debugger looks
>>> into the *current scope* and fetches the value if the variable exists in
>>> that scope (if the variable doesn't exist there, GDB searches for it in the
>>> outer scope).
>>>
>>>
>>> This happens regardless of whether the variable has actually come into
>>> scope (or been defined) yet. The DWARF already gives a location (on the
>>> stack) that is treated as valid for the entire lifetime of the variable,
>>> rather than from the point where the variable is actually defined (or
>>> allocated), which in this case is line 7.
>>> ---------------------------------------------
>>>   0x0000006d:     DW_TAG_lexical_block
>>>                   DW_AT_low_pc  (0x00000000002016d1)
>>>                   DW_AT_high_pc (0x000000000020170b)
>>> 0x0000007a:       DW_TAG_variable
>>>                     DW_AT_location      (DW_OP_fbreg -24)
>>>                     DW_AT_name  ("Local")
>>>                     DW_AT_decl_file     ("MainScope.c")
>>>                     DW_AT_decl_line     (7)
>>>                     DW_AT_type  (0x0000008a "int")
>>> ----------------------------------------------
>>>
>>>
>>> The DWARF specification provides the DW_AT_start_scope attribute to deal
>>> with this issue (DWARFv5, Sec. 3.9, "Declarations with Reduced Scope"). This
>>> attribute limits the scope of a variable, within the lexical scope in which
>>> it is defined, to start from the point where it is declared/defined.
>>>
>>>
>>> In order to fix this issue, we want to modify llvm so that
>>> DW_AT_start_scope is emitted for the variable in the inner block (in the
>>> above example). This limits the scope of the inner block variable to start
>>> from the point of its declaration.
>>>
>>>
>>> For a POC, we inserted DW_AT_start_scope into the inner *Local* variable;
>>> the resulting DWARF is:
>>> -----------------------------
>>> 0x0000006d:     DW_TAG_lexical_block
>>>                   DW_AT_low_pc  (0x00000000002016d1)
>>>                   DW_AT_high_pc (0x000000000020170b)
>>> 0x0000007a:       DW_TAG_variable
>>>                     * DW_AT_start_scope   (0x17) -- restricted to a
>>> subset (starting from the point of definition, specified as an offset) of
>>> the entire range covered by the lexical block.*
>>>                     DW_AT_location      (DW_OP_fbreg -24)
>>>                     DW_AT_name  ("Local")
>>>                     DW_AT_decl_file     ("MainScope.c")
>>>                     DW_AT_decl_line     (7)
>>>                     DW_AT_type  (0x00000092 "int")
>>> ----------------------------
>>>
>>>
>>>
>>>
>>> We also modified ‘gdb’ to interpret DW_AT_start_scope so that the scope
>>> of the variable begins at the PC given by the value of DW_AT_start_scope.
>>> If the debugger is stopped at a point within the same lexical block but
>>> at a PC before DW_AT_start_scope, then gdb follows its normal search
>>> mechanism of searching consecutive superblocks until it finds a match or
>>> reaches the global block. After the modification, GDB is able to
>>> correctly show the value *6* in our example.
>>>
>>>
>>>
>>>
>>> After incorporating changes --
>>>   Breakpoint 1, main (Argc=1, Argv=0x7fffffffe458) at MainScope.c:6
>>> 6               printf("%d\n",Local);
>>> (gdb) print Local
>>> * $1 = 6 --- Value retrieved from outer scope*
>>> (gdb) info addr Local
>>>
>>> Symbol "Local" is a variable at frame base reg $rbp offset 0+-20.
>>>
>>>
>>> Could you please let us know your thoughts or suggestions on this?
>>> Is there an existing effort already under way to deal with this
>>> problem?
>>>
>>>
>>> Even though location lists can be used to deal with this scenario, in my
>>> experience location lists are only emitted at higher optimization levels,
>>> and when location lists are used in this example, gdb prints
>>> <optimized out> (as expected) if it is stopped at a PC in the same lexical
>>> block but before the point of declaration of the local variable.
>>>
>>> Thank You,
>>> Sourabh Singh Tomar.
>>>