[Lldb-commits] [PATCH] D94846: Allow breakpoints to be set on C++11 inline initializers

Tue Jan 19 10:17:37 PST 2021

> On Jan 17, 2021, at 10:47 AM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> On Fri, Jan 15, 2021 at 6:22 PM Jim Ingham <jingham at apple.com> wrote:
>> 
>> If you set a breakpoint on source lines with no line table entries, lldb slides the actual match forward to the nearest line table entry to the given line number.  That's necessary for instance because if you lay out your function definitions over multiple lines they won't all get line table entries, but we don't want people to have to guess which of them was...  Same is true of more complicated compound statements.
> 
> Ah, sure - gdb seems to do that too, totally makes sense - as you say,
> it'd be pretty hard to know exactly which tokens end up with
> corresponding entries in the line table and which don't (especially
> under optimizations and complex compound statements).

Yeah, I think you either have to have this heuristic, or have a debugger that can display "valid breakpoint locations".  The old Metrowerks debugger used to do it that way.  They only put up breakpoint affordances in the UI for the lines that had line table entries and there was no free-form way to set breakpoints, so you couldn't do an ambiguous thing.  But I don't think we have that option in lldb.

> 
>> So people like that, but they really hate it when they have:
>> 
>> 
>> #ifdef SOMETHING_NOT_DEFINED
>> 
>> int foo() {
>> 
>> }
>> 
>> #else
>> int bar() {
>> 
>> }
>> #end
>> 
>> but with lots more junk so you can't see the ifdef's and then they set a breakpoint in the "int foo" part and the breakpoint gets moved to bar instead and then doesn't get hit.
> 
> Ah, indeed - that is a curious case that could be surprising to a
> user. Thanks for explaining it - though it seems gdb doesn't special
> case this & does produce the not-quite-what-the-user-intended
> behavior.
> 
>> You might try to argue that they should have checked where the breakpoint actually landed before coming into your office to yell at you, but it's likely to leave you with fewer friends...
> 
> FWIW: I don't have coworkers or friends come into my office to yell at
> me about anything like this, and I don't really think anyone should -
> that's not appropriate behavior for a workplace.
> 
> Having a nuanced discussion about the tradeoff of features - sure.

Yeah, IME people get cheesed off at debuggers in ways they don't with compilers.  I have some theories, but more of the after work quality than the public mailing list quality.

> 
>> So I was trying to detect this case and not move the breakpoint if sliding it crossed over the function start boundary.  That way you'd see that the breakpoint didn't work, and go figure out why.
> 
> Neat idea!
> 
>> The thinko in the original version was that we were still doing this when we DIDN'T have to slide the breakpoint, when we got an exact match in the line table.  In that case, we shouldn't try to second guess the line table at all.  That's the patch in this fix.
> 
> Might this still leave some confusing behavior. Now it'll slide
> forward into a function, but it won't slide forward into an
> initializer? (so now the user has to know exactly which lines of the
> initializer are attributed to the line table, making that somewhat
> difficult to use?)

Yes I knew this was a tradeoff of the implementation.  If I did a more in depth search in order to slide this over to function initializer, it would IMO add way more complexity to an already overly fiddly part of the debugger, and the trade-off didn't seem worth it.  Most initializers have a fairly simple structure, as opposed say to the before the body part of a function definition (that's got to have a name, but I don't know it off the top of my head.)  So I think the payoff would be a lot less in this area.  I asked around here and the folks I polled agreed that it wasn't worth the complexity.

> 
> Testing some things out:
> 
> $ cat break.cpp
> __attribute__((optnone)) int f1() { return 1; }
> struct t1 {
>  int     // 3
>      i   // 4
>      =   // 5
>      f1  // 6
>      ();
>  t1() {
>  }
> };
> 
> t1 v1;
> int main() {
> }
> 
> The line table:
> 
> Address            Line   Column File   ISA Discriminator Flags
> ------------------ ------ ------ ------ --- ------------- -------------
> 0x0000000000401140      1      0      1   0             0  is_stmt
> 0x0000000000401144      1     37      1   0             0  is_stmt prologue_end
> 0x0000000000401150     13      0      1   0             0  is_stmt
> 0x0000000000401154     14      1      1   0             0  is_stmt prologue_end
> 0x0000000000401158     14      1      1   0             0  is_stmt end_sequence
> 0x0000000000401020      0      0      1   0             0  is_stmt
> 0x0000000000401024     12      4      1   0             0  is_stmt prologue_end
> 0x0000000000401040      0      0      1   0             0  is_stmt
> 0x000000000040104b      0      0      1   0             0  is_stmt end_sequence
> 0x0000000000401160      8      0      1   0             0  is_stmt
> 0x0000000000401174      6      7      1   0             0  is_stmt prologue_end
> 0x000000000040117f      4      7      1   0             0  is_stmt
> 0x0000000000401181      9      3      1   0             0  is_stmt
> 0x0000000000401187      9      3      1   0             0  is_stmt end_sequence
> 
> Has entries for line 4 and 6 from the initializer (and 8 and 9 from
> the ctor), but gdb doesn't seem to want to break on any of them. So
> I'm not sure what gdb's doing, but it seems pretty unhelpful
> (apparently gdb breaks in to the ctor too late to even let you step
> into the initializers... guess that's because the initializers are
> called in the prologue according to the debug info):
> (gdb) b 1
> Breakpoint 1 at 0x401144: file break.cpp, line 1.
> (gdb) b 2
> Breakpoint 2 at 0x401181: file break.cpp, line 9.
> (gdb) b 3
> Note: breakpoint 2 also set at pc 0x401181.
> Breakpoint 3 at 0x401181: file break.cpp, line 9.
> (gdb) b 4
> Note: breakpoints 2 and 3 also set at pc 0x401181.
> Breakpoint 4 at 0x401181: file break.cpp, line 9.
> (gdb) b 5
> Note: breakpoints 2, 3 and 4 also set at pc 0x401181.
> Breakpoint 5 at 0x401181: file break.cpp, line 9.
> (gdb) b 6
> Note: breakpoints 2, 3, 4 and 5 also set at pc 0x401181.
> Breakpoint 6 at 0x401181: file break.cpp, line 9.
> (gdb) b 7
> Note: breakpoints 2, 3, 4, 5 and 6 also set at pc 0x401181.
> Breakpoint 7 at 0x401181: file break.cpp, line 9.
> (gdb) b 8
> Note: breakpoints 2, 3, 4, 5, 6 and 7 also set at pc 0x401181.
> Breakpoint 8 at 0x401181: file break.cpp, line 9.
> (gdb) b 9
> Note: breakpoints 2, 3, 4, 5, 6, 7 and 8 also set at pc 0x401181.
> Breakpoint 9 at 0x401181: file break.cpp, line 9.
> 
> lldb does already seem to let me break on the initializer without this
> patch - so any idea what part of this already works compared to what's
> being fixed here? (though lldb breaks on line 6 even when I asked to
> break on line 7 or 8, but it doesn't break on 6 when I ask to break on
> 6... )

Yes.  This is because the line table entry for the "first line" of a function is also imprecise.  For instance, if you do:

int
foo()
{

}

the "int" line in the function definition pretty much never gets a line table entry, yet that's a perfectly reasonable place to set a breakpoint on the function.  So the "am I sliding across function definitions" check has a little slop in the comparison.  Just put a handful of blank lines between the "();" of the struct initialization and the definition of the constructor, and then the breakpoint won't take.

> 
> (lldb) target create "./a.out"
> Current executable set to
> '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64).
> (lldb) b 1
> Breakpoint 1: where = a.out`f1() + 4 at break.cpp:1:37, address =
> 0x0000000000401144
> (lldb) b 2
> Breakpoint 2: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> (lldb) b 3
> Breakpoint 3: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> (lldb) b 4
> Breakpoint 4: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> (lldb) b 5
> Breakpoint 5: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> (lldb) b 6
> Breakpoint 6: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> (lldb) b 7
> Breakpoint 7: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
> 0x0000000000401174
> (lldb) b 8
> Breakpoint 8: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
> 0x0000000000401174
> (lldb) b 9
> Breakpoint 9: where = a.out`t1::t1() + 33 at break.cpp:9:3, address =
> 0x0000000000401181
> (lldb) r
> 
>> BTW, the check wouldn't have affected code from .defs files because I only do it if the original breakpoint specification and the "function start" are from the same source file.
> 
> Sure enough - no doubt someone out there has some file that #includes
> itself in some entertaining way, but likely quite rare.

Given that the penalty you have to pay for such a deed is having to be a little more careful how you set breakpoints, I don't feel the need for super-human efforts in this area.

Jim

> 
>> And we know about inlined blocks so inlining isn't going to fool us either.
>> 
>> Jim
>> 
>> 
>> 
>> 
>>> On Jan 15, 2021, at 5:09 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>> 
>>> "But because their source lines are outside the function source range"
>>> 
>>> Not sure I understand that - the DWARF doesn't describe a function
>>> source range, right? Only the line a function starts on. And a
>>> function can have code from source lines in many files/offsets that
>>> are unrelated to the function start line (LLVM in several places
>>> #includes .def files into functions to stamp out tables, switches,
>>> arrays, etc, for instance)
>>> 
>>> On Fri, Jan 15, 2021 at 4:47 PM Jim Ingham via Phabricator via
>>> lldb-commits <lldb-commits at lists.llvm.org> wrote:
>>>> 
>>>> jingham created this revision.
>>>> jingham requested review of this revision.
>>>> Herald added a project: LLDB.
>>>> Herald added a subscriber: lldb-commits.
>>>> 
>>>> The inline initializers contribute code to the constructor(s).  You will step onto them in the source view as you step through the constructor, for instance.  But because their source lines are outside the function source range, lldb thought a breakpoint on the initializer line was crossing from one function to another, which file & line breakpoints don't allow.  That meant if you tried to set a breakpoint on one of these lines it doesn't create any locations.
>>>> 
>>>> This patch fixes that by asserting that if the LineEntry in one of the SymbolContexts that the search produced exactly matches the file & line specifications in the breakpoint, it has to be a valid place to set the breakpoint, and we should just set it.
>>>> 
>>>> 
>>>> Repository:
>>>> rG LLVM Github Monorepo
>>>> 
>>>> https://reviews.llvm.org/D94846
>>>> 
>>>> Files:
>>>> lldb/source/Breakpoint/BreakpointResolverFileLine.cpp
>>>> lldb/test/API/lang/cpp/break-on-initializers/Makefile
>>>> lldb/test/API/lang/cpp/break-on-initializers/TestBreakOnCPP11Initializers.py
>>>> lldb/test/API/lang/cpp/break-on-initializers/main.cpp
>>>> 
>>>> _______________________________________________
>>>> lldb-commits mailing list
>>>> lldb-commits at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits