[Lldb-commits] [PATCH] D109738: [lldb] Fix bug 38317 - Address breakpoints don't work if set before the process launches

Fri Oct 8 14:49:14 PDT 2021

d-millar added a comment.

That discussion is extremely helpful - gives me good context for what I need to suport.  THANKS!

________________________________

From: lldb-commits <lldb-commits-bounces at lists.llvm.org> on behalf of Jim Ingham via Phabricator via lldb-commits <lldb-commits at lists.llvm.org>
Sent: Friday, October 8, 2021 5:44:49 PM
To: vadimcn at gmail.com; jingham at apple.com
Cc: djordje.todorovic at syrmia.com; lldb-commits at lists.llvm.org; Liburd1994 at outlook.com; quic_sourabhs at quicinc.com; h.imai.833 at nitech.jp; serhiy.redko at gmail.com; david.spickett at linaro.org
Subject: [Lldb-commits] [PATCH] D109738 <https://reviews.llvm.org/D109738>: [lldb] Fix bug 38317 - Address breakpoints don't work if set before the process launches

jingham added a comment.

I realized I typed this all down a while ago but forgot to hit submit.  I think I still agree with former me...

In D109738#3002257 <https://reviews.llvm.org/D109738#3002257> https://reviews.llvm.org/D109738#3002257, @vadimcn wrote:

> Hi Jim,
> I think there's a bit of confusion going on here:
> The original bug was opened by Ted Woodward, and I think his scenario was
> motivated by embedded debugging.   He also provided that example with a
> breakpoint being set in main.
>
> I've rediscovered this bug independently, while working on an IDE
> integration project.  In my case, the IDE saves just the address of the
> instruction where a breakpoint was set in the previous debugging session.

That is clearly wrong: an address is never enough to restore a breakpoint from one session to the next even if you can turn ASLR off.  After all, the breakpoint could be in a dlopened library.  That makes getting this breakpoint back to where it came from dependent on the path of execution.  For instance, if one library had been dlopened at the address in question and then dlclosed and then your target library dlopened in the same spot before the stop that set the breakpoint, then next time round you will set the breakpoint in the first library loaded not the second one, which is again not right.  Remember also that setting breakpoints is not a risk-free process.  On x86-64, for instance, you could end up setting the breakpoint not on an instruction boundary, you could set it in some data, etc.  All leading to hard-to-diagnose failures in the running process.  So some caution is warranted here.

Is it possible to get your IDE to record the module & the address?  WIth that pair, you can always set the breakpoint at the right place, and lldb allows you to specify a module as a filter when you set an address breakpoint, so there would be no problem with resetting it this way using the SB API's.

> In the next session, the debug adapter is expected to restore instruction
> breakpoints using just the absolute addresses; the protocol does not
> provide for any user interaction.  Also, this happens via the SB API, not
> via LLDB commands.   Fortunately, when ASLR is disabled, modules are
> usually loaded at the same address, so things tend to work out 99% of the
> time.  The other 1%... well, if you are debugging on the assembly
> instruction level, you are kinda expected to know what you are doing and be
> prepared to deal with problems like those you've described above.

lldb use on iOS is far above 1% of the lldb users.

> When the target is created, the modules for the target are loaded and
>
>> mapped at whatever address is their default load address.
>
> Clearly, this isn't happening.  From what I can see, section addresses are
> not known until the process starts running.   Maybe this is another bug
> that needs fixing?

Breakpoints eventually resolve to "breakpoint locations", which are specified by an lldb_private::Address.  When you set a file & line breakpoint, or a symbol name breakpoint we make always make a Section relative Address for the breakpoint location.  When the binary loads, provided its UUID hasn't changed we don't re-resolve the breakpoint, which would be wasted effort since we already know its section relative offset, and can use that to calculate the load address to send to the debug monitor.

So it is the case that we can make section relative addresses pre-run and resolve them on running.  In fact, when you ask lldb "image lookup -n main" before run, we will get the Address for the main symbol - a section relative Address - and print it at the "default" load address of the library.  So we deal with Section relative Addresses all the time pre run.  The only thing we don't know is the "load address" of the section, because the SectionLoadList isn't filled in pre-run.

When you set a breakpoint by address pre-run, if the address is uniquely in the default load range of some binary, we certainly CAN figure out the section and resolve it.  The important point about Ted's report is that in a case where we clearly could do that, we aren't.

> Even so, I think that address breakpoints should still function when
> section addresses aren't known at breakpoint creation time (for whatever
> reason), if only on a best-effort basis.
>
> 3. If the address doesn't fall into any sections, then it would be fine to
>
>> assign it to the first section that shows up at that address, but I would
>> convert it to section relative at that point.
>
> This is what I am attempting to do with my patch.  Is there something else
> I should do to make sure I am not regressing other scenarios?

The fact that Ted's case doesn't work means that at some point we stopped recording address breakpoints as section relative in cases where we should.  That to me is the primary bug, but maybe not so interesting to you.

More generally, using raw address breakpoints is fallible, and whenever you find yourself doing that you should probably see if you can do anything else.

So to me the next bug is that you are trying to record breakpoint locations by raw address.  They probably weren't set that way, and they certainly weren't initially resolved that way, since the only way you'll get a breakpoint inserted is (a) if some section gets loaded at that address, or (b) we are told about a JIT region.  In case (a) we probably should fix the address to the section it was found in so that something predictable happens on rerun.  In case (b) we really should refuse to reset the breakpoint.  In the case of JIT code we don't necessarily know its structure, so we are very likely to stick our breakpoint somewhere that will cause problems the next time something gets JIT-ed there.

This patch as is worries me because it moves us further down the path of passing around raw addresses from run to run, and when that doesn't work it can cause hard to diagnose problems in the running of the process.

Repository:

  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION

  https://reviews.llvm.org/D109738/new/

https://reviews.llvm.org/D109738

_______________________________________________

lldb-commits mailing list
lldb-commits at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109738/new/

https://reviews.llvm.org/D109738