[Lldb-commits] [PATCH] D109738: [lldb] Fix bug 38317 - Address breakpoints don't work if set before the process launches

Wed Sep 15 10:32:38 PDT 2021

jingham added a comment.

In D109738#3001122 <https://reviews.llvm.org/D109738#3001122>, @vadimcn wrote:

>> In the future, can you generate patches with context (i.e. pass -U9999 to
>> "git diff" or "git show"), it's a lot easier to read patches with context.
>
> Sure thing, will do.
>
> This doesn't seem like the right solution to the problem to me.  The way
>
>> address breakpoints SHOULD work  is that when you set the breakpoint, we
>> resolve the address you pass in against the images in the target.  If the
>> address is in an image in the list, then we convert the load address to a
>> section relative address before making the breakpoint.  Address breakpoints
>> made that way will always return true for m_addr.GetSection(), and so set
>> re_resolve to true before getting to the test you modified.
>
> No, actually.   In my scenario, the breakpoint is being created by load
> address (via SBTarget::BreakpointCreateByAddress(addr_t address)), right
> after the target has been created.   Since the process hasn't been launched
> at this point yet, there are no code sections mapped, so the breakpoint
> address stays non-section-relative.   My understanding is that in such a
> case, LLDB should attempt to re-resolve it upon loading each new module.

When the target is created, the modules for the target are loaded and mapped at whatever address is their default load address.
So all the libraries that lldb can find will be listed in the shared library list at that point, mapped at their pre-load addresses.  
On Darwin systems the dynamic loader plugin reads the closure of the loaded libraries from the binary's load commands, and loads them all.  
That's not a requirement of the dynamic loader, however, so you may just have the executable loaded.  But you will always have that.

The initial example you showed when proposing the patch has you finding the address of main, then getting its address and setting
an address breakpoint on that address.  So clearly there was some addresses associated with the main executable at that point, and
we could have re-resolved the address to a section relative one correctly.

There are problems with this approach for sure.  Mainly that there's no guarantee before run that all the libraries lldb detects
will have unique addresses.  It's pretty common for the linker to assign a default load address, often 0x0, for all shared libraries.
So before run, an address breakpoint is ambiguous, it could map to many section/offset pairs.

lldb doesn't usually fail to make breakpoints (it does, for instance, for a regex breakpoint where the regex doesn't compile...).  
But in this case, I think it should because there's no way to tell what the user's intentions were.  For instance, in your initial example 
where you got the symbol for main pre-run, then used that address to set an address breakpoint, say offset a bit from main or whatever.  
If on run, the breakpoint was resolved to whatever library happened to load at that address, you would not be best pleased.

>> OTOH, lldb is a bit hesitant to reset raw load address breakpoints that
>> aren't in any module.  After all, you have no idea, what with loading &
>> unloading & the effects of ASLR, a raw address doesn't have much meaning
>> run to run.
>
> LLDB disables ASLR by default, so load addresses are usually stable.

lldb ASKS to disable ASLR by default.  The systems that lldb runs on are under no obligation to honor 
that request, and haven't on all iOS devices for the past 5 years or thereabouts.  You can't assume this will
get you out of trouble.

> Also, please see Ted's comment about embedded.

If you mean the comment with the example of getting main's address and then setting an address
breakpoint there, as I argued above, I think that more tends to argue we want to convert addresses
to section relative whenever we can.  The user's clear intention was to set the breakpoint on the address
that "main" was going to show up at, not at whatever code happens to end up at the pre-load address of main.

>> In your case, you have an address that's clearly in a section, so it
>> should have been handled by the if(m_addr.GetSection() ||
>> m_module_filespec) test.  So the correct fix is to figure out how you ended
>> up with a non-section relative address for the breakpoint.
>>
>> I don't want lldb to silently reset non-section relative addresses.  The
>> only way those should show up is if you set the breakpoint on some code
>> that was NOT in any section, maybe JIT code or something, and lldb has no
>> way of knowing what to do with those addresses on re-run.  Refusing to
>> re-resolve them may be too harsh, but we should at least warn about them.
>> Whereas if the address is section relative, we can always do the right
>> thing.
>
> I don't see what's the problem with resolving them.  If I set a breakpoint
> on a raw memory address, I expect it to be hit as soon as there's some code
> mapped and executed at that location.   Anything else would be surprising
> to me.

I think this is okay as a fallback position.  IMO, the way it should work is:

1. If the address maps uniquely to a section relative address when set, it was very very likely the

user's intent that the breakpoint track that section relative address, so we should convert
it to that in order to follow the user's likely intention.  Note this is clearly the case in the
example you cited as motivation for this patch.

2. If the address maps to more than one section, I think we should fail the breakpoint

and say "this maps to these shared libraries, please use the -s option to indicate the shared library."
It was pretty clear the user was trying to do something, but not capturing their intent in the breakpoint
setting they did, so this seems like the reasonable reaction.

3. If the address doesn't fall into any sections, then it would be fine to assign it to the first

section that shows up at that address, but I would convert it to section relative at that point.  Again,
on re-run even if libraries move it is not at all likely that the user's intent was to set the breakpoint on
whatever other library happened to show up at that address.  If you have a good use case for this, we
can add a --force or maybe you can say `--s NONE` to indicate you don't want this resolved to section
relative.

4. The only other way that this breakpoint would take is if we hit the gdb-JIT notification breakpoint and

found the address in some JIT code.  If this happens, it's probably okay to keep the breakpoint around non-section
relative.  You run the risk that some other library loads at this address, and we set the breakpoint in data-in-code
that we don't know about and that causes the program to behave badly in ways that are difficult to diagnose.
But that's fairly unlikely.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109738/new/

https://reviews.llvm.org/D109738