[Lldb-commits] [PATCH] D51830: Add a way to make scripted breakpoints

Wed Sep 12 17:17:47 PDT 2018

jingham added a comment.

I addressed most of the comments.  I remember we argued about how the filters should work w.r.t. the resolvers back when I first did this, but I still claim my way is right ;-)  I gave some arguments for that in response to your inline comment.

I'm going to save this (not sure what happens if you change the diff with in-flight comments) then I'll update the diff with the few little nits you've found and with documentation for SBTarget.CreateBreakpointFromScript.

================
Comment at: include/lldb/lldb-enumerations.h:267
     eSearchDepthAddress,
-    kNumSearchDepthKinds = eSearchDepthAddress
+    kLastSearchDepthKind = eSearchDepthAddress
 };
----------------
clayborg wrote:
> It wasn't public? Why isn't this entire thing new then? The diffs show just a few changes? If no one was able to use this enum before if it wasn't in any public arguments, then we can change as needed. But from the diff it seems this was here?
This was a pre-commit to this patch set (I was trying to make it a little smaller, though I didn't succeed much...):

------------------------------------------------------------------------
r341690 | jingham | 2018-09-07 11:43:04 -0700 (Fri, 07 Sep 2018) | 6 lines

NFC: Move Searcher::Depth into lldb-enumerations as SearchDepth.

In a subsequent commit, I will need to expose the search depth
to the SB API's, so I'm moving this define into lldb-enumerations
where it will get added to the lldb module.

------------------------------------------------------------------------

================
Comment at: packages/Python/lldbsuite/test/functionalities/breakpoint/scripted_bkpt/resolver.py:23
+          if sym.IsValid():
+              self.bkpt.AddLocation(sym.GetStartAddress())
+
----------------
clayborg wrote:
> Can locations have names at all? Do we want AddLocation to take an optional name as a "const char *"?
Breakpoint locations can't currently have names.  The problem with adding names to locations is that I don't currently have any code that tracks "the same" location from if the binary changes.  Instead, if the binary changes and you rerun, I discard all the locations from the old binary, and rerun the Resolver on the new binary.  At that point the name would go missing.  I think that would be disconcerting.

I actually want to put in some heuristics to figure out equivalent locations when the binary changes so that I can do a better job of preserving location specific options.  Right now, they all get lost if the binary changes.  Actually that's not 100% true, if you go from 1 location in a module to 1 location, I am pretty sure I consider this the same location and copy over the options.  But if it gets harder than that I just punt at present.

If I can make the locations more stable, then it would make sense to me to be able to name them.

================
Comment at: packages/Python/lldbsuite/test/functionalities/breakpoint/scripted_bkpt/resolver.py:33
+class ResolverCUDepth(Resolver):
+    def __get_depth__ (self):
+        return lldb.eSearchDepthCompUnit
----------------
clayborg wrote:
> clayborg wrote:
> > ditto
> If we set the depth to compile unit, do we get called with a symbol context for each compile unit? Is there a way to get called for each function? I would use this functionality for our PGO plug-in that sets a breakpoint on every concrete function. I currently use a regex bp and then weed out all inlines by disabling them. But it would be nice to avoid having extra inline locations in the first place. 
It should except for the part where I only implemented the searcher search down to the CU level 'cause that was all I needed at the time.  I'll fix that and use the breakpoint resolvers to test it, but I'd rather do that as a separate commit.

================
Comment at: scripts/interface/SBBreakpoint.i:232
+    AddLocation(SBAddress &address);
+
     bool
----------------
clayborg wrote:
> So the question becomes should we outlaw this for all breakpoints? Might be nice to be able to add extra locations to any kind of breakpoint to allow sharing a condition. Any reason to strictly limit this to only script breakpoints? Also, remember my previous comment about locations being able to have names?
Right now, the whole model of the breakpoints is that the Resolvers know everything about how to set the locations.  If we allow users to tack on locations to other Resolvers, there's no way for the Resolver to know how to recreate the breakpoint with the added location.  So for instance, I don't know how to copy or serialize and deserialize a breakpoint including those extra locations.  I'm also not sure what I would do if the binary changes?  Would it be okay to just ditch the added location?  This would introduce enough odd corner cases that I don't think the feature is worth it.  

OTOH, I think it would be really cool to have a way to provide the standard lldb breakpoint resolvers as Python classes that you could subclass.  Maybe the subclasses could even use another signature and do something like:

```
class MyResolver(lldb.FileAndLineResolver):
    def __init__(self, bkpt, extra_args):
        # Same as before
    def __callback__(self, sym_ctx, breakpoint_candidates):
        # breakpoint_candidates are the addresses the resolver WOULD have set...
```
where the internal resolver would run first, and present a list of the addresses it would have set for this sym_ctx, and you can nix some, or add others.  That would keep the model cleaner, since the Resolvers would remain the source of truth for the breakpoint, but allow you to modify the results of the standard resolvers.

================
Comment at: source/API/SBStructuredData.cpp:83

+bool SBStructuredData::GetKeys(lldb::SBStringList &keys) const {
+  if (!m_impl_up)
----------------
clayborg wrote:
> Another option would be to have a callback that can be supplied to  SBStructuredData::ForEach(...) function whose signature is:
> ```
> bool Callback(const char *key, lldb::SBStructuredData &object); 
> ```
> If use returns false to stop, true to continue iteration. When I have looked at using SBStructuredData in the past, I wanted this call.
It seems a little weird to me to have an API on StructuredData that only makes sense for Dictionaries.  This API at least says up front that it only handles Dictionaries (they are the only thing with Keys...)  If you are really doing a ForEach you'd want to traverse the other data types as well, running over arrays, digging into arrays of Dictionaries, etc.  

TTTT, however, this function doesn't get used anywhere.  I put it in in testing so I could make sure I was getting passed all the keys I was putting in.  It isn't terribly germane to this patch, so if we're going to make a fancier version I'd rather do that as a separate piece of work.

================
Comment at: source/Core/SearchFilter.cpp:751-759
+  SymbolContext sym_ctx;
+  address.CalculateSymbolContext(&sym_ctx, eSymbolContextEverything);
+  if (!sym_ctx.comp_unit) {
+    if (m_cu_spec_list.GetSize() != 0)
+      return false; // Has no comp_unit so can't pass the file check.
+  }
+  if (m_cu_spec_list.FindFileIndex(0, sym_ctx.comp_unit, false) == UINT32_MAX)
----------------
clayborg wrote:
> What if our python filter was disassembling code and finding calls to PLT entries and set breakpoints using addresses if found in disassembly? Not sure why we give people the ability to set breakpoints with any address anywhere in a module and then require the breakpoint to be in a compile unit?
The problem this is trying to solve is, what if your Resolver wants to get called back at the Module level to do  searches because that's convenient for some reason, but the user provided a CompileUnit specific search filter?  You wouldn't want to add locations that violate the user's filter.  

This happens for instance for symbol name breakpoints.  The command:

break set -n foo -f bar.c

means "set breakpoints on all the symbols 'foo' in bar.c".  But our symbol name resolver gets called back at the module depth.  That makes sense, since the symbols are stored in the module.  So once you find a symbol you have to run it past the overall SearchFilter to make sure it passes the file filter set by the user above.  

In lldb_private you can ask the breakpoint's search filter questions.  So for instance if you were doing expensive work function by function you could ask the filter at the beginning of each function if it passes, and then only do the work in that function range if it does.  

I didn't make a way to do this in the SB API's - since I don't think I've ever used this in the lldb builtin breakpoint resolvers.  But if this seems like a useful thing, it would be easy to add an:

bool SBBreakpoint::AddressPassesFilter(SBAddress &address)

API to allow this.

https://reviews.llvm.org/D51830