[lldb-dev] RFC: Break/Watchpoint refactor

Tue Sep 27 14:53:09 PDT 2016

> On Sep 27, 2016, at 2:23 PM, Zachary Turner via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> 
> 
> 
> On Tue, Sep 27, 2016 at 1:09 PM Daniel Austin Noland via lldb-dev <lldb-dev at lists.llvm.org <mailto:lldb-dev at lists.llvm.org>> wrote:
> 4. All of these classes are "old school" (not necessarily bad, but
> potentially a problem).  For example:
>    a. lots of const char* running around.
> We should use llvm::StringRef here.
>  
>    b. DISALLOW_COPY_AND_ASSIGN(BreakpointEventData) to make ctors and
> such private rather than using ctor() = delete (which provides better
> compiler errors)
>    c. use of Error& args in function signatures as opposed to
> Expected<ReturnType>.
> LLDB already has its own class called Error, and it makes it confusing if we're going to be using llvm::Error as well.  In an earlier thread I proposed changing lldb::Error to LLDBError for exactly this reason.  I would suggest doing this first.

I don’t think we want different parts of lldb handling errors differently.  That would be even more confusing.

>  
>    d. callback implementation uses function pointers (an ever confusing
> topic, especially for new programmers) where I think we could use
> templated methods (or just a parent Callback class) to significant effect.
> We should consider using llvm::function_ref<> here.
> 
> More details on each of these below.
>  
> 
> Phase 2: Restructure / Modernize the Private API / Implementation
> -----------------------------------------------------------------
> 
> * Change Error& out parameters to Expected<ReturnType>.
> As mentioned, we should rename lldb::Error first so as to avoid naming conflicts with llvm.  Granted, they are each in their own namespace, but still, it will lead to confusion, and prevents the use of "using namespace llvm;" as it currently stands.
>  
> * Get rid of all the const char* vars / args in favor of a better string
> type (which?)
> llvm::StringRef.  Anyone returning a const char* is saying they don't own the backing memory.  That's exactly what a StringRef is for, but it has many other benefits such as very efficient searching, substring, and parsing methods.  
>  
> * Prefer explicitly deleted copy ctor / assignments over multiline macro
> DISALLOW_COPY_AND_ASSIGN
> * Try to reduce the (perhaps excessive) use of friendship between the
> support classes.  For instance, is it necessary to give
> BreakpointLocation access to private data members from
> BreakpointLocationList, Process, BreakpointSite, and
> StopInfoBreakpoint?  Wouldn't it be better to expand the functionality
> of those other classes?
> 
> A more significant change could be a rewrite of the callback functionality.
> 
> There are some problems with the way callbacks are implemented in terms
> of maintainability and extensibility.  I think that using templates and
> simple inheritance we could shift to using callback objects instead of
> function pointers.  I have a crude code sketch of this plan in the works
> now, and I will submit that if there is interest in this idea.
> Essentially, the idea is to let the user define their own Breakpoint or
> Watchpoint (or whatever) child class which overrides a pure virtual run
> method from a parent StoppointCallback templated class.  The template
> parameter of StoppointCallback would allow us to define a
> std::unique_ptr<UserData> member where the user can hang any data they
> want for the callback.  That saves us from void pointers (which I find
> very awkward).
> I'm not a huge fan of this.  Inheriting from Breakpoint or Watchpoint sounds old school to me.  Implementation inheritance in general should be avoided wherever possible IMO.  Also, you've mentioned that StoppointCallback would only have a template argument, but it's possible that different types of callback take completely incompatible types and numbers of arguments.  This is something that can't be described by a single function with a template parameter.
> 
> I think the canonical pattern for what you want here is "double dispatch".  You've got something like this:
> 
> struct CallbackArgs {
>   virtual void dispatch(StoppointBase &point) = 0;
> };
> struct WatchpointArgs : public CallbackArgs {
> private:
>   int Old;
>   int New;
> public:
>   void dispatch(StoppointBase &point) override {
>     static_cast<Watchpoint&>(point).Callback(Old, New);
>   };
> };
> 
> struct BreakpointArgs : public CallbackArgs {
> private:
>   int id;
> public:
>   void dispatch(StoppointBase &point) override {
>     static_cast<Breakpoint&>(point).Callback(id);
>   }
> };
> 
> It's a little bit awkward though, so you can perhaps simplify this by making use of llvm's casting infrastructure.  Check out llvm/include/Casting.h.  But basically you could move all this dispatch logic into a single function that looks like this:
> 
> if (auto wp = llvm::dyn_cast<Watchpoint*>(stop_point)) {
>   wp->Callback( ... );
> } else if (auto bp = llvm::dyn_cast<Breakpoint*>(stop_point)) {
>   bp->Callback( ... );
> }
> 
> And these callback functions could have completely different signatures.
> 
>  

I’m not sure this is solving a problem that we have.  The callbacks get arguments which describe the environment in which the stop occurred (the execution context) and something identifying the stop point (ID & loc ID).  The data specific to the breakpoint/watchpoint should come from the breakpoint/watchpoint.  I don’t see any convincing reason these need to have this flexibility.

> 
> template <class UserData>
> class StoppointCallback {
> private:
>     std::unique_ptr<UserData> m_data;
> // ...
> };
> Yes, this kind of pattern is good.  Then just pass StoppointUserData& into the callback and let the caller cast it to the appropriate derived type.
>  

Again, on the lldb_private side is fine.  But we still need to pass callbacks and data across the SB API’s in a non-fragile way.

> 
> * GDB style tracepoints.  This may be very difficult but it seems very
> desirable.
> 
>  I don't see why this would be hard.  but I also don't think it's sufficiently different from a breakpoint that it needs to be *that* special.  A tracepoint is just a breakpoint that does some stuff and then automatically continues.  We can already stop at breakpoints, run commands, and resume commands.  Why can't we just do all of those in a single scripted operation?  I can envision a command like:
> 
> break set -n foo --exec "print p" --exec "break set -n bar" --continue
> 
> which sets a breakpoint at foo, and when it gets hit prints the value of p and sets another breakpoint in bar, then resumes.

That’s not what the tracepoints Daniel is talking about do.  A tracepoint is an instruction to the server to stop, run an experiment (expressed in a little byte-coded language) without returning to the debugger, and store the result away.  Then when the target stops at a “real" breakpoint, the debugger downloads the results of all the tracepoint experiments and presents this past history to the user in one way or the other.

It isn’t rocket science, but it does require server side support to be useful.

Jim

> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20160927/8790ca95/attachment.html>