[lldb-dev] RFC: libtrace

Wed Jun 27 10:18:56 PDT 2018

> On Jun 26, 2018, at 5:14 PM, Zachary Turner <zturner at google.com> wrote:
> 
> Yes that’s what I’ve been thinking about as well.
> 
> One thing I’ve been giving a lot of thought to is whether to serialize the handling of trace events.  I want to balance the “this is a library and you should be able to get it to work for you no matter what your use case is” aspect with the “you really just don’t want to go there, we know what’s best for you” aspect.  Then there’s the fact that not all platforms behave the same, but we’d like a consistent set of expectations that makes it easy to use for everyone.
> 
> So I’m leaning towards having the library serialize all trace events, because it’s a nice common denominator that every platform can implement.
> 
> To be clear though, I don’t mean that if 2 processes are being traced simultaneously and A stops followed by B stopping, then the tool will necessarily block before handling B’s stop.  I just mean that A and B’s stop handlers will be invoked on a single thread (not the threads which are tracing A or B).
> 
> So A stops, posts its stop event on the blessed thread and waits.  Then B stops and does the same thing.  A’s handler runs, for whatever reason decides it will continue later, saves off the event somewhere, then processes B’s.  Later something happens, it decides to continue A, signals A’s thread which wakes up.
> 
> I think this kind of design eliminates a large class of race conditions without sacrificing any performance.  
> 
> LLDB doesn’t currently work like this, but it would be nice not to end up with another split similar to the dwarf split, so I’m curious if you can think of any fundamental assumptions of LLDB’s architecture that this would violate.  This way we’d at least know that it’s possible to use the api in lldb (assuming it does everything lldb needs obviously) 

What you describe is actually pretty much how the lldb driver works.  Every time the lower levels of the Process class (e.g. ProcessGDBRemote) notice something interesting happening to the process they are managing, they post an event to the Listener in charge of driving that process.  The process then goes on its way, either staying stopped or running again as appropriate (the event records whether a restart has occurred).  The upper levels only know about what happened to the process when they fetch an event off the event queue.  For a single process that serializes the reporting of process state.

As to multiple processes, you can decide whether you want to serialize all the process events using the same mechanism or not, depending on your use case.

In the lldb driver, there's one Listener that waits on all processes (the Debugger's listener).  These events all get effectively serialized in its event loop.  So if you were just straight using lldb classes you could trivially implement what you want to achieve.
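
To make the serialized model concrete, here is a minimal sketch of the A/B scenario from the quoted mail: each tracer thread posts its stop to one shared queue and blocks, and a single handler thread drains the queue and decides when each process continues.  This is not LLDB code.  StopEvent, Listener, TracerThread and RunScenario are hypothetical names made up for the illustration, and the std::promise arguments exist only to force a deterministic arrival order in the demo.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stop notification; none of these names are LLDB API.
struct StopEvent {
  explicit StopEvent(std::string name) : process(std::move(name)) {}

  // Tracer thread: block until the handler thread resumes this process.
  void WaitForContinue() {
    std::unique_lock<std::mutex> lock(mu);
    cv.wait(lock, [this] { return continued; });
  }
  // Handler thread: wake the tracer thread that posted this event.
  void Continue() {
    {
      std::lock_guard<std::mutex> lock(mu);
      continued = true;
    }
    cv.notify_one();
  }

  std::string process;
  std::mutex mu;
  std::condition_variable cv;
  bool continued = false;
};

// Hypothetical listener: a blocking queue drained by one handler thread.
class Listener {
public:
  void Post(StopEvent *event) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push_back(event);
    }
    cv_.notify_one();
  }
  StopEvent *WaitForEvent() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !queue_.empty(); });
    StopEvent *event = queue_.front();
    queue_.pop_front();
    return event;
  }

private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<StopEvent *> queue_;
};

// Stand-in for a thread tracing one process: report the stop, then wait.
void TracerThread(Listener &listener, StopEvent &event,
                  std::promise<void> &posted) {
  listener.Post(&event);
  posted.set_value(); // demo-only signal that the event is queued
  event.WaitForContinue();
}

// Runs the A-then-B scenario and returns what the handler thread did.
std::vector<std::string> RunScenario() {
  Listener listener;
  StopEvent a("A"), b("B");
  std::promise<void> a_posted, b_posted;
  std::vector<std::string> log; // touched only by the handler thread

  std::thread tracer_a(TracerThread, std::ref(listener), std::ref(a),
                       std::ref(a_posted));
  a_posted.get_future().wait(); // A's stop is queued before B starts
  std::thread tracer_b(TracerThread, std::ref(listener), std::ref(b),
                       std::ref(b_posted));
  b_posted.get_future().wait();

  // The calling thread plays the single handler thread.
  StopEvent *deferred = listener.WaitForEvent(); // A's stop, saved for later
  log.push_back(deferred->process + " stopped, deferred");
  StopEvent *next = listener.WaitForEvent(); // B's stop, handled right away
  assert(deferred != next);
  log.push_back(next->process + " stopped, continued");
  next->Continue();
  log.push_back(deferred->process + " continued"); // "later something happens"
  deferred->Continue();

  tracer_a.join();
  tracer_b.join();
  return log;
}
```

Calling RunScenario() from the thread that should own event handling returns the handler's decisions in order.  Because the log is only ever touched by that one thread it needs no locking, which is the kind of race the single-thread design is meant to rule out.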

That being said, I don't think you want to use lldb's process event system for your ptracer.  It has a lot of complexity which supports handling reactions to events (breakpoint commands and conditions) that have to operate in the same context as user commands even though they happen before the user has regained control, and which might or might not restart the process out from under you.  It also manages the task of concealing the vast majority of stops from the higher level clients - for instance to pretend that a single "source line step over" didn't actually require lots of stops and starts.  I don't think anything you have described requires handling either of these tasks.

But you could use the general event system to achieve the serialization of reporting w/o hooking into the lldb private/public state thread system.

Jim

> 
> Thoughts?
> 
> On Tue, Jun 26, 2018 at 1:09 PM Jim Ingham <jingham at apple.com> wrote:
> You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle.  I don't think you'd need much else?
> 
> Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger.  That seems more appropriate for a lightweight tool.
> 
> Jim
> 
> 
> > On Jun 26, 2018, at 12:59 PM, Jim Ingham via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > 
> > So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?
> > 
> > Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?
> > 
> > Jim
> > 
> > 
> >> On Jun 26, 2018, at 12:48 PM, Zachary Turner <zturner at google.com> wrote:
> >> 
> >> no expression parser or knowledge of any specific programming language.
> >> 
> >> Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope.  For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's dwarf parsing code.  Similarly, LLDB's type system could be built on top of it as well.  Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view, it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.
> >> 
> >> It will probably be useful someday to have an expression parser and language specific type system, but when that comes I don't think we'd want anything radically different than what LLDB already has.
> >> 
> >> On Tue, Jun 26, 2018 at 12:26 PM Jim Ingham <jingham at apple.com> wrote:
> >> Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical?  For instance, adding a TypeSystem and its DWARF parser for C family languages that uses a different underlying representation than Clang AST's to store the results would be a lot of work that wouldn't be terribly interesting to lldb.  I don't think that's what you meant, but wanted to be sure.
> >> 
> >> Jim
> >> 
> >>> On Jun 26, 2018, at 11:58 AM, Zachary Turner via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> >>> 
> >>> Hi all,
> >>> 
> >>> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a narrower use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use only as much of the library as is necessary for the given task is important for the scale at which we wish to use it.
> >>> 
> >>> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
> >>> 
> >>> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying to componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.
> >>> 
> >>> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
> >>> 
> >>> Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this?  Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
> >>> 
> >>> Thanks,
> >>> Zach
> >>> 
> >>> _______________________________________________
> >>> lldb-dev mailing list
> >>> lldb-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >> 
> > 
>