[lldb-dev] RFC: Moving debug info parsing out of process

Tue Feb 26 16:49:29 PST 2019

> On Feb 26, 2019, at 4:03 PM, Zachary Turner <zturner at google.com> wrote:
> 
> I would probably build the server by using mostly code from LLVM.  Since it would contain all of the low-level debug info parsing libraries, I would expect that all knowledge of debug info (at least, in the form that compilers emit it in) could eventually be removed from LLDB entirely.

That’s quite an ambitious goal.

I haven’t looked at the SymbolFile API. What do you expect the exchange currency between the server and LLDB to be? Serialized compiler ASTs? If that’s the case, it seems like you would need a strong rev-lock between the server and the client, which in turn adds quite some complexity to the rollout of new versions of the debugger.

> So, for example, all of the efforts to merge LLDB and LLVM's DWARF parsing libraries could happen by first implementing inside of LLVM whatever functionality is missing, and then using that from within the server.  And yes, I would expect lldb to spin up a server, just as it does with lldb-server today if you try to debug something.  It finds the lldb-server binary and runs it.
> 
> When I say "switching the default", what I mean is that if someday this hypothetical server supports everything that the current in-process parsing codepath supports, we could just delete that entire codepath and switch everything to the out of process server, even if that server were running on the same physical machine as the debugger client (which would be functionally equivalent to what we have today).

(I obviously knew what you meant by "switching the default", I was trying to ask about how… to which the answer is by spinning up a local server)

Do you envision LLDB being able to talk to more than one server at the same time? It seems like this could be useful to debug a local build while still having access to debug symbols for your dependencies that have their symbols in a central repository.
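
To make the multi-server question concrete, here is a rough sketch of what I have in mind: ask a local server first and fall back to a central one. It is purely illustrative. `SymbolServer`, `make_server` and `find_type` are hypothetical names made up for this example, not existing LLDB or LLVM APIs, and the in-memory dictionaries only stand in for real servers.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: a "server" is a function from a type name to a
# description string, or None when that server has no such symbol.
SymbolServer = Callable[[str], Optional[str]]


def make_server(symbols: dict) -> SymbolServer:
    """Build an in-memory stand-in for a symbol server."""
    return symbols.get


def find_type(name: str, servers: Sequence[SymbolServer]) -> Optional[str]:
    """Ask each server in priority order and return the first answer."""
    for server in servers:
        result = server(name)
        if result is not None:
            return result
    return None


# Local build symbols take priority over the central repository.
local = make_server({"MyApp::Widget": "struct Widget { int id; }"})
central = make_server({"dep::Buffer": "class Buffer { char *data; }"})
servers = [local, central]
```

With this ordering, `find_type("dep::Buffer", servers)` misses on the local server and is answered by the central one, while a name known to neither yields `None`.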

Fred

> 
> On Tue, Feb 26, 2019 at 3:46 PM Frédéric Riss <friss at apple.com> wrote:
> 
>> On Feb 25, 2019, at 10:21 AM, Zachary Turner via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>> 
>> Hi all,
>> 
>> We've got some internal efforts in progress, and one of those would benefit from debug info parsing being out of process (independently of whether or not the rest of LLDB is out of process).
>> 
>> There are a few advantages to this, which I'll enumerate here:
>> 1. It improves one source of instability in LLDB which has been known to be problematic -- specifically, that debug info can be bad and handling this can often be difficult and bring down the entire debug session.  While other efforts have been made to address stability by moving things out of process, they have not been upstreamed, and even if they had I think we would still want this anyway, for reasons that follow.
>> 2. It becomes theoretically possible to move debug info parsing not just to another process, but to another machine entirely.  In a broader sense, this decouples the physical debug info location (and for that matter, representation) from the debugger host.
>> 3. It becomes testable as an independent component, because you can just send requests to it and dump the results and see if they make sense.  Currently there is almost zero test coverage of this aspect of LLDB apart from what you can get after going through many levels of indirection via spinning up a full debug session and doing things that indirectly result in symbol queries.
>> The big win here, at least from my point of view, is the second one.  Traditional symbol servers operate by copying entire symbol files (DSYM, DWP, PDB) from some machine to the debugger host.  These can be very large -- we've seen 12+ GB in some cases -- which ranges from "slow bandwidth hog" to "complete non-starter" depending on the debugger host and network.  In this kind of scenario, one could theoretically run the debug info process on the same NAS, cloud, or whatever as the symbol server.  Then, rather than copying over an entire symbol file, it responds only to the query you issued -- if you asked for a type, it just returns a packet describing the type you requested.
>> 
>> The API itself would be stateless (so that you could make queries for multiple targets in any order) as well as asynchronous (so that responses might arrive out of order).  Blocking could be implemented in LLDB, but having the server be asynchronous means multiple clients could connect to the same server instance.  This raises interesting possibilities.  For example, one can imagine thousands of developers connecting to an internal symbol server on the network and being able to debug remote processes or core dumps over slow network connections or on machines with very little storage (e.g. chromebooks).
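
As a minimal sketch of the stateless, asynchronous exchange described above (an illustration under assumptions, not the proposed protocol): each request carries its own target and an id, and the client matches responses to requests by id because they may arrive out of order. The message fields (`id`, `target`, `symbol`, `delay`) and the functions `handle_query` and `run_queries` are hypothetical, and `asyncio.sleep` only simulates lookup cost.

```python
import asyncio


async def handle_query(request: dict) -> dict:
    """Hypothetical server handler. Every request names its own target,
    so the server keeps no per-client session state."""
    await asyncio.sleep(request["delay"])  # simulate variable lookup cost
    return {
        "id": request["id"],
        "result": f"{request['target']}:{request['symbol']}",
    }


async def run_queries(requests: list) -> tuple:
    """Issue all requests at once and match responses to requests by id."""
    tasks = [asyncio.create_task(handle_query(r)) for r in requests]
    arrival_order = []
    by_id = {}
    # as_completed yields results in completion order, not request order.
    for finished in asyncio.as_completed(tasks):
        response = await finished
        arrival_order.append(response["id"])
        by_id[response["id"]] = response["result"]
    return arrival_order, by_id


# Two queries against different targets; the first is slower to answer.
requests = [
    {"id": 1, "target": "a.out", "symbol": "Foo", "delay": 0.05},
    {"id": 2, "target": "core.1234", "symbol": "Bar", "delay": 0.0},
]
arrival_order, by_id = asyncio.run(run_queries(requests))
```

Here the response to request 2 comes back before the response to request 1, and the id is what lets a blocking wrapper on the client side hand each answer to the right caller.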
>> 
>> 
>> On the LLDB side, all of this is hidden behind the SymbolFile interface, so most of LLDB doesn't have to change at all.   While this is in development, we could have SymbolFileRemote and keep the existing local codepath the default, until such time that it's robust and complete enough that we can switch the default.
>> 
>> Thoughts?
> 
> Interesting idea.
> 
> Would you build the server using the pieces we have in the current SymbolFile implementations? What do you mean by “switching the default”? Do you expect LLDB to spin up a server if there’s none configured in the environment?
> 
> Fred
