[Lldb-commits] [PATCH] D48393: Make DWARFParsing more thread-safe

Jim Ingham via lldb-commits lldb-commits at lists.llvm.org
Wed Jun 20 15:21:38 PDT 2018



> On Jun 20, 2018, at 3:14 PM, Frederic Riss via Phabricator <reviews at reviews.llvm.org> wrote:
> 
> friss added a comment.
> 
> In https://reviews.llvm.org/D48393#1138398, @zturner wrote:
> 
>> Long term I think one potentially interesting possibility for solving a lot of these threading and lazy evaluation issues is to make a task queue that runs all related operations on a single thread and returns a future you can wait on.  This way, the core data structures themselves do not need synchronization because they are only ever accessed from one thread.  I actually have a patch in progress <https://reviews.llvm.org/D48240> to make this kind of infrastructure, which I actually needed for a different reason, but it would work equally well here.
>> 
>> Mostly just putting this out there as an idea for down the road, not that you should do it right now.
> 

> 
> It's an interesting idea, which comes with a couple of tradeoffs:
> 
> - We would be essentially serializing the DWARF parsing. I do not know if it needs to be a goal that we're able to do efficient multi-threaded parsing, but it's an obvious consequence which would need to be discussed.

One situation in which lldb gets used not infrequently is a single lldb process owning a handful of concurrent Debuggers, each managing a separate debug session.  That's how Xcode uses lldb, and probably how many other IDEs that allow multiple simultaneous debug sessions use it as well.  Running all the debuggers in the same process lets them share parsed modules in the global module cache, which is a pretty good performance win, but most of the debug information will be in modules specific to each Debugger.  So there's probably not much contention amongst these Debuggers for access to their DWARF files at present.

So it is not the case that access to debug information will be mostly serial by happenstance, with only the occasional parallel request.  Serializing all these independent requests through a single queue might be a bigger deal.  OTOH, we could fix that by having a pool of task queues, each managing access to a separate Module in the global module cache.

> - I have worked on a codebase relying heavily on serialized thread queues and something equivalent to futures to schedule work. Let's say that it was... interesting to debug. If we did this, we wouldn't need to go to the same extremes, though. (Also, I'm wondering: would tasks running serially be allowed to enqueue other tasks and wait for them, or would this be forbidden?)

It is not uncommon to be parsing the DWARF for module A and find a type that is known only as a forward declaration.  In that case, lldb will look through the other Modules' debug info for a real definition, parse that, and import it into module A.  So you would need to suspend one task, start another, and wait on its completion.

Jim


> 
> I like the simplification aspect of it, though. It is basically equivalent to adding locks at a higher abstraction level than the data-structure level where I put them. I was pondering that too, but I wasn't sure (1) which level would be best and (2) how to be sure I was not going to deadlock the whole thing.
> 
> 
> https://reviews.llvm.org/D48393


