[PATCH] D157459: Make DWARFContext more thread safe.

Thu Aug 17 15:12:03 PDT 2023

clayborg added a comment.

In D157459#4596842 <https://reviews.llvm.org/D157459#4596842>, @bulbazord wrote:

> In D157459#4596356 <https://reviews.llvm.org/D157459#4596356>, @clayborg wrote:
>
>> Are there any concrete steps I can take to help make this patch work for everyone? I have heard concerns but no real steps or tips on what it would take for people to move this patch along.
>
> I can't speak for everyone, but here are a few concrete ways I think this change can be improved:
>
> - This change is probably a regression performance-wise. You have a recursive mutex locking access to all member variables of a given DWARFContext on every access. One thread may try to access the already-calculated `TUIndex` while another is trying to do expensive work to calculate `Abbrev`. While this technically will give you correct results, I think we can do better in terms of performance. I'm not 100% sure which members depend on which, but it would be excellent if we could figure that out and guard every member behind its own mutex so we're not stalling on unrelated requests. The only downside I can think of is if we have mutually-dependent requests where thread A locks mutex 1 and then 2 while thread B locks mutex 2 and then 1 (deadlock). Is this possible to do?

I worry about thread deadlocking due to the A/B locking as you stated in these kinds of requests. We went this road early in LLDB's development but as the code evolved and people added more locks we ended up deadlocking. We used to have a lock on the lldb_private::Module and one on lldb_private::ObjectFile and one in lldb_private::SymbolFile and ran into deadlocks as people added new code. We eventually made all of these classes, that are related, use a single mutex and things were much better.

The main reason for the recursive nature is if any of the currently protected methods call another you can and will deadlock your program and I think some of the current calls do call others IIRC.

I

> - It looks like the "blessed" way of accessing these members is through getter methods. Future contributors may not be aware of this restriction (partially because it's not written down anywhere, partially because it's not obvious by reading the code). Even if you do document it, it's unenforceable except through code review. To address this, I think we can add an abstraction where all the member variables are owned by some other class/struct (e.g. `DWARFContextImpl`) where all the members are owned by that `Impl` object and are marked private. Access to those variables would be strictly through getter methods. DWARFContext would then contain an instance of this `DWARFContextImpl` class so nobody can use the member variables in an unsynchronized fashion. To be more explicit, anytime a method in `DWARFContext` wants to get the `TUIndex`, it would have to call something like `Impl.getTUIndex()`. It's possible people can work around this and still do bad things, but they would have to consciously think about breaking the abstraction to do that.

That sounds very complex and would make the class even harder to work on and maintain?

> - Some kind of documentation related to the expectations of `DWARFContext` in a multithreaded environment go a long way. Even something simple as "This class is considered thread-safe" so that folks know that they can call `DWARFContext::getTUIndex` without worrying about this exact scenario.

Seems like this is pretty clear from the places the mutex is used, but I could add documentation if needed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157459/new/

https://reviews.llvm.org/D157459