[PATCH] D157459: Make DWARFContext more thread safe.

Thu Aug 17 14:57:59 PDT 2023

bulbazord added a comment.

In D157459#4596356 <https://reviews.llvm.org/D157459#4596356>, @clayborg wrote:

> I know this patch is far from adding complete thread safety to this class, but it does address some fairly simple issues that can cause things to crash. This patch protects any function that has the format of:
>
>   if (MemberVar)
>     return MemberVar;
>   // Initialize MemberVar and then do expensive work to populate MemberVar 
>   return MemberVar;
>
> The expensive work to populate can take some time and protecting this with a mutex seems like the right thing to do. Any other threads can easily use the member variable that gets returned in an incomplete state.
>
> Are there any concrete steps I can take to help make this patch work for everyone? I have heard concerns but no real steps or tips on what it would take for people to move this patch along.

I can't speak for everyone, but here are a few concrete ways I think this change can be improved:

- This change is probably a regression performance-wise. You have a recursive mutex locking access to all member variables of a given DWARFContext on every access. One thread may try to access the already-calculated `TUIndex` while another is trying to do expensive work to calculate `Abbrev`. While this technically will give you correct results, I think we can do better in terms of performance. I'm not 100% sure which members depend on which, but it would be excellent if we could figure that out and guard every member behind its own mutex so we're not stalling on unrelated requests. The only downside I can think of is if we have mutually-dependent requests where thread A locks mutex 1 and then 2 while thread B locks mutex 2 and then 1 (deadlock). Is this possible to do?
- It looks like the "blessed" way of accessing these members is through getter methods. Future contributors may not be aware of this restriction (partially because it's not written down anywhere, partially because it's not obvious by reading the code). Even if you do document it, it's unenforceable except through code review. To address this, I think we can add an abstraction where all the member variables are owned by some other class/struct (e.g. `DWARFContextImpl`) where all the members are owned by that `Impl` object and are marked private. Access to those variables would be strictly through getter methods. DWARFContext would then contain an instance of this `DWARFContextImpl` class so nobody can use the member variables in an unsynchronized fashion. To be more explicit, anytime a method in `DWARFContext` wants to get the `TUIndex`, it would have to call something like `Impl.getTUIndex()`. It's possible people can work around this and still do bad things, but they would have to consciously think about breaking the abstraction to do that.
- Some kind of documentation related to the expectations of `DWARFContext` in a multithreaded environment go a long way. Even something simple as "This class is considered thread-safe" so that folks know that they can call `DWARFContext::getTUIndex` without worrying about this exact scenario.

What do y'all think @JDevlieghere @dblaikie @aprantl?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157459/new/

https://reviews.llvm.org/D157459