[PATCH] D52395: Thread safety analysis: Require exclusive lock for passing by non-const reference

Mon Sep 24 18:22:33 PDT 2018

aaronpuchert added a comment.

While most people probably just use ordinary mutexes, hence won't be affected, those that use read/write locks need to know when to use a shared and when to use an exclusive lock. What makes things hard in C++ is that through passing by non-const reference, an object can actually be modified although it looks like it's only passed as a parameter. That makes it really hard to decide whether a shared lock is sufficient just by reading the code, which was a major pain point in our project. We had to do very careful reviews to make sure we had the right kind of lock in every function. I'd like to offload some of that work to the compiler.

That's the main reason for this change: it's pretty easy to find out where a variable is used, but it's hard to say which uses are reads and which are writes in modern C++, if it weren't for const.

In https://reviews.llvm.org/D52395#1244021, @aaron.ballman wrote:

> > Unlike checking const qualifiers on member functions, there are probably not many false positives here: if a function takes a non-const reference, it will in almost all cases modify the object that we passed it.
>
> I'm not certain I agree with the predicate here. We can make that inference when there *is* a const qualifier on the parameter, but I don't I think we can make any assumptions about whether it will or won't modify the object passed in in the absence of a const qualifier. This has come up in the past for things like `C(C&)` being a valid copy constructor despite the parameter being non-const.

For const qualifiers on member functions, the concern is that I might have a shared lock on a data structure like a std::vector and then I read stuff from that vector. But unless I'm reading the vector from a const member function, or through a const reference, I'm going to use the non-const `operator[]`. That operator doesn't actually modify anything, but it returns a reference that allows me to modify stuff. However, I'm not doing any of that, so why should there be a warning. I think that's a valid concern.

Effectively we have two overloads that are both not actually modifying anything, but the non-const variant returns a reference that does allow us to modify something. My thinking was that this pattern is not so common with function parameters. The function(s) will return a reference to a part of an object, and such references will typically come from member functions for which we don't enforce this. So even if I use a non-modifying `<algorithm>` like `std::find`, there'll be no warning because the iterators come from container members `begin` and `end`. (By the way, that would be easy to fix by using `cbegin` and `cend`.)

The way I see it, there are basically three cases of false positives:

- The function doesn't actually modify the object, but takes it as non-const reference nevertheless. That can be easily fixed though and I think it's a fix that doesn't hurt.
- The function modifies the object sometimes, but here the user is very certain that it won't. That might be indication of a poor design, in any event it's very fragile. As a developer, I would feel very anxious about this, especially in a larger code base.
- We have two overloads of a function, both don't modify the object, but the non-const variant returns a non-const reference to some part of the object. This isn't backed up by data, but I think this doesn't happen nearly as often as with member functions. There are several fixes available as well: one could make sure that we only have a const reference there, or make the function a const member, or if all else fails, by taking a const reference to the object we don't want to modify and then call the function with that as argument. That's an additional line of code like `const auto &cref = obj;` which I think will rarely be needed.

So basically my argument is that this can catch really nasty bugs, and false positives should be rare. If they occur, they can be fixed by making the code more const-correct. Such fixes should be easy to sell, because everybody loves const-correctness, and it doesn't require any annotations.

> We might need some data to back this assertion up.

Makes sense to me, but I don't have access to the largest code base in terms of thread safety analysis. Maybe @delesley can help here? Or I need to apply for a job at Google. On our own code, we're still in the very early stages of annotating, so I can't provide reliable data from there. I've seen thread safety analysis in flutter <https://github.com/flutter/engine> and Chromium, but both don't seem to use read/write locks.

================
Comment at: lib/Analysis/ThreadSafety.cpp:2023
           QualType Qt = Pvd->getType();
-          if (Qt->isReferenceType())
-            checkAccess(Arg, AK_Read, POK_PassByRef);
+          if (const auto* RT = dyn_cast<const ReferenceType>(&*Qt)) {
+            if (RT->getPointeeType().isConstQualified())
----------------
aaron.ballman wrote:
> This isn't specific to your changes, but this gives me the impression we don't distinguish between rvalue references and lvalue references, but that may be something of interest in here.
You mean `&` vs `&&`? I don't that makes a difference here, the important property is whether it's `const`. So `const&` is a read, and `&` and `&&` are writes, and `const&&` doesn't really make sense. (It's effectively the same as `const&` for all I know.)

But there is one error here: I definitely have to look at the canonical type, i.e. it should be `&*Qt.getCanonicalType()` instead of `&*Qt`.

Repository:
  rC Clang

https://reviews.llvm.org/D52395