[PATCH] D48427: [Analyzer] Fix for D47417 to make the tests pass

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 27 11:38:03 PDT 2018


NoQ added a comment.

Aha, ok, yeah, we should have seen this coming. Whenever a checker tracks pairs of objects, like containers and iterators, or strings objects and their internal buffers (https://reviews.llvm.org/D48522), or return values and out-parameters of the same function call (https://reviews.llvm.org/D32449), we should expect races on which of the two dies first. Such races are inevitable because any of the two may be arbitrarily stored for an indefinite amount of time. The test that you see failing is one of the tricky cases to understand because it's about liveness of a parameter region, which is a bit counter-intuitive.

I believe that //the checker should deal with symbol/region lifetime races, not try to prevent them//. The checker should only extend the lifetime of items it tracks in `checkLiveSymbols()` when the checker has additional information about how the programmer can access the tracked object despite not having a direct reference to it in the imperfect symbolic memory. Otherwise we're sacrificing our fairly precise liveness tracking, which serves two important purposes:

1. finding leaks (eg., if the container is allocated on the heap, we may get a false negative leak when an iterator suddenly starts keeping it alive) and

2. keeping states small for faster lookups (even though state cleanups is a top performance bottleneck that occupies 30-50% of the analyzer's performance, removing cleanups only makes things worse because state lookups become slower).

When solving liveness races, the key understanding is that when the object becomes dead, all information we gathered during analysis about the object is final. Such information will never become more precise, and it will never mutate. For instance, because the object cannot be accessed anymore //during the current analysis//, its begin() and end() locations will never change. The "during the current analysis" part is important: for example, in your test `simple_bad_end()`, the vector is marked as dead even though the previous stack frame definitely still has access to it, because the analysis is only conducted within the current stack frame and will never leave it. If we started our analysis from the caller, the vector would not have been marked as dead yet.

Therefore i believe that when the container goes out of scope, the iterator should be "disconnected" from the container, and any information we've had about the container should now be associated with the iterator itself. In your case it probably means connecting the iterator directly to the begin/end symbols. That puts a bit of stress into your program state trait structure because there are now two sorts of iterator info structures to handle (one that directs all queries to the container, another that has all the info within it directly). But that's the usual cost of handling a liveness race - cf. "Schrödinger mutex states" in https://reviews.llvm.org/D32449, which are only as different from your case as the nature of the objects you're tracking is different.

I'm slightly worried that it might potentially be possible to mutate the container significantly by only using an interator, without having access to the object itself. It doesn't seem possible - at most you'll be able to mutate the element to which the iterator points, but you can't eg. change the size of the container or invalidate other iterators. If you can in fact mutate the container significantly through an iterator, then we'll need to re-think this, and enforcing liveness may be the right thing to do (though we may also get away with simply keeping a dead region in the program state solely for convenience).


https://reviews.llvm.org/D48427





More information about the cfe-commits mailing list