[PATCH] D55289: [analyzer] MoveChecker Pt.5: Improve invalidation policies.

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 5 14:02:26 PST 2018


NoQ marked an inline comment as done.
NoQ added inline comments.


================
Comment at: lib/StaticAnalyzer/Checkers/MoveChecker.cpp:540-542
+    // Explicit regions are the regions passed into the call directly, but
+    // not all of them end up being invalidated. The ones that do appear in
+    // the Regions array as well.
----------------
NoQ wrote:
> Szelethus wrote:
> > Really? Then I guess this needs to be updated in `CheckerDocumentation.cpp`:
> > 
> > ```
> >   /// \param ExplicitRegions The regions explicitly requested for invalidation.
> >   ///        For a function call, this would be the arguments. For a bind, this
> >   ///        would be the region being bound to.
> > ```
> > 
> > To me, this clearly indicates that the elements of `ExplicitRegions` will be invalidated. Does "requested for" really just mean "requested for potential"? Since this happens //before// any invalidation, it could easily be interpreted as "invalidated after the end of the callback".
> The callback fires after invalidation - cf. `ProgramState::invalidateRegionsImpl`. Note that the `if (Eng)` condition is always true, because how the heck were we supposed to run path-sensitive analysis without `ExprEngine`.
> 
> The terminology is really weird here because we the word "invalidation" has too many meanings. Essentially, `ExplicitRegions` are regions that were specifically requested for invalidation, but it is up to the `CallEvent` / `ProgramState` / `StoreManager` (depending on which `invalidateRegions()` was called) to decide what invalidation means in this case by filling in the `RegionAndSymbolInvalidationTraits` map.
> 
> For regions that represent values of const pointers/references directly passed into the function, `CallEvent` decides to set the `TK_PreserveContents` trait, which says "invalidate the region but keep its contents intact". It would still perform other forms of invalidation on that region, say, pointer escape: if there are pointer values written down somewhere within that region, checkers need to stop tracking them.
> 
> Now, the callback never said that it has something to do with "invalidation". Instead, it is about "region changes", which means changes of the region's contents in the Store. This doesn't necessarily happen due to invalidation; it may also happen while evaluating a simple assignment in the program, as we see in the newly added `testUpdateField()` test. And, as we've seen above, not every "invalidation" changes contents of the region.
> 
> Similarly, `TK_SuppressEscape` would suppress the `checkPointerEscape()` callback for the region, but will not suppress `checkRegionChanges()` for that (non-explicit) region.
> 
> Therefore we end up in a weird situation when some regions were requested to be invalidated but never were actually invalidated in terms of the Store. It kinda  makes sense, but i hate it anyway.  The `checkRegionChanges()` callback is hard to understand and hard to use and too general and has too much stuff stuffed into it. `checkPointerEscape` is an easier-to-use fork of it, but it doesn't cover the use case of checkers that track objects via regions. I suspect that the right solution here is to start tracking objects as "symbols", i.e., assign a unique opaque ID to every state through which every object in the program goes (regardless of which object it is) and use that as keys in the map. That should remove the stress of dealing with invalidation from C++ checkers that need to track objects opaquely. The problem won't magically disappear, as we still need to identify when exactly does the state change, but an additional level of indirection (Region -> ID -> CheckerState instead of just Region -> CheckerState) allows us to decompose it into smaller parts and de-duplicate some of this work.
> 
> But regardless of that, it is still weird that `ExplicitRegions` is not a sub-set of `Regions`. We need to either document it or fix it, and for some reason i prefer the latter. In particular, the only checker that actually actively acts upon `ExplicitRegions` when they're const is `RetainCountChecker`, but in fact people don't ever use `const` in stuff that it checks, it just isn't idiomatic.
Another funny thing about `RegionAndSymbolInvalidationTraits`  is that it races when the same region is passed both via const pointer and a non-const pointer. The region will be invalidated or not depending on which one goes in first. We really need to fix it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55289/new/

https://reviews.llvm.org/D55289





More information about the cfe-commits mailing list