[PATCH] D46368: [analyzer] pr18953: Zeroing constructors, store binding extents, ASTRecordLayouts.

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 2 15:39:40 PDT 2018


NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a.sidorin, george.karpenkov, szepet, rnkovacs.
Herald added subscribers: cfe-commits, baloghadamsoftware.

As the old assertion says,

> "Thou shalt not initialize the same memory twice."

Namely, the `State->bindDefault()`/`StoreManager.BindDefault()` API has the purpose of overriding the initial value with which memory is initialized. For instance, a checker can express that loading from freshly malloc()ed memory would yield Undefined, while loading from freshly calloc()ed memory would yield zero, even though loading from a normal symbolic region would yield a region-value symbol. `BindDefault` never was meant for binding arbitrary default values to arbitrary locations (though it was never doxygen'ed anywhere). For that reason it has the aforementioned assertion.

The assertion is actually a bit more relaxed than that; it simply says that the new default binding should never overwrite an existing binding key, which means that not only the base region but also the offset should coincide in order to trigger the assertion.

Other sorts of default store bindings are created by different APIs. Namely, there's also the invalidation case, which consists in default-binding a conjured symbol, and it is performed by `State->invalidateRegions()`/`StoreManager.invalidateRegions()` which never calls `BindDefault` but instead manipulates bindings directly.

Now, when it comes to C++ zeroing constructors (i.e. non-user-supplied constructors that boil down to zero initialization of the object), a lot of weird things can happen, especially when such constructors are applied to fields or base objects, especially with multiple inheritance, especially with default inheritance from empty objects (objects with no fields, eg. a-la java interfaces). The same base object would undergo multiple initializations for every zeroing constructor of a field or a base. And due to empty fields or bases, it may happen that such bindings will have the exact same offset. It may also happen that the new default binding would conflict an old direct binding, as i demonstrated on one of the newly added tests.

It seems that C++ zero-initialization is quite different from normal `BindDefault`'s idea of setting the initial/default value. Therefore i propose to split up the `BindDefault` API into two parts: `BindDefaultInitial()` for setting initial/default values and `BindDefaultZero()` for performing C++-like zero initialization specifically. This would leave us with 3 different APIs for manipulating default bindings, but i guess it's not that bad because these different APIs are quite good at expressing the coder's intent while hiding internals of the `RegionStore`, and their implementations are significantly different.

----

Now, recall that there was a previous attempt to relax that assertion in https://reviews.llvm.org/rC190530, for the same reason. The separation of APIs proposed here would allow us to restore the historical assertion in its original strength, while accepting that the assertion doesn't cover C++ zero-initialization. I'm not seeing a good way to cover the zero-initialization case with that, and i'm not convinced that it's necessary.

The interesting part here is, however, the special-case behavior that was chosen on the path on which the assertion did not hold. It was pointed out by Jordan that if a zeroing constructor follows a conservatively evaluated empty-base constructor, it is more correct to preserve the default conjured symbol produced by the empty-base constructor than to replace it with a zero default binding that would completely overwrite the conjured symbol, since //we don't model store binding extents//. This is more correct because zero-initializer should only cover its respective sub-object, but other parts of the big object must remain invalidated.

In the newly added tests i point out that keeping the original conjured symbol in that case would also not be fully safe. The actually-safe behavior would be to invalidate the object again, because it's no longer the same unknown value that we previously had. So i'm adding a bunch of tests in which we take various field values in different weird moments of time and see if we aren't accidentally assuming that they're the same known or unknown values as the ones taken in different moments of time.

These tests now also demonstrate another bug that would also make our store bindings incorrectly, namely by looking at `ASTRecordLayout` it is not always apparent what part of the object is going to be overwritten by making a binding to a specific sub-region. See the new architecture-specific tests in which field `z` misbehaves.

Long story short, `RegionStore` is modeling zero initializations imperfectly for at least two reasons.

We could work around that by invalidating object contents every time we detect something fishy. The only reasonable detection i can come up with would be to detect if we have a coinciding binding (i.e. what the assertion was previously protecting us from), eg.:

  if (B.getDefaultBinding(R) || B.getDirectBinding(R))
    V = UnknownVal();

This detection is imperfect - there are tests that would still fail. In particular it won't fix the `ASTRecordLayout` problem. And i also suspect that it really destroys a lot more good data than bad data. So i'll be experimenting with that now. In the current version of the patch i remove the special case handling and simply bind the default value blindly even if i know that `RegionStore` can't handle it precisely.


Repository:
  rC Clang

https://reviews.llvm.org/D46368

Files:
  include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
  include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
  lib/StaticAnalyzer/Checkers/MallocChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
  lib/StaticAnalyzer/Core/ProgramState.cpp
  lib/StaticAnalyzer/Core/RegionStore.cpp
  lib/StaticAnalyzer/Core/Store.cpp
  test/Analysis/ctor.mm

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D46368.144937.patch
Type: text/x-patch
Size: 15655 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20180502/0f032fa2/attachment-0001.bin>


More information about the cfe-commits mailing list