[clang] c877294 - [analyzer] Wrap SymbolicRegions by ElementRegions before getting a FieldRegion (#85211)

via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 21 10:22:25 PDT 2024


Author: Balazs Benics
Date: 2024-03-21T18:22:22+01:00
New Revision: c8772940ee4d85b1a4578b3faea7c825300ce59f

URL: https://github.com/llvm/llvm-project/commit/c8772940ee4d85b1a4578b3faea7c825300ce59f
DIFF: https://github.com/llvm/llvm-project/commit/c8772940ee4d85b1a4578b3faea7c825300ce59f.diff

LOG: [analyzer] Wrap SymbolicRegions by ElementRegions before getting a FieldRegion (#85211)

Inside the ExprEngine when we process the initializers, we create a
PostInitializer program-point, which will refer to the field being
initialized, see `FieldLoc` inside `ExprEngine::ProcessInitializer`.

When a constructor (of which we evaluate the initializer-list) is
analyzed in top-level context, then the `this` pointer will be
represented by a `SymbolicRegion`, (as it should be).

This means that we will form a `FieldRegion{SymbolicRegion{.}}` as the
initialized region.

```c++
class Bear {
public:
  void brum() const;
};
class Door {
public:
  // PostInitializer would refer to "FieldRegion{SymRegion{this}}"
  // whereas in the store and everywhere else it would be:
  // "FieldRegion{ELementRegion{SymRegion{Ty*, this}, 0, Ty}".
  Door() : ptr(nullptr) {
    ptr->brum(); // Bug
  }
private:
  Bear* ptr;
};
```

We (as CSA folks) decided to avoid the creation of FieldRegions directly
of symbolic regions in the past:

https://github.com/llvm/llvm-project/commit/f8643a9b31c4029942f67d4534c9139b45173504

---

In this patch, I propose to also canonicalize it as in the mentioned
patch, into this: `FieldRegion{ElementRegion{SymbolicRegion{Ty*, .}, 0,
Ty}`

This would mean that FieldRegions will/should never simply wrap a
SymbolicRegion directly, but rather an ElementRegion that is sitting in
between.

This patch should have practically no observable effects, as the store
(due to the mentioned patch) was made resilient to this issue, but we
use `PostInitializer::getLocationValue()` for an alternative reporting,
where we faced this issue.

Note that in really rare cases it suppresses now dereference bugs, as
demonstrated in the test. It is because in the past we failed to follow
the region of the PostInitializer inside the StoreSiteFinder visitor -
because it was using this code:
```c++
// If this is a post initializer expression, initializing the region, we
// should track the initializer expression.
if (std::optional<PostInitializer> PIP =
        Pred->getLocationAs<PostInitializer>()) {
  const MemRegion *FieldReg = (const MemRegion *)PIP->getLocationValue();
  if (FieldReg == R) {
    StoreSite = Pred;
    InitE = PIP->getInitializer()->getInit();
  }
}
```
Notice that the equality check didn't pass for the regions I'm
canonicalizing in this patch.

Given the nature of this change, we would rather upstream this patch.

CPP-4954

Added: 
    

Modified: 
    clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
    clang/lib/StaticAnalyzer/Core/ProgramState.cpp
    clang/test/Analysis/inlining/false-positive-suppression.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
index ca75c2a756a4a5..51d76dc257ee94 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
@@ -494,6 +494,8 @@ class ProgramState : public llvm::FoldingSetNode {
                         InvalidatedSymbols *IS,
                         RegionAndSymbolInvalidationTraits *HTraits,
                         const CallEvent *Call) const;
+
+  SVal wrapSymbolicRegion(SVal Base) const;
 };
 
 //===----------------------------------------------------------------------===//
@@ -782,20 +784,6 @@ inline SVal ProgramState::getLValue(const ObjCIvarDecl *D, SVal Base) const {
   return getStateManager().StoreMgr->getLValueIvar(D, Base);
 }
 
-inline SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
-  return getStateManager().StoreMgr->getLValueField(D, Base);
-}
-
-inline SVal ProgramState::getLValue(const IndirectFieldDecl *D,
-                                    SVal Base) const {
-  StoreManager &SM = *getStateManager().StoreMgr;
-  for (const auto *I : D->chain()) {
-    Base = SM.getLValueField(cast<FieldDecl>(I), Base);
-  }
-
-  return Base;
-}
-
 inline SVal ProgramState::getLValue(QualType ElementType, SVal Idx, SVal Base) const{
   if (std::optional<NonLoc> N = Idx.getAs<NonLoc>())
     return getStateManager().StoreMgr->getLValueElement(ElementType, *N, Base);

diff  --git a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
index f12f1a5ac970dd..f82cd944750a3c 100644
--- a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -226,6 +226,20 @@ ProgramStateRef ProgramState::killBinding(Loc LV) const {
   return makeWithStore(newStore);
 }
 
+/// SymbolicRegions are expected to be wrapped by an ElementRegion as a
+/// canonical representation. As a canonical representation, SymbolicRegions
+/// should be wrapped by ElementRegions before getting a FieldRegion.
+/// See f8643a9b31c4029942f67d4534c9139b45173504 why.
+SVal ProgramState::wrapSymbolicRegion(SVal Val) const {
+  const auto *BaseReg = dyn_cast_or_null<SymbolicRegion>(Val.getAsRegion());
+  if (!BaseReg)
+    return Val;
+
+  StoreManager &SM = getStateManager().getStoreManager();
+  QualType ElemTy = BaseReg->getPointeeStaticType();
+  return loc::MemRegionVal{SM.GetElementZeroRegion(BaseReg, ElemTy)};
+}
+
 ProgramStateRef
 ProgramState::enterStackFrame(const CallEvent &Call,
                               const StackFrameContext *CalleeCtx) const {
@@ -451,6 +465,24 @@ void ProgramState::setStore(const StoreRef &newStore) {
   store = newStoreStore;
 }
 
+SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
+  Base = wrapSymbolicRegion(Base);
+  return getStateManager().StoreMgr->getLValueField(D, Base);
+}
+
+SVal ProgramState::getLValue(const IndirectFieldDecl *D, SVal Base) const {
+  StoreManager &SM = *getStateManager().StoreMgr;
+  Base = wrapSymbolicRegion(Base);
+
+  // FIXME: This should work with `SM.getLValueField(D->getAnonField(), Base)`,
+  // but that would break some tests. There is probably a bug somewhere that it
+  // would expose.
+  for (const auto *I : D->chain()) {
+    Base = SM.getLValueField(cast<FieldDecl>(I), Base);
+  }
+  return Base;
+}
+
 //===----------------------------------------------------------------------===//
 //  State pretty-printing.
 //===----------------------------------------------------------------------===//

diff  --git a/clang/test/Analysis/inlining/false-positive-suppression.cpp b/clang/test/Analysis/inlining/false-positive-suppression.cpp
index 56659b4a1941cc..2f9ed7f78b3f07 100644
--- a/clang/test/Analysis/inlining/false-positive-suppression.cpp
+++ b/clang/test/Analysis/inlining/false-positive-suppression.cpp
@@ -210,3 +210,20 @@ namespace Cleanups {
     testArgumentHelper(NonTrivial().getNull());
   }
 }
+
+class Bear *getNullBear() { return nullptr; }
+class Bear {
+public:
+  void brum() const;
+};
+class Door {
+public:
+  Door() : ptr(getNullBear()) {
+    ptr->brum();
+#ifndef SUPPRESSED
+    // expected-warning at -2 {{Called C++ object pointer is null}}
+#endif
+  }
+private:
+  Bear* ptr;
+};


        


More information about the cfe-commits mailing list