[clang] [analyzer] Wrap SymbolicRegions by ElementRegions before getting a FieldRegion (PR #85211)

Balazs Benics via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 21 10:06:23 PDT 2024


https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/85211

>From bde85e0d145049a6661afba6f4585865c5630792 Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Thu, 14 Mar 2024 12:38:12 +0100
Subject: [PATCH 1/4] [analyzer] Wrap SymbolicRegions by ElementRegions before
 getting a FieldRegion

Inside the ExprEngine when we process the initializers, we create a
PostInitializer program-point, which will refer to the field being
initialized, see `FieldLoc` inside `ExprEngine::ProcessInitializer`.

When a constructor (of which we evaluate the initializer-list) is analyzed
in top-level context, then the `this` pointer will be represented by a
`SymbolicRegion`, (as it should be).

This means that we will form a `FieldRegion{SymbolicRegion{.}}` as the
initialized region.

```c++
class Bear {
public:
  void brum() const;
};
class Door {
public:
  // PostInitializer would refer to "FieldRegion{SymRegion{this}}"
  // whereas in the store and everywhere else it would be:
  // "FieldRegion{ELementRegion{SymRegion{Ty*, this}, 0, Ty}".
  Door() : ptr(nullptr) {
    ptr->brum(); // Bug
  }
private:
  Bear* ptr;
};
```

We (as CSA folks) decided to avoid the creation of FieldRegions directly of
symbolic regions in the past:
https://github.com/llvm/llvm-project/commit/f8643a9b31c4029942f67d4534c9139b45173504

---

In this patch, I propose to also canonicalize it as in the mentioned patch,
into this: `FieldRegion{ElementRegion{SymbolicRegion{Ty*, .}, 0, Ty}`

This would mean that FieldRegions will/should never simply wrap a
SymbolicRegion directly, but rather an ElementRegion that is sitting
in between.

This patch should have practically no observable effects, as the store
(due to the mentioned patch) was made resilient to this issue, but we use
`PostInitializer::getLocationValue()` for an alternative reporting,
where we faced this issue.

Note that in really rare cases it suppresses now dereference bugs, as
demonstrated in the test. It is because in the past we failed to
follow the region of the PostInitializer inside
the StoreSiteFinder visitor - because it was using this code:
```c++
// If this is a post initializer expression, initializing the region, we
// should track the initializer expression.
if (std::optional<PostInitializer> PIP =
        Pred->getLocationAs<PostInitializer>()) {
  const MemRegion *FieldReg = (const MemRegion *)PIP->getLocationValue();
  if (FieldReg == R) {
    StoreSite = Pred;
    InitE = PIP->getInitializer()->getInit();
  }
}
```
And the equality check didn't pass for the regions I'm canonicalizing in this patch.

Given the nature of this change, we would rather upstream this patch.

CPP-4954
---
 .../Core/PathSensitive/ProgramState.h         | 14 --------
 .../lib/StaticAnalyzer/Core/ProgramState.cpp  | 32 +++++++++++++++++++
 .../inlining/false-positive-suppression.cpp   | 17 ++++++++++
 3 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
index ca75c2a756a4a5..47fc4ad88d3161 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
@@ -782,20 +782,6 @@ inline SVal ProgramState::getLValue(const ObjCIvarDecl *D, SVal Base) const {
   return getStateManager().StoreMgr->getLValueIvar(D, Base);
 }
 
-inline SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
-  return getStateManager().StoreMgr->getLValueField(D, Base);
-}
-
-inline SVal ProgramState::getLValue(const IndirectFieldDecl *D,
-                                    SVal Base) const {
-  StoreManager &SM = *getStateManager().StoreMgr;
-  for (const auto *I : D->chain()) {
-    Base = SM.getLValueField(cast<FieldDecl>(I), Base);
-  }
-
-  return Base;
-}
-
 inline SVal ProgramState::getLValue(QualType ElementType, SVal Idx, SVal Base) const{
   if (std::optional<NonLoc> N = Idx.getAs<NonLoc>())
     return getStateManager().StoreMgr->getLValueElement(ElementType, *N, Base);
diff --git a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
index f12f1a5ac970dd..604728cdf1dddd 100644
--- a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -451,6 +451,38 @@ void ProgramState::setStore(const StoreRef &newStore) {
   store = newStoreStore;
 }
 
+/// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
+/// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
+static SVal wrapSymbolicRegion(const ProgramState &State, SVal Base) {
+  const auto *SymbolicBase =
+      dyn_cast_or_null<SymbolicRegion>(Base.getAsRegion());
+
+  if (!SymbolicBase)
+    return Base;
+
+  StoreManager &SM = State.getStateManager().getStoreManager();
+  QualType ElemTy = SymbolicBase->getPointeeStaticType();
+  return loc::MemRegionVal{SM.GetElementZeroRegion(SymbolicBase, ElemTy)};
+}
+
+SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
+  Base = wrapSymbolicRegion(*this, Base);
+  return getStateManager().StoreMgr->getLValueField(D, Base);
+}
+
+SVal ProgramState::getLValue(const IndirectFieldDecl *D, SVal Base) const {
+  StoreManager &SM = *getStateManager().StoreMgr;
+  Base = wrapSymbolicRegion(*this, Base);
+
+  // FIXME: This should work with `SM.getLValueField(D->getAnonField(), Base)`,
+  // but that would break some tests. There is probably a bug somewhere that it
+  // would expose.
+  for (const auto *I : D->chain()) {
+    Base = SM.getLValueField(cast<FieldDecl>(I), Base);
+  }
+  return Base;
+}
+
 //===----------------------------------------------------------------------===//
 //  State pretty-printing.
 //===----------------------------------------------------------------------===//
diff --git a/clang/test/Analysis/inlining/false-positive-suppression.cpp b/clang/test/Analysis/inlining/false-positive-suppression.cpp
index 56659b4a1941cc..2f9ed7f78b3f07 100644
--- a/clang/test/Analysis/inlining/false-positive-suppression.cpp
+++ b/clang/test/Analysis/inlining/false-positive-suppression.cpp
@@ -210,3 +210,20 @@ namespace Cleanups {
     testArgumentHelper(NonTrivial().getNull());
   }
 }
+
+class Bear *getNullBear() { return nullptr; }
+class Bear {
+public:
+  void brum() const;
+};
+class Door {
+public:
+  Door() : ptr(getNullBear()) {
+    ptr->brum();
+#ifndef SUPPRESSED
+    // expected-warning at -2 {{Called C++ object pointer is null}}
+#endif
+  }
+private:
+  Bear* ptr;
+};

>From c1ce3c78935e9cbf839fea797c77007f97f1b25c Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Mon, 18 Mar 2024 14:44:15 +0100
Subject: [PATCH 2/4] Move the TU static helper to a private method of
 ProgramState

---
 .../Core/PathSensitive/ProgramState.h         |  4 +++
 .../lib/StaticAnalyzer/Core/ProgramState.cpp  | 30 +++++++++----------
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
index 47fc4ad88d3161..f160b262543eab 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
@@ -494,6 +494,10 @@ class ProgramState : public llvm::FoldingSetNode {
                         InvalidatedSymbols *IS,
                         RegionAndSymbolInvalidationTraits *HTraits,
                         const CallEvent *Call) const;
+
+  /// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
+  /// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
+  SVal wrapSymbolicRegion(SVal Base) const;
 };
 
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
index 604728cdf1dddd..68a16c0c14e983 100644
--- a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -226,6 +226,18 @@ ProgramStateRef ProgramState::killBinding(Loc LV) const {
   return makeWithStore(newStore);
 }
 
+SVal ProgramState::wrapSymbolicRegion(SVal Base) const {
+  const auto *SymbolicBase =
+      dyn_cast_or_null<SymbolicRegion>(Base.getAsRegion());
+
+  if (!SymbolicBase)
+    return Base;
+
+  StoreManager &SM = getStateManager().getStoreManager();
+  QualType ElemTy = SymbolicBase->getPointeeStaticType();
+  return loc::MemRegionVal{SM.GetElementZeroRegion(SymbolicBase, ElemTy)};
+}
+
 ProgramStateRef
 ProgramState::enterStackFrame(const CallEvent &Call,
                               const StackFrameContext *CalleeCtx) const {
@@ -451,28 +463,14 @@ void ProgramState::setStore(const StoreRef &newStore) {
   store = newStoreStore;
 }
 
-/// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
-/// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
-static SVal wrapSymbolicRegion(const ProgramState &State, SVal Base) {
-  const auto *SymbolicBase =
-      dyn_cast_or_null<SymbolicRegion>(Base.getAsRegion());
-
-  if (!SymbolicBase)
-    return Base;
-
-  StoreManager &SM = State.getStateManager().getStoreManager();
-  QualType ElemTy = SymbolicBase->getPointeeStaticType();
-  return loc::MemRegionVal{SM.GetElementZeroRegion(SymbolicBase, ElemTy)};
-}
-
 SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
-  Base = wrapSymbolicRegion(*this, Base);
+  Base = wrapSymbolicRegion(Base);
   return getStateManager().StoreMgr->getLValueField(D, Base);
 }
 
 SVal ProgramState::getLValue(const IndirectFieldDecl *D, SVal Base) const {
   StoreManager &SM = *getStateManager().StoreMgr;
-  Base = wrapSymbolicRegion(*this, Base);
+  Base = wrapSymbolicRegion(Base);
 
   // FIXME: This should work with `SM.getLValueField(D->getAnonField(), Base)`,
   // but that would break some tests. There is probably a bug somewhere that it

>From 8d6fec863e6855fe71e1c6aca8a0b0a8c4e7d925 Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Mon, 18 Mar 2024 15:03:40 +0100
Subject: [PATCH 3/4] Move private member fn comment to implementation

---
 .../clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h      | 2 --
 clang/lib/StaticAnalyzer/Core/ProgramState.cpp                  | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
index f160b262543eab..51d76dc257ee94 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
@@ -495,8 +495,6 @@ class ProgramState : public llvm::FoldingSetNode {
                         RegionAndSymbolInvalidationTraits *HTraits,
                         const CallEvent *Call) const;
 
-  /// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
-  /// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
   SVal wrapSymbolicRegion(SVal Base) const;
 };
 
diff --git a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
index 68a16c0c14e983..6b53e2ceb96b03 100644
--- a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -226,6 +226,8 @@ ProgramStateRef ProgramState::killBinding(Loc LV) const {
   return makeWithStore(newStore);
 }
 
+/// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
+/// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
 SVal ProgramState::wrapSymbolicRegion(SVal Base) const {
   const auto *SymbolicBase =
       dyn_cast_or_null<SymbolicRegion>(Base.getAsRegion());

>From db5d9a4bdd112743a3fdcaa98fc101b3c5e0fd79 Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Thu, 21 Mar 2024 18:05:56 +0100
Subject: [PATCH 4/4] Apply review feedback

---
 .../lib/StaticAnalyzer/Core/ProgramState.cpp  | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
index 6b53e2ceb96b03..f82cd944750a3c 100644
--- a/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -226,18 +226,18 @@ ProgramStateRef ProgramState::killBinding(Loc LV) const {
   return makeWithStore(newStore);
 }
 
-/// FieldRegions are expected to be wrapped by an ElementRegion as a canonical
-/// representation. See f8643a9b31c4029942f67d4534c9139b45173504 why.
-SVal ProgramState::wrapSymbolicRegion(SVal Base) const {
-  const auto *SymbolicBase =
-      dyn_cast_or_null<SymbolicRegion>(Base.getAsRegion());
-
-  if (!SymbolicBase)
-    return Base;
+/// SymbolicRegions are expected to be wrapped by an ElementRegion as a
+/// canonical representation. As a canonical representation, SymbolicRegions
+/// should be wrapped by ElementRegions before getting a FieldRegion.
+/// See f8643a9b31c4029942f67d4534c9139b45173504 why.
+SVal ProgramState::wrapSymbolicRegion(SVal Val) const {
+  const auto *BaseReg = dyn_cast_or_null<SymbolicRegion>(Val.getAsRegion());
+  if (!BaseReg)
+    return Val;
 
   StoreManager &SM = getStateManager().getStoreManager();
-  QualType ElemTy = SymbolicBase->getPointeeStaticType();
-  return loc::MemRegionVal{SM.GetElementZeroRegion(SymbolicBase, ElemTy)};
+  QualType ElemTy = BaseReg->getPointeeStaticType();
+  return loc::MemRegionVal{SM.GetElementZeroRegion(BaseReg, ElemTy)};
 }
 
 ProgramStateRef



More information about the cfe-commits mailing list