[clang] a6b4204 - [analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal

Tomasz Kamiński via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 19 07:06:42 PDT 2022


Author: Tomasz Kamiński
Date: 2022-10-19T16:06:32+02:00
New Revision: a6b42040ad30c13474c06d3e0291a4675475fec3

URL: https://github.com/llvm/llvm-project/commit/a6b42040ad30c13474c06d3e0291a4675475fec3
DIFF: https://github.com/llvm/llvm-project/commit/a6b42040ad30c13474c06d3e0291a4675475fec3.diff

LOG: [analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal

To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1
```lang=c++
void clang_analyzer_printState();

struct C {
   int x;
   int y;
   int more_padding;
};

struct D {
   C c;
   int z;
};

C foo(D d, int new_x, int new_y) {
   d.c.x = new_x;       // B1
   assert(d.c.x < 13);  // C1

   C c = d.c;           // L

   assert(d.c.y < 10);  // C2
   assert(d.z < 5);     // C3

   d.c.y = new_y;       // B2

   assert(d.c.y < 10);  // C4

   return c;  // R
}
```
In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), a constrain on the values  (`C1`, `C2`, ….), and create a `lazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`.

Now, the question is which of these should remain live as long the return value of the `foo` call is live. In perfect a word we should preserve:

  # only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`.  In other words, `new_x` should be live but `new_y` shouldn’t.

  # constraints on the values of `d.c`, that are reachable through `c`. This can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`. But not `C3` (`d.z` value is not reachable through `c`) and `C4` (the original value of`d.c.y` was overridden at `B2` after the creation of `c`).

The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`).

We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. for which a program can access the value stored at that location, but not its address. These locations are constructed as a set of regions referred to by `lazyCompoundVal`. A //readable// location (region) is a location that //live// or //lazily copied// . And symbols that refer to values in regions are alive if the region is //readable//.

For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsy live (`d.z` in our example) and keeps the corresponding constraints alive.

The rename `Regions` to `LiveRegions` inside  `RegionStore` is NFC change, that was done to make it clear, what is difference between regions stored in this two sets.

Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs at gmail.com>

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D134947

Added: 
    

Modified: 
    clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
    clang/lib/StaticAnalyzer/Core/RegionStore.cpp
    clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
    clang/test/Analysis/trivial-copy-struct.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
index e46f97b7fafcd..b7ce6ebe98786 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
@@ -582,7 +582,12 @@ class SymbolReaper {
   SymbolMapTy TheLiving;
   SymbolSetTy MetadataInUse;
 
-  RegionSetTy RegionRoots;
+  RegionSetTy LiveRegionRoots;
+  // The lazily copied regions are locations for which a program
+  // can access the value stored at that location, but not its address.
+  // These regions are constructed as a set of regions referred to by
+  // lazyCompoundVal.
+  RegionSetTy LazilyCopiedRegionRoots;
 
   const StackFrameContext *LCtx;
   const Stmt *Loc;
@@ -628,8 +633,8 @@ class SymbolReaper {
 
   using region_iterator = RegionSetTy::const_iterator;
 
-  region_iterator region_begin() const { return RegionRoots.begin(); }
-  region_iterator region_end() const { return RegionRoots.end(); }
+  region_iterator region_begin() const { return LiveRegionRoots.begin(); }
+  region_iterator region_end() const { return LiveRegionRoots.end(); }
 
   /// Returns whether or not a symbol has been confirmed dead.
   ///
@@ -640,6 +645,7 @@ class SymbolReaper {
   }
 
   void markLive(const MemRegion *region);
+  void markLazilyCopied(const MemRegion *region);
   void markElementIndicesLive(const MemRegion *region);
 
   /// Set to the value of the symbolic store after
@@ -647,6 +653,12 @@ class SymbolReaper {
   void setReapedStore(StoreRef st) { reapedStore = st; }
 
 private:
+  bool isLazilyCopiedRegion(const MemRegion *region) const;
+  // A readable region is a region that live or lazily copied.
+  // Any symbols that refer to values in regions are alive if the region
+  // is readable.
+  bool isReadableRegion(const MemRegion *region);
+
   /// Mark the symbols dependent on the input symbol as live.
   void markDependentsLive(SymbolRef sym);
 };

diff  --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index eeadd1579a065..7e4e00454e960 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -2838,6 +2838,10 @@ void RemoveDeadBindingsWorker::VisitBinding(SVal V) {
   // Is it a LazyCompoundVal?  All referenced regions are live as well.
   if (Optional<nonloc::LazyCompoundVal> LCS =
           V.getAs<nonloc::LazyCompoundVal>()) {
+    // TODO: Make regions referred to by `lazyCompoundVals` that are bound to
+    // subregions of the `LCS.getRegion()` also lazily copied.
+    if (const MemRegion *R = LCS->getRegion())
+      SymReaper.markLazilyCopied(R);
 
     const RegionStoreManager::SValListTy &Vals = RM.getInterestingValues(*LCS);
 

diff  --git a/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp b/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
index 2227bd324adc4..3e97f0c95fc32 100644
--- a/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
@@ -411,10 +411,14 @@ void SymbolReaper::markLive(SymbolRef sym) {
 }
 
 void SymbolReaper::markLive(const MemRegion *region) {
-  RegionRoots.insert(region->getBaseRegion());
+  LiveRegionRoots.insert(region->getBaseRegion());
   markElementIndicesLive(region);
 }
 
+void SymbolReaper::markLazilyCopied(const clang::ento::MemRegion *region) {
+  LazilyCopiedRegionRoots.insert(region->getBaseRegion());
+}
+
 void SymbolReaper::markElementIndicesLive(const MemRegion *region) {
   for (auto SR = dyn_cast<SubRegion>(region); SR;
        SR = dyn_cast<SubRegion>(SR->getSuperRegion())) {
@@ -437,8 +441,7 @@ bool SymbolReaper::isLiveRegion(const MemRegion *MR) {
   // is not used later in the path, we can diagnose a leak of a value within
   // that field earlier than, say, the variable that contains the field dies.
   MR = MR->getBaseRegion();
-
-  if (RegionRoots.count(MR))
+  if (LiveRegionRoots.count(MR))
     return true;
 
   if (const auto *SR = dyn_cast<SymbolicRegion>(MR))
@@ -454,6 +457,15 @@ bool SymbolReaper::isLiveRegion(const MemRegion *MR) {
   return isa<AllocaRegion, CXXThisRegion, MemSpaceRegion, CodeTextRegion>(MR);
 }
 
+bool SymbolReaper::isLazilyCopiedRegion(const MemRegion *MR) const {
+  // TODO: See comment in isLiveRegion.
+  return LazilyCopiedRegionRoots.count(MR->getBaseRegion());
+}
+
+bool SymbolReaper::isReadableRegion(const MemRegion *MR) {
+  return isLiveRegion(MR) || isLazilyCopiedRegion(MR);
+}
+
 bool SymbolReaper::isLive(SymbolRef sym) {
   if (TheLiving.count(sym)) {
     markDependentsLive(sym);
@@ -464,7 +476,7 @@ bool SymbolReaper::isLive(SymbolRef sym) {
 
   switch (sym->getKind()) {
   case SymExpr::SymbolRegionValueKind:
-    KnownLive = isLiveRegion(cast<SymbolRegionValue>(sym)->getRegion());
+    KnownLive = isReadableRegion(cast<SymbolRegionValue>(sym)->getRegion());
     break;
   case SymExpr::SymbolConjuredKind:
     KnownLive = false;

diff  --git a/clang/test/Analysis/trivial-copy-struct.cpp b/clang/test/Analysis/trivial-copy-struct.cpp
index cd9cfde3ee9dc..b1f4cb3fbfdbc 100644
--- a/clang/test/Analysis/trivial-copy-struct.cpp
+++ b/clang/test/Analysis/trivial-copy-struct.cpp
@@ -49,11 +49,53 @@ void deadCode(List orig) {
   if (c.value == 42)
     return;
   clang_analyzer_value(c.value);
-  // expected-warning at -1 {{32s:{ [-2147483648, 2147483647] }}}
-  // The symbol was garbage collected too early, hence we lose the constraints.
+  // expected-warning at -1 {{32s:{ [-2147483648, 41], [43, 2147483647] }}}
+  // Before symbol was garbage collected too early, and we lost the constraints.
   if (c.value != 42)
     return;
 
-  // Dead code should be unreachable
-  clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
+  clang_analyzer_warnIfReached(); // no-warning: Dead code.
+};
+
+void ptr1(List* n) {
+  List* n2 = new List(*n); // ctor
+  if (!n->next) {
+    if (n2->next) {
+      clang_analyzer_warnIfReached(); // unreachable
+    }
+  }
+  delete n2;
+}
+
+void ptr2(List* n) {
+  List* n2 = new List(); // ctor
+  *n2 = *n; // assignment
+  if (!n->next) {
+    if (n2->next) {
+      clang_analyzer_warnIfReached(); // unreachable
+    }
+  }
+  delete n2;
+}
+
+struct Wrapper {
+  List head;
+  int count;
+};
+
+void nestedLazyCompoundVal(List* n) {
+  Wrapper* w = 0;
+  {
+     Wrapper lw;
+     lw.head = *n;
+     w = new Wrapper(lw);
+  }
+  if (!n->next) {
+    if (w->head.next) {
+      // FIXME: Unreachable, w->head is a copy of *n, therefore
+      // w->head.next and n->next are equal
+      clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
+    }
+  }
+  delete w;
 }


        


More information about the cfe-commits mailing list