[clang] b9bca88 - [analyzer][NFC] Don't bind values to ObjCForCollectionStmt, replace it with a GDM trait

Kristóf Umann via cfe-commits cfe-commits at lists.llvm.org
Fri Sep 11 06:59:03 PDT 2020


Author: Kristóf Umann
Date: 2020-09-11T15:58:48+02:00
New Revision: b9bca883c970d36f408db80df21838c713c326db

URL: https://github.com/llvm/llvm-project/commit/b9bca883c970d36f408db80df21838c713c326db
DIFF: https://github.com/llvm/llvm-project/commit/b9bca883c970d36f408db80df21838c713c326db.diff

LOG: [analyzer][NFC] Don't bind values to ObjCForCollectionStmt, replace it with a GDM trait

Based on the discussion in D82598#2171312. Thanks @NoQ!

D82598 is titled "Get rid of statement liveness, because such a thing doesn't
exist", and indeed, expressions express a value, non-expression statements
don't.

if (a && get() || []{ return true; }())
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ has a value
    ~ has a value
    ~~~~~~~~~~ has a value
                  ~~~~~~~~~~~~~~~~~~~~ has a value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ doesn't have a value

That is simple enough, so it would only make sense if we only assigned symbolic
values to expressions in the static analyzer. Yet the interface checkers can
access presents, among other strange things, the following two methods:

ProgramState::BindExpr(const Stmt *S, const LocationContext *LCtx, SVal V,
                       bool Invalidate=true)
ProgramState::getSVal(const Stmt *S, const LocationContext *LCtx)

So, what gives? Turns out, we make an exception for ReturnStmt (which we'll
leave for another time) and ObjCForCollectionStmt. For any other loops, in order
to know whether we should analyze another iteration, among other things, we
evaluate it's condition. Which is a problem for ObjCForCollectionStmt, because
it simply doesn't have one (CXXForRangeStmt has an implicit one!). In its
absence, we assigned the actual statement with a concrete 1 or 0 to indicate
whether there are any more iterations left. However, this is wildly incorrect,
its just simply not true that the for statement has a value of 1 or 0, we can't
calculate its liveness because that doesn't make any sense either, so this patch
turns it into a GDM trait.

Fixing this allows us to reinstate the assert removed in
https://reviews.llvm.org/rG032b78a0762bee129f33e4255ada6d374aa70c71.

Differential Revision: https://reviews.llvm.org/D86736

Added: 
    clang/test/Analysis/objc-live-crash.mm

Modified: 
    clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
    clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp
    clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp
    clang/lib/StaticAnalyzer/Core/Environment.cpp
    clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
    clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp
    clang/lib/StaticAnalyzer/Core/SymbolManager.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
index cdfe986355c5..582a56cbee1e 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
@@ -869,6 +869,23 @@ class ExprEngine {
   void handleConstructor(const Expr *E, ExplodedNode *Pred,
                          ExplodedNodeSet &Dst);
 
+public:
+  /// Note whether this loop has any more iteratios to model. These methods are
+  /// essentially an interface for a GDM trait. Further reading in
+  /// ExprEngine::VisitObjCForCollectionStmt().
+  LLVM_NODISCARD static ProgramStateRef
+  setWhetherHasMoreIteration(ProgramStateRef State,
+                             const ObjCForCollectionStmt *O,
+                             const LocationContext *LC, bool HasMoreIteraton);
+
+  LLVM_NODISCARD static ProgramStateRef
+  removeIterationState(ProgramStateRef State, const ObjCForCollectionStmt *O,
+                       const LocationContext *LC);
+
+  LLVM_NODISCARD static bool hasMoreIteration(ProgramStateRef State,
+                                              const ObjCForCollectionStmt *O,
+                                              const LocationContext *LC);
+private:
   /// Store the location of a C++ object corresponding to a statement
   /// until the statement is actually encountered. For example, if a DeclStmt
   /// has CXXConstructExpr as its initializer, the object would be considered

diff  --git a/clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp b/clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp
index 918c6e361381..a86a410ebcbc 100644
--- a/clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp
@@ -978,8 +978,7 @@ void ObjCLoopChecker::checkPostStmt(const ObjCForCollectionStmt *FCS,
   ProgramStateRef State = C.getState();
 
   // Check if this is the branch for the end of the loop.
-  SVal CollectionSentinel = C.getSVal(FCS);
-  if (CollectionSentinel.isZeroConstant()) {
+  if (!ExprEngine::hasMoreIteration(State, FCS, C.getLocationContext())) {
     if (!alreadyExecutedAtLeastOneLoopIteration(C.getPredecessor(), FCS))
       State = assumeCollectionNonEmpty(C, State, FCS, /*Assumption*/false);
 

diff  --git a/clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp
index 3e0caaf79ca0..ebe5ad53cc30 100644
--- a/clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp
@@ -11,6 +11,8 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "clang/AST/StmtObjC.h"
+#include "clang/AST/Type.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
@@ -54,10 +56,13 @@ class UndefBranchChecker : public Checker<check::BranchCondition> {
   void checkBranchCondition(const Stmt *Condition, CheckerContext &Ctx) const;
 };
 
-}
+} // namespace
 
 void UndefBranchChecker::checkBranchCondition(const Stmt *Condition,
                                               CheckerContext &Ctx) const {
+  // ObjCForCollection is a loop, but has no actual condition.
+  if (isa<ObjCForCollectionStmt>(Condition))
+    return;
   SVal X = Ctx.getSVal(Condition);
   if (X.isUndef()) {
     // Generate a sink node, which implicitly marks both outgoing branches as

diff  --git a/clang/lib/StaticAnalyzer/Core/Environment.cpp b/clang/lib/StaticAnalyzer/Core/Environment.cpp
index 1ccf4c6104a6..556ff6af15de 100644
--- a/clang/lib/StaticAnalyzer/Core/Environment.cpp
+++ b/clang/lib/StaticAnalyzer/Core/Environment.cpp
@@ -15,6 +15,7 @@
 #include "clang/AST/ExprCXX.h"
 #include "clang/AST/PrettyPrinter.h"
 #include "clang/AST/Stmt.h"
+#include "clang/AST/StmtObjC.h"
 #include "clang/Analysis/AnalysisDeclContext.h"
 #include "clang/Basic/LLVM.h"
 #include "clang/Basic/LangOptions.h"
@@ -85,6 +86,12 @@ SVal Environment::lookupExpr(const EnvironmentEntry &E) const {
 SVal Environment::getSVal(const EnvironmentEntry &Entry,
                           SValBuilder& svalBuilder) const {
   const Stmt *S = Entry.getStmt();
+  assert(!isa<ObjCForCollectionStmt>(S) &&
+         "Use ExprEngine::hasMoreIteration()!");
+  assert((isa<Expr>(S) || isa<ReturnStmt>(S)) &&
+         "Environment can only argue about Exprs, since only they express "
+         "a value! Any non-expression statement stored in Environment is a "
+         "result of a hack!");
   const LocationContext *LCtx = Entry.getLocationContext();
 
   switch (S->getStmtClass()) {
@@ -188,7 +195,14 @@ EnvironmentManager::removeDeadBindings(Environment Env,
     const EnvironmentEntry &BlkExpr = I.getKey();
     const SVal &X = I.getData();
 
-    if (SymReaper.isLive(BlkExpr.getStmt(), BlkExpr.getLocationContext())) {
+    const bool IsBlkExprLive =
+        SymReaper.isLive(BlkExpr.getStmt(), BlkExpr.getLocationContext());
+
+    assert((isa<Expr>(BlkExpr.getStmt()) || !IsBlkExprLive) &&
+           "Only Exprs can be live, LivenessAnalysis argues about the liveness "
+           "of *values*!");
+
+    if (IsBlkExprLive) {
       // Copy the binding to the new map.
       EBMapRef = EBMapRef.add(BlkExpr, X);
 

diff  --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index a4b11b5e8a96..409741cdb6e4 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -2129,6 +2129,83 @@ static const Stmt *ResolveCondition(const Stmt *Condition,
   llvm_unreachable("could not resolve condition");
 }
 
+using ObjCForLctxPair =
+    std::pair<const ObjCForCollectionStmt *, const LocationContext *>;
+
+REGISTER_MAP_WITH_PROGRAMSTATE(ObjCForHasMoreIterations, ObjCForLctxPair, bool)
+
+ProgramStateRef ExprEngine::setWhetherHasMoreIteration(
+    ProgramStateRef State, const ObjCForCollectionStmt *O,
+    const LocationContext *LC, bool HasMoreIteraton) {
+  assert(!State->contains<ObjCForHasMoreIterations>({O, LC}));
+  return State->set<ObjCForHasMoreIterations>({O, LC}, HasMoreIteraton);
+}
+
+ProgramStateRef
+ExprEngine::removeIterationState(ProgramStateRef State,
+                                 const ObjCForCollectionStmt *O,
+                                 const LocationContext *LC) {
+  assert(State->contains<ObjCForHasMoreIterations>({O, LC}));
+  return State->remove<ObjCForHasMoreIterations>({O, LC});
+}
+
+bool ExprEngine::hasMoreIteration(ProgramStateRef State,
+                                  const ObjCForCollectionStmt *O,
+                                  const LocationContext *LC) {
+  assert(State->contains<ObjCForHasMoreIterations>({O, LC}));
+  return *State->get<ObjCForHasMoreIterations>({O, LC});
+}
+
+/// Split the state on whether there are any more iterations left for this loop.
+/// Returns a (HasMoreIteration, HasNoMoreIteration) pair, or None when the
+/// acquisition of the loop condition value failed.
+static Optional<std::pair<ProgramStateRef, ProgramStateRef>>
+assumeCondition(const Stmt *Condition, ExplodedNode *N) {
+  ProgramStateRef State = N->getState();
+  if (const auto *ObjCFor = dyn_cast<ObjCForCollectionStmt>(Condition)) {
+    bool HasMoreIteraton =
+        ExprEngine::hasMoreIteration(State, ObjCFor, N->getLocationContext());
+    // Checkers have already ran on branch conditions, so the current
+    // information as to whether the loop has more iteration becomes outdated
+    // after this point.
+    State = ExprEngine::removeIterationState(State, ObjCFor,
+                                             N->getLocationContext());
+    if (HasMoreIteraton)
+      return std::pair<ProgramStateRef, ProgramStateRef>{State, nullptr};
+    else
+      return std::pair<ProgramStateRef, ProgramStateRef>{nullptr, State};
+  }
+  SVal X = State->getSVal(Condition, N->getLocationContext());
+
+  if (X.isUnknownOrUndef()) {
+    // Give it a chance to recover from unknown.
+    if (const auto *Ex = dyn_cast<Expr>(Condition)) {
+      if (Ex->getType()->isIntegralOrEnumerationType()) {
+        // Try to recover some path-sensitivity.  Right now casts of symbolic
+        // integers that promote their values are currently not tracked well.
+        // If 'Condition' is such an expression, try and recover the
+        // underlying value and use that instead.
+        SVal recovered =
+            RecoverCastedSymbol(State, Condition, N->getLocationContext(),
+                                N->getState()->getStateManager().getContext());
+
+        if (!recovered.isUnknown()) {
+          X = recovered;
+        }
+      }
+    }
+  }
+
+  // If the condition is still unknown, give up.
+  if (X.isUnknownOrUndef())
+    return None;
+
+  DefinedSVal V = X.castAs<DefinedSVal>();
+
+  ProgramStateRef StTrue, StFalse;
+  return State->assume(V);
+}
+
 void ExprEngine::processBranch(const Stmt *Condition,
                                NodeBuilderContext& BldCtx,
                                ExplodedNode *Pred,
@@ -2165,48 +2242,28 @@ void ExprEngine::processBranch(const Stmt *Condition,
     return;
 
   BranchNodeBuilder builder(CheckersOutSet, Dst, BldCtx, DstT, DstF);
-  for (const auto PredI : CheckersOutSet) {
-    if (PredI->isSink())
+  for (ExplodedNode *PredN : CheckersOutSet) {
+    if (PredN->isSink())
       continue;
 
-    ProgramStateRef PrevState = PredI->getState();
-    SVal X = PrevState->getSVal(Condition, PredI->getLocationContext());
-
-    if (X.isUnknownOrUndef()) {
-      // Give it a chance to recover from unknown.
-      if (const auto *Ex = dyn_cast<Expr>(Condition)) {
-        if (Ex->getType()->isIntegralOrEnumerationType()) {
-          // Try to recover some path-sensitivity.  Right now casts of symbolic
-          // integers that promote their values are currently not tracked well.
-          // If 'Condition' is such an expression, try and recover the
-          // underlying value and use that instead.
-          SVal recovered = RecoverCastedSymbol(PrevState, Condition,
-                                               PredI->getLocationContext(),
-                                               getContext());
-
-          if (!recovered.isUnknown()) {
-            X = recovered;
-          }
-        }
-      }
-    }
+    ProgramStateRef PrevState = PredN->getState();
 
-    // If the condition is still unknown, give up.
-    if (X.isUnknownOrUndef()) {
-      builder.generateNode(PrevState, true, PredI);
-      builder.generateNode(PrevState, false, PredI);
+    ProgramStateRef StTrue, StFalse;
+    if (const auto KnownCondValueAssumption = assumeCondition(Condition, PredN))
+      std::tie(StTrue, StFalse) = *KnownCondValueAssumption;
+    else {
+      assert(!isa<ObjCForCollectionStmt>(Condition));
+      builder.generateNode(PrevState, true, PredN);
+      builder.generateNode(PrevState, false, PredN);
       continue;
     }
-
-    DefinedSVal V = X.castAs<DefinedSVal>();
-
-    ProgramStateRef StTrue, StFalse;
-    std::tie(StTrue, StFalse) = PrevState->assume(V);
+    if (StTrue && StFalse)
+      assert(!isa<ObjCForCollectionStmt>(Condition));;
 
     // Process the true branch.
     if (builder.isFeasible(true)) {
       if (StTrue)
-        builder.generateNode(StTrue, true, PredI);
+        builder.generateNode(StTrue, true, PredN);
       else
         builder.markInfeasible(true);
     }
@@ -2214,7 +2271,7 @@ void ExprEngine::processBranch(const Stmt *Condition,
     // Process the false branch.
     if (builder.isFeasible(false)) {
       if (StFalse)
-        builder.generateNode(StFalse, false, PredI);
+        builder.generateNode(StFalse, false, PredN);
       else
         builder.markInfeasible(false);
     }

diff  --git a/clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp
index eb9a0be2e5d6..5a55e81497b0 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp
@@ -53,10 +53,8 @@ static void populateObjCForDestinationSet(
     ProgramStateRef state = Pred->getState();
     const LocationContext *LCtx = Pred->getLocationContext();
 
-    SVal hasElementsV = svalBuilder.makeTruthVal(hasElements);
-
-    // FIXME: S is not an expression. We should not be binding values to it.
-    ProgramStateRef nextState = state->BindExpr(S, LCtx, hasElementsV);
+    ProgramStateRef nextState =
+        ExprEngine::setWhetherHasMoreIteration(state, S, LCtx, hasElements);
 
     if (auto MV = elementV.getAs<loc::MemRegionVal>())
       if (const auto *R = dyn_cast<TypedValueRegion>(MV->getRegion())) {
@@ -93,10 +91,9 @@ void ExprEngine::VisitObjCForCollectionStmt(const ObjCForCollectionStmt *S,
   //  (1) binds the next container value to 'element'.  This creates a new
   //      node in the ExplodedGraph.
   //
-  //  (2) binds the value 0/1 to the ObjCForCollectionStmt* itself, indicating
-  //      whether or not the container has any more elements.  This value
-  //      will be tested in ProcessBranch.  We need to explicitly bind
-  //      this value because a container can contain nil elements.
+  //  (2) note whether the collection has any more elements (or in other words,
+  //      whether the loop has more iterations). This will be tested in
+  //      processBranch.
   //
   // FIXME: Eventually this logic should actually do dispatches to
   //   'countByEnumeratingWithState:objects:count:' (NSFastEnumeration).

diff  --git a/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp b/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
index 6ca7aec9caec..ae40ad910d84 100644
--- a/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
@@ -14,6 +14,7 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Expr.h"
+#include "clang/AST/StmtObjC.h"
 #include "clang/Analysis/Analyses/LiveVariables.h"
 #include "clang/Analysis/AnalysisDeclContext.h"
 #include "clang/Basic/LLVM.h"
@@ -494,7 +495,8 @@ SymbolReaper::isLive(const Stmt *ExprVal, const LocationContext *ELCtx) const {
     return true;
   }
 
-  // If no statement is provided, everything is this and parent contexts is live.
+  // If no statement is provided, everything in this and parent contexts is
+  // live.
   if (!Loc)
     return true;
 

diff  --git a/clang/test/Analysis/objc-live-crash.mm b/clang/test/Analysis/objc-live-crash.mm
new file mode 100644
index 000000000000..b3b4f19bfc0d
--- /dev/null
+++ b/clang/test/Analysis/objc-live-crash.mm
@@ -0,0 +1,30 @@
+// RUN: %clang --analyze %s -fblocks
+
+// https://reviews.llvm.org/D82598#2171312
+
+ at interface Item
+// ...
+ at end
+
+ at interface Collection
+// ...
+ at end
+
+typedef void (^Blk)();
+
+struct RAII {
+  Blk blk;
+
+public:
+  RAII(Blk blk): blk(blk) {}
+  ~RAII() { blk(); }
+};
+
+void foo(Collection *coll) {
+  RAII raii(^{});
+  for (Item *item in coll) {}
+  int i;
+  {
+    int j;
+  }
+}


        


More information about the cfe-commits mailing list