[clang] [clang][dataflow] Discard unneeded `ExprToLoc` and `ExprToVal` entries. (PR #72850)

via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 20 02:57:36 PST 2023


https://github.com/martinboehme created https://github.com/llvm/llvm-project/pull/72850

This has two major benefits:

*  We avoid performing joins on boolean expressions and hence extending the flow
   condition in cases where this is not needed. Simpler flow conditions should
   reduce the amount of work we do in the SAT solver.

*  Debugging becomes easier when flow conditions are simpler and ExprToVal
   doesn’t contain any extraneous entries.

In tests on an internal codebase, we see a reduction in SAT solver timeouts of
over 40% and a reduction in "reached maximum iterations" errors of over 60%.

Benchmark results on Crubit's `pointer_nullability_analysis_benchmark show a
slight runtime increase for simple benchmarks, offset by runtime reductions
(some of them substantial) for more complex benchmarks:

```
name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     29.5µs ± 2%  32.0µs ± 3%   +8.42%  (p=0.000 n=46+47)
BM_PointerAnalysisIntLoop          100µs ± 3%   109µs ± 3%   +9.40%  (p=0.000 n=54+60)
BM_PointerAnalysisPointerLoop      376µs ± 4%   253µs ± 3%  -32.51%  (p=0.000 n=48+55)
BM_PointerAnalysisBranch           116µs ± 3%   126µs ± 3%   +9.23%  (p=0.000 n=60+59)
BM_PointerAnalysisLoopAndBranch    776µs ± 3%   425µs ± 4%  -45.20%  (p=0.000 n=59+59)
BM_PointerAnalysisTwoLoops         184µs ± 2%   200µs ± 3%   +8.48%  (p=0.000 n=59+58)
BM_PointerAnalysisJoinFilePath    17.3ms ± 3%   9.4ms ± 2%  -45.48%  (p=0.000 n=60+60)
BM_PointerAnalysisCallInLoop      14.7ms ± 2%  11.0ms ± 3%  -25.02%  (p=0.000 n=57+56)
```

Running the pointer nullability check on several real-world files shows no
significant change in runtime.


>From d5c011c6f922e98da44b1ffa573c3630e3532b3d Mon Sep 17 00:00:00 2001
From: Martin Braenne <mboehme at google.com>
Date: Mon, 20 Nov 2023 10:56:28 +0000
Subject: [PATCH] [clang][dataflow] Discard unneeded `ExprToLoc` and
 `ExprToVal` entries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This has two major benefits:

*  We avoid performing joins on boolean expressions and hence extending the flow
   condition in cases where this is not needed. Simpler flow conditions should
   reduce the amount of work we do in the SAT solver.

*  Debugging becomes easier when flow conditions are simpler and ExprToVal
   doesn’t contain any extraneous entries.

In tests on an internal codebase, we see a reduction in SAT solver timeouts of
over 40% and a reduction in "reached maximum iterations" errors of over 60%.

Benchmark results on Crubit's `pointer_nullability_analysis_benchmark show a
slight runtime increase for simple benchmarks, offset by runtime reductions
(some of them substantial) for more complex benchmarks:

```
name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     29.5µs ± 2%  32.0µs ± 3%   +8.42%  (p=0.000 n=46+47)
BM_PointerAnalysisIntLoop          100µs ± 3%   109µs ± 3%   +9.40%  (p=0.000 n=54+60)
BM_PointerAnalysisPointerLoop      376µs ± 4%   253µs ± 3%  -32.51%  (p=0.000 n=48+55)
BM_PointerAnalysisBranch           116µs ± 3%   126µs ± 3%   +9.23%  (p=0.000 n=60+59)
BM_PointerAnalysisLoopAndBranch    776µs ± 3%   425µs ± 4%  -45.20%  (p=0.000 n=59+59)
BM_PointerAnalysisTwoLoops         184µs ± 2%   200µs ± 3%   +8.48%  (p=0.000 n=59+58)
BM_PointerAnalysisJoinFilePath    17.3ms ± 3%   9.4ms ± 2%  -45.48%  (p=0.000 n=60+60)
BM_PointerAnalysisCallInLoop      14.7ms ± 2%  11.0ms ± 3%  -25.02%  (p=0.000 n=57+56)
```

Running the pointer nullability check on several real-world files shows no
significant change in runtime.
---
 .../FlowSensitive/ControlFlowContext.h        |  22 ++-
 .../FlowSensitive/DataflowEnvironment.h       |   8 +-
 .../FlowSensitive/ControlFlowContext.cpp      |  35 ++--
 .../FlowSensitive/DataflowEnvironment.cpp     |  22 ++-
 .../TypeErasedDataflowAnalysis.cpp            |  18 +-
 .../FlowSensitive/DataflowEnvironmentTest.cpp |  36 ++++
 .../TypeErasedDataflowAnalysisTest.cpp        | 158 +++++++++++++++---
 7 files changed, 250 insertions(+), 49 deletions(-)

diff --git a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
index 768387a121b920a..3a5c5235838f630 100644
--- a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
+++ b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
@@ -52,23 +52,39 @@ class ControlFlowContext {
     return StmtToBlock;
   }
 
+  /// Returns the expressions consumed by `Block`. These are all children of
+  /// statements in `Block` as well as children of a possible terminator.
+  const llvm::DenseSet<const Expr *> *
+  getExprConsumedByBlock(const CFGBlock *Block) const {
+    auto It = ExprConsumedByBlock.find(Block);
+    if (It == ExprConsumedByBlock.end())
+      return nullptr;
+    return &It->second;
+  }
+
   /// Returns whether `B` is reachable from the entry block.
   bool isBlockReachable(const CFGBlock &B) const {
     return BlockReachable[B.getBlockID()];
   }
 
 private:
-  ControlFlowContext(const Decl &D, std::unique_ptr<CFG> Cfg,
-                     llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
-                     llvm::BitVector BlockReachable)
+  ControlFlowContext(
+      const Decl &D, std::unique_ptr<CFG> Cfg,
+      llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
+      llvm::DenseMap<const CFGBlock *, llvm::DenseSet<const Expr *>>
+          ExprConsumedByBlock,
+      llvm::BitVector BlockReachable)
       : ContainingDecl(D), Cfg(std::move(Cfg)),
         StmtToBlock(std::move(StmtToBlock)),
+        ExprConsumedByBlock(std::move(ExprConsumedByBlock)),
         BlockReachable(std::move(BlockReachable)) {}
 
   /// The `Decl` containing the statement used to construct the CFG.
   const Decl &ContainingDecl;
   std::unique_ptr<CFG> Cfg;
   llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock;
+  const llvm::DenseMap<const CFGBlock *, llvm::DenseSet<const Expr *>>
+      ExprConsumedByBlock;
   llvm::BitVector BlockReachable;
 };
 
diff --git a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
index 963197b728f4273..a2f97122567fbe7 100644
--- a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
+++ b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
@@ -211,12 +211,16 @@ class Environment {
   /// Joins two environments by taking the intersection of storage locations and
   /// values that are stored in them. Distinct values that are assigned to the
   /// same storage locations in `EnvA` and `EnvB` are merged using `Model`.
+  /// If `ExprsToKeep` is non-null, only retains state for expressions contained
+  /// in it.
   ///
   /// Requirements:
   ///
   ///  `EnvA` and `EnvB` must use the same `DataflowAnalysisContext`.
-  static Environment join(const Environment &EnvA, const Environment &EnvB,
-                          Environment::ValueModel &Model);
+  static Environment
+  join(const Environment &EnvA, const Environment &EnvB,
+       Environment::ValueModel &Model,
+       const llvm::DenseSet<const Expr *> *ExprsToKeep = nullptr);
 
   /// Widens the environment point-wise, using `PrevEnv` as needed to inform the
   /// approximation.
diff --git a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
index 56246066e4aa13a..54f93c51a568523 100644
--- a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
@@ -24,25 +24,35 @@
 namespace clang {
 namespace dataflow {
 
-/// Returns a map from statements to basic blocks that contain them.
-static llvm::DenseMap<const Stmt *, const CFGBlock *>
-buildStmtToBasicBlockMap(const CFG &Cfg) {
-  llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock;
+/// Builds maps:
+/// - From statements to basic blocks that contain them.
+/// - From blocks to the expressions they consume.
+static void buildStatementMaps(
+    const CFG &Cfg, llvm::DenseMap<const Stmt *, const CFGBlock *> &StmtToBlock,
+    llvm::DenseMap<const CFGBlock *, llvm::DenseSet<const Expr *>>
+        &ExprConsumedByBlock) {
   for (const CFGBlock *Block : Cfg) {
     if (Block == nullptr)
       continue;
 
     for (const CFGElement &Element : *Block) {
-      auto Stmt = Element.getAs<CFGStmt>();
-      if (!Stmt)
+      auto S = Element.getAs<CFGStmt>();
+      if (!S)
         continue;
 
-      StmtToBlock[Stmt->getStmt()] = Block;
+      StmtToBlock[S->getStmt()] = Block;
+
+      for (const Stmt *Child : S->getStmt()->children())
+        if (const Expr *E = dyn_cast_or_null<Expr>(Child))
+          ExprConsumedByBlock[Block].insert(E);
     }
-    if (const Stmt *TerminatorStmt = Block->getTerminatorStmt())
+    if (const Stmt *TerminatorStmt = Block->getTerminatorStmt()) {
       StmtToBlock[TerminatorStmt] = Block;
+      for (const Stmt *Child : TerminatorStmt->children())
+        if (const Expr *E = dyn_cast_or_null<Expr>(Child))
+          ExprConsumedByBlock[Block].insert(E);
+    }
   }
-  return StmtToBlock;
 }
 
 static llvm::BitVector findReachableBlocks(const CFG &Cfg) {
@@ -108,12 +118,15 @@ ControlFlowContext::build(const Decl &D, Stmt &S, ASTContext &C) {
         std::make_error_code(std::errc::invalid_argument),
         "CFG::buildCFG failed");
 
-  llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock =
-      buildStmtToBasicBlockMap(*Cfg);
+  llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock;
+  llvm::DenseMap<const CFGBlock *, llvm::DenseSet<const Expr *>>
+      ExprConsumedByBlock;
+  buildStatementMaps(*Cfg, StmtToBlock, ExprConsumedByBlock);
 
   llvm::BitVector BlockReachable = findReachableBlocks(*Cfg);
 
   return ControlFlowContext(D, std::move(Cfg), std::move(StmtToBlock),
+                            std::move(ExprConsumedByBlock),
                             std::move(BlockReachable));
 }
 
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index a1051c529ab0d58..29fa93dcba896ba 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -37,10 +37,14 @@ static constexpr int MaxCompositeValueSize = 1000;
 
 /// Returns a map consisting of key-value entries that are present in both maps.
 template <typename K, typename V>
-llvm::DenseMap<K, V> intersectDenseMaps(const llvm::DenseMap<K, V> &Map1,
-                                        const llvm::DenseMap<K, V> &Map2) {
+llvm::DenseMap<K, V>
+intersectDenseMaps(const llvm::DenseMap<K, V> &Map1,
+                   const llvm::DenseMap<K, V> &Map2,
+                   const llvm::DenseSet<K> *KeysToKeep = nullptr) {
   llvm::DenseMap<K, V> Result;
   for (auto &Entry : Map1) {
+    if (KeysToKeep && !KeysToKeep->contains(Entry.first))
+      continue;
     auto It = Map2.find(Entry.first);
     if (It != Map2.end() && Entry.second == It->second)
       Result.insert({Entry.first, Entry.second});
@@ -210,7 +214,8 @@ llvm::MapVector<Key, Value *>
 joinKeyToValueMap(const llvm::MapVector<Key, Value *> &Map1,
                   const llvm::MapVector<Key, Value *> &Map2,
                   const Environment &Env1, const Environment &Env2,
-                  Environment &JoinedEnv, Environment::ValueModel &Model) {
+                  Environment &JoinedEnv, Environment::ValueModel &Model,
+                  const llvm::DenseSet<Key> *KeysToKeep = nullptr) {
   llvm::MapVector<Key, Value *> MergedMap;
   for (auto &Entry : Map1) {
     Key K = Entry.first;
@@ -219,6 +224,9 @@ joinKeyToValueMap(const llvm::MapVector<Key, Value *> &Map1,
     Value *Val = Entry.second;
     assert(Val != nullptr);
 
+    if (KeysToKeep && !KeysToKeep->contains(K))
+      continue;
+
     auto It = Map2.find(K);
     if (It == Map2.end())
       continue;
@@ -610,7 +618,8 @@ LatticeJoinEffect Environment::widen(const Environment &PrevEnv,
 }
 
 Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
-                              Environment::ValueModel &Model) {
+                              Environment::ValueModel &Model,
+                              const llvm::DenseSet<const Expr *> *ExprsToKeep) {
   assert(EnvA.DACtx == EnvB.DACtx);
   assert(EnvA.ThisPointeeLoc == EnvB.ThisPointeeLoc);
   assert(EnvA.CallStack == EnvB.CallStack);
@@ -650,7 +659,8 @@ Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
 
   JoinedEnv.DeclToLoc = intersectDenseMaps(EnvA.DeclToLoc, EnvB.DeclToLoc);
 
-  JoinedEnv.ExprToLoc = intersectDenseMaps(EnvA.ExprToLoc, EnvB.ExprToLoc);
+  JoinedEnv.ExprToLoc =
+      intersectDenseMaps(EnvA.ExprToLoc, EnvB.ExprToLoc, ExprsToKeep);
 
   // FIXME: update join to detect backedges and simplify the flow condition
   // accordingly.
@@ -658,7 +668,7 @@ Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
       EnvA.FlowConditionToken, EnvB.FlowConditionToken);
 
   JoinedEnv.ExprToVal = joinKeyToValueMap(EnvA.ExprToVal, EnvB.ExprToVal, EnvA,
-                                          EnvB, JoinedEnv, Model);
+                                          EnvB, JoinedEnv, Model, ExprsToKeep);
 
   JoinedEnv.LocToVal = joinKeyToValueMap(EnvA.LocToVal, EnvB.LocToVal, EnvA,
                                          EnvB, JoinedEnv, Model);
diff --git a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
index e54fb2a01ddeead..c0b195e1390c432 100644
--- a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
+++ b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
@@ -232,16 +232,19 @@ class JoinedStateBuilder {
   AnalysisContext &AC;
   std::vector<const TypeErasedDataflowAnalysisState *> All;
   std::deque<TypeErasedDataflowAnalysisState> Owned;
+  const llvm::DenseSet<const Expr *> *ExprsToKeep;
 
   TypeErasedDataflowAnalysisState
   join(const TypeErasedDataflowAnalysisState &L,
        const TypeErasedDataflowAnalysisState &R) {
     return {AC.Analysis.joinTypeErased(L.Lattice, R.Lattice),
-            Environment::join(L.Env, R.Env, AC.Analysis)};
+            Environment::join(L.Env, R.Env, AC.Analysis, ExprsToKeep)};
   }
 
 public:
-  JoinedStateBuilder(AnalysisContext &AC) : AC(AC) {}
+  JoinedStateBuilder(AnalysisContext &AC,
+                     const llvm::DenseSet<const Expr *> *ExprsToKeep)
+      : AC(AC), ExprsToKeep(ExprsToKeep) {}
 
   void addOwned(TypeErasedDataflowAnalysisState State) {
     Owned.push_back(std::move(State));
@@ -256,8 +259,12 @@ class JoinedStateBuilder {
       // to enable building analyses like computation of dominators that
       // initialize the state of each basic block differently.
       return {AC.Analysis.typeErasedInitialElement(), AC.InitEnv.fork()};
+
     if (All.size() == 1)
-      return Owned.empty() ? All.front()->fork() : std::move(Owned.front());
+      // Join the environment with itself so that we can discard unwanted
+      // expression state.
+      return {All[0]->Lattice, Environment::join(All[0]->Env, All[0]->Env,
+                                                 AC.Analysis, ExprsToKeep)};
 
     auto Result = join(*All[0], *All[1]);
     for (unsigned I = 2; I < All.size(); ++I)
@@ -311,7 +318,10 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) {
     }
   }
 
-  JoinedStateBuilder Builder(AC);
+  // When performing the join, only retain state for those expressions that are
+  // consumed by this block. This avoids performing joins and potentially
+  // extending the flow condition for expressions that we won't need anyway.
+  JoinedStateBuilder Builder(AC, AC.CFCtx.getExprConsumedByBlock(&Block));
   for (const CFGBlock *Pred : Preds) {
     // Skip if the `Block` is unreachable or control flow cannot get past it.
     if (!Pred || Pred->hasNoReturnElement())
diff --git a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
index 751e86770d5e6dc..740640d00171162 100644
--- a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
@@ -101,6 +101,42 @@ TEST_F(EnvironmentTest, CreateValueRecursiveType) {
   EXPECT_THAT(PV, NotNull());
 }
 
+TEST_F(EnvironmentTest, JoinWithExprsToKeep) {
+  // Code is empty -- we just want access to an `ASTContext`.
+  auto Unit =
+      tooling::buildASTFromCodeWithArgs("", {"-fsyntax-only", "-std=c++11"});
+  auto &Context = Unit->getASTContext();
+
+  OpaqueValueExpr ExprToKeep(SourceLocation(), Context.BoolTy, VK_PRValue);
+  OpaqueValueExpr ExprToDiscard(SourceLocation(), Context.BoolTy, VK_PRValue);
+
+  Environment Env1(DAContext);
+  Env1.setValue(ExprToKeep, Env1.getBoolLiteralValue(true));
+  Env1.setValue(ExprToDiscard, Env1.getBoolLiteralValue(true));
+
+  Environment Env2 = Env1.fork();
+
+  Environment::ValueModel Model;
+
+  // Without `ExprsToKeep`, we retain all expressions.
+  {
+    Environment JoinedEnv = Environment::join(Env1, Env2, Model);
+    EXPECT_EQ(JoinedEnv.getValue(ExprToKeep),
+              &JoinedEnv.getBoolLiteralValue(true));
+    EXPECT_EQ(JoinedEnv.getValue(ExprToDiscard),
+              &JoinedEnv.getBoolLiteralValue(true));
+  }
+
+  // With `ExprsToKeep`, we retain only the requested expressions.
+  {
+    llvm::DenseSet<const Expr *> ExprsToKeep = {&ExprToKeep};
+    Environment JoinedEnv = Environment::join(Env1, Env2, Model, &ExprsToKeep);
+    EXPECT_EQ(JoinedEnv.getValue(ExprToKeep),
+              &JoinedEnv.getBoolLiteralValue(true));
+    EXPECT_EQ(JoinedEnv.getValue(ExprToDiscard), nullptr);
+  }
+}
+
 TEST_F(EnvironmentTest, JoinRecords) {
   using namespace ast_matchers;
 
diff --git a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
index e33bea47137ad73..f92afd8c3d84a08 100644
--- a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
@@ -52,29 +52,48 @@ using ::testing::NotNull;
 using ::testing::Test;
 using ::testing::UnorderedElementsAre;
 
-template <typename AnalysisT>
-llvm::Expected<std::vector<
-    std::optional<DataflowAnalysisState<typename AnalysisT::Lattice>>>>
-runAnalysis(llvm::StringRef Code, AnalysisT (*MakeAnalysis)(ASTContext &)) {
-  std::unique_ptr<ASTUnit> AST =
-      tooling::buildASTFromCodeWithArgs(Code, {"-std=c++11"});
-
-  auto *Func = selectFirst<FunctionDecl>(
-      "func", match(functionDecl(ast_matchers::hasName("target")).bind("func"),
-                    AST->getASTContext()));
-  assert(Func != nullptr);
-
-  auto CFCtx =
-      llvm::cantFail(ControlFlowContext::build(*Func));
+class DataflowAnalysisTest : public Test {
+protected:
+  template <typename AnalysisT>
+  llvm::Expected<std::vector<
+      std::optional<DataflowAnalysisState<typename AnalysisT::Lattice>>>>
+  runAnalysis(llvm::StringRef Code, AnalysisT (*MakeAnalysis)(ASTContext &)) {
+    AST = tooling::buildASTFromCodeWithArgs(Code, {"-std=c++11"});
+
+    auto *Func = selectFirst<FunctionDecl>(
+        "func",
+        match(functionDecl(ast_matchers::hasName("target")).bind("func"),
+              AST->getASTContext()));
+    assert(Func != nullptr);
+
+    CFCtx = std::make_unique<ControlFlowContext>(
+        llvm::cantFail(ControlFlowContext::build(*Func)));
+
+    AnalysisT Analysis = MakeAnalysis(AST->getASTContext());
+    DACtx = std::make_unique<DataflowAnalysisContext>(
+        std::make_unique<WatchedLiteralsSolver>());
+    Environment Env(*DACtx, *Func);
+
+    return runDataflowAnalysis(*CFCtx, Analysis, Env);
+  }
 
-  AnalysisT Analysis = MakeAnalysis(AST->getASTContext());
-  DataflowAnalysisContext DACtx(std::make_unique<WatchedLiteralsSolver>());
-  Environment Env(DACtx);
+  template <typename StateT>
+  const StateT &
+  blockStateForStmt(const std::vector<std::optional<StateT>> &BlockStates,
+                    const Stmt *S) {
+    const CFGBlock *Block = CFCtx->getStmtToBlock().lookup(S);
+    assert(Block != nullptr);
+    const std::optional<StateT> &MaybeState = BlockStates[Block->getBlockID()];
+    assert(MaybeState.has_value());
+    return *MaybeState;
+  };
 
-  return runDataflowAnalysis(CFCtx, Analysis, Env);
-}
+  std::unique_ptr<ASTUnit> AST;
+  std::unique_ptr<ControlFlowContext> CFCtx;
+  std::unique_ptr<DataflowAnalysisContext> DACtx;
+};
 
-TEST(DataflowAnalysisTest, NoopAnalysis) {
+TEST_F(DataflowAnalysisTest, NoopAnalysis) {
   auto BlockStates = llvm::cantFail(
       runAnalysis<NoopAnalysis>("void target() {}", [](ASTContext &C) {
         return NoopAnalysis(C,
@@ -88,7 +107,7 @@ TEST(DataflowAnalysisTest, NoopAnalysis) {
 
 // Basic test that `diagnoseFunction` calls the Diagnoser function for the
 // number of elements expected.
-TEST(DataflowAnalysisTest, DiagnoseFunctionDiagnoserCalledOnEachElement) {
+TEST_F(DataflowAnalysisTest, DiagnoseFunctionDiagnoserCalledOnEachElement) {
   std::string Code = R"(void target() { int x = 0; ++x; })";
   std::unique_ptr<ASTUnit> AST =
       tooling::buildASTFromCodeWithArgs(Code, {"-std=c++11"});
@@ -111,6 +130,99 @@ TEST(DataflowAnalysisTest, DiagnoseFunctionDiagnoserCalledOnEachElement) {
                                    " (Lifetime ends)\n")));
 }
 
+// Tests that check we discard state for expressions correctly.
+using DiscardExprStateTest = DataflowAnalysisTest;
+
+TEST_F(DiscardExprStateTest, WhileStatement) {
+  std::string Code = R"(
+    void foo(int *p);
+    void target(int *p) {
+      while (p != nullptr)
+        foo(p);
+    }
+  )";
+  auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
+      Code, [](ASTContext &C) { return NoopAnalysis(C); }));
+
+  auto *NotEqOp = selectFirst<BinaryOperator>(
+      "op", match(binaryOperator(hasOperatorName("!=")).bind("op"),
+                  AST->getASTContext()));
+  ASSERT_NE(NotEqOp, nullptr);
+
+  auto *CallFoo = selectFirst<CallExpr>(
+      "call", match(callExpr(callee(functionDecl(hasName("foo")))).bind("call"),
+                    AST->getASTContext()));
+  ASSERT_NE(CallFoo, nullptr);
+
+  // In the block that evaluates the expression `p != nullptr`, this expression
+  // is associated with a value.
+  const auto &NotEqOpState = blockStateForStmt(BlockStates, NotEqOp);
+  EXPECT_NE(NotEqOpState.Env.getValue(*NotEqOp), nullptr);
+
+  // In the block that calls `foo(p)`, the value for `p != nullptr` is discarded
+  // because it is not consumed by this block.
+  const auto &CallFooState = blockStateForStmt(BlockStates, CallFoo);
+  EXPECT_EQ(CallFooState.Env.getValue(*NotEqOp), nullptr);
+}
+
+TEST_F(DiscardExprStateTest, BooleanOperator) {
+  std::string Code = R"(
+    bool target(bool b1, bool b2) {
+      return b1 && b2;
+    }
+  )";
+  auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
+      Code, [](ASTContext &C) { return NoopAnalysis(C); }));
+
+  auto *AndOp = selectFirst<BinaryOperator>(
+      "op", match(binaryOperator(hasOperatorName("&&")).bind("op"),
+                  AST->getASTContext()));
+  ASSERT_NE(AndOp, nullptr);
+
+  auto *Return = selectFirst<ReturnStmt>(
+      "return", match(returnStmt().bind("return"), AST->getASTContext()));
+  ASSERT_NE(Return, nullptr);
+
+  // In the block that evaluates the LHS of the `&&` operator, the LHS is
+  // associated with a value, while the right-hand side is not (unsurprisingly,
+  // as it hasn't been evaluated yet).
+  const auto &LHSState = blockStateForStmt(BlockStates, AndOp->getLHS());
+  auto *LHSValue = cast<BoolValue>(LHSState.Env.getValue(*AndOp->getLHS()));
+  ASSERT_NE(LHSValue, nullptr);
+  EXPECT_EQ(LHSState.Env.getValue(*AndOp->getRHS()), nullptr);
+
+  // In the block that evaluates the RHS, the RHS is associated with a
+  // value. The value for the LHS has been discarded as it is not consumed by
+  // this block.
+  const auto &RHSState = blockStateForStmt(BlockStates, AndOp->getRHS());
+  EXPECT_EQ(RHSState.Env.getValue(*AndOp->getLHS()), nullptr);
+  auto *RHSValue = cast<BoolValue>(RHSState.Env.getValue(*AndOp->getRHS()));
+  ASSERT_NE(RHSValue, nullptr);
+
+  // In the block that evaluates the return statement, the expression `b1 && b2`
+  // is associated with a value (and check that it's the right one).
+  // The expressions `b1` and `b2` are _not_ associated with a value in this
+  // block, even though they are consumed by the block, because:
+  // * This block has two prececessor blocks (the one that evaluates `b1` and
+  //   the one that evaluates `b2`).
+  // * `b1` is only associated with a value in the block that evaluates `b1` but
+  //   not the block that evalutes `b2`, so the join operation discards the
+  //   value for `b1`.
+  // * `b2` is only associated with a value in the block that evaluates `b2` but
+  //   not the block that evaluates `b1`, the the join operation discards the
+  //   value for `b2`.
+  // Nevertheless, the analysis generates the correct formula for `b1 && b2`
+  // because the transfer function for the `&&` operator retrieves the values
+  // for its operands from the environments for the blocks that compute the
+  // operands, rather than from the environment for the block that contains the
+  // `&&`.
+  const auto &ReturnState = blockStateForStmt(BlockStates, Return);
+  EXPECT_EQ(ReturnState.Env.getValue(*AndOp->getLHS()), nullptr);
+  EXPECT_EQ(ReturnState.Env.getValue(*AndOp->getRHS()), nullptr);
+  EXPECT_EQ(ReturnState.Env.getValue(*AndOp),
+            &ReturnState.Env.makeAnd(*LHSValue, *RHSValue));
+}
+
 struct NonConvergingLattice {
   int State;
 
@@ -142,7 +254,7 @@ class NonConvergingAnalysis
   }
 };
 
-TEST(DataflowAnalysisTest, NonConvergingAnalysis) {
+TEST_F(DataflowAnalysisTest, NonConvergingAnalysis) {
   std::string Code = R"(
     void target() {
       while(true) {}
@@ -163,7 +275,7 @@ TEST(DataflowAnalysisTest, NonConvergingAnalysis) {
 // An earlier version crashed for this condition (for boolean-typed lvalues), so
 // this test only verifies that the analysis runs successfully, without
 // examining any details of the results.
-TEST(DataflowAnalysisTest, JoinBoolLValues) {
+TEST_F(DataflowAnalysisTest, JoinBoolLValues) {
   std::string Code = R"(
     void target() {
       for (int x = 1; x; x = 0)



More information about the cfe-commits mailing list