[PATCH] D141709: [clang][dataflow] Fix bug in joining bool values.

Yitzhak Mandelbaum via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 13 10:38:56 PST 2023


ymandel created this revision.
ymandel added reviewers: sgatev, xazax.hun, gribozavr2.
Herald added subscribers: martong, rnkovacs.
Herald added a reviewer: NoQ.
Herald added a project: All.
ymandel requested review of this revision.
Herald added a project: clang.

Currently, the code assumes that all boolean-typed values are an instance of
`BoolValue` (or its subclasses). Yet, lvalues violate this assumption. This
patch drops the assumption and strengthens the check to confirm the shape of
both values being joined.

The patch also notes as FIXMES a number of problems discovered fixing this bug.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D141709

Files:
  clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
  clang/lib/Analysis/FlowSensitive/Transfer.cpp
  clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp


Index: clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
===================================================================
--- clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+++ clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
@@ -129,6 +129,37 @@
             "maximum number of iterations reached");
 }
 
+// Regression test for joins of bool-typed lvalue expressions. The first loop
+// results in two passes through the code that follows. Each pass results in a
+// different `ReferenceValue` for the pointee of `v`. Then, the second loop
+// causes a join at the loop head where the two environments map expresssion
+// `*v` to different `ReferenceValue`s.
+//
+// An earlier version crashed for this condition (for boolean-typed lvalues), so
+// this test only verifies that the analysis runs successfully, without
+// examining any details of the results.
+TEST(DataflowAnalysisTest, JoinBoolLValues) {
+  std::string Code = R"(
+    void target() {
+      for (int x = 1; x; x = 0)
+        (void)x;
+      bool *v;
+      if (*v)
+        for (int x = 1; x; x = 0)
+          (void)x;
+    }
+  )";
+  ASSERT_THAT_ERROR(
+      runAnalysis<NoopAnalysis>(Code,
+                                [](ASTContext &C) {
+                                  auto EnableBuiltIns = DataflowAnalysisOptions{
+                                      DataflowAnalysisContext::Options{}};
+                                  return NoopAnalysis(C, EnableBuiltIns);
+                                })
+          .takeError(),
+      llvm::Succeeded());
+}
+
 struct FunctionCallLattice {
   using FunctionSet = llvm::SmallSet<std::string, 8>;
   FunctionSet CalledFunctions;
Index: clang/lib/Analysis/FlowSensitive/Transfer.cpp
===================================================================
--- clang/lib/Analysis/FlowSensitive/Transfer.cpp
+++ clang/lib/Analysis/FlowSensitive/Transfer.cpp
@@ -110,6 +110,8 @@
 // Unpacks the value (if any) associated with `E` and updates `E` to the new
 // value, if any unpacking occured.
 static Value *maybeUnpackLValueExpr(const Expr &E, Environment &Env) {
+  // FIXME: this is too flexible: it _allows_ a reference, while it should
+  // _require_ one, since lvalues should always be wrapped in `ReferenceValue`.
   auto *Loc = Env.getStorageLocation(E, SkipPast::Reference);
   if (Loc == nullptr)
     return nullptr;
Index: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
===================================================================
--- clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -93,18 +93,20 @@
                                   Environment::ValueModel &Model) {
   // Join distinct boolean values preserving information about the constraints
   // in the respective path conditions.
-  if (Type->isBooleanType()) {
-    // FIXME: The type check above is a workaround and should be unnecessary.
-    // However, right now we can end up with BoolValue's in integer-typed
-    // variables due to our incorrect handling of boolean-to-integer casts (we
-    // just propagate the BoolValue to the result of the cast). For example:
+  if (isa<BoolValue>(&Val1) && isa<BoolValue>(&Val2)) {
+    // FIXME: Checking both values should be unnecessary, since they should have
+    // a consistent shape.  However, right now we can end up with BoolValue's in
+    // integer-typed variables due to our incorrect handling of
+    // boolean-to-integer casts (we just propagate the BoolValue to the result
+    // of the cast). So, a join can encounter an integer in one branch but a
+    // bool in the other.
+    // For example:
+    // ```
     // std::optional<bool> o;
-    //
-    //
     // int x;
-    // if (o.has_value()) {
+    // if (o.has_value())
     //   x = o.value();
-    // }
+    // ```
     auto *Expr1 = cast<BoolValue>(&Val1);
     auto *Expr2 = cast<BoolValue>(&Val2);
     auto &MergedVal = MergedEnv.makeAtomicBoolValue();
@@ -118,6 +120,8 @@
 
   // FIXME: Consider destroying `MergedValue` immediately if `ValueModel::merge`
   // returns false to avoid storing unneeded values in `DACtx`.
+  // FIXME: Creating the value based on the type alone creates misshapen values
+  // for lvalues, since the type does not reflect the need for `ReferenceValue`.
   if (Value *MergedVal = MergedEnv.createValue(Type))
     if (Model.merge(Type, Val1, Env1, Val2, Env2, *MergedVal, MergedEnv))
       return MergedVal;


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D141709.489063.patch
Type: text/x-patch
Size: 4541 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230113/a9eb0990/attachment-0001.bin>


More information about the cfe-commits mailing list