[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)

Balazs Benics via cfe-commits cfe-commits at lists.llvm.org
Sat Nov 4 07:15:39 PDT 2023


https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/71284

The idea is that if we see a `X RELOP Y` being constrained to a RangeSet `S`, then check the eqclasses of X and Y respectively and for `X' RELOP Y'` SymSymExprs and try to infer their ranges.
If there is no contradiction with any of the equivalent alternatives, then intersecting all these RangeSets should never be empty - aka. there should be a value satisfying the constraints we have.

It costs around `|eqclass(X)| + |eqclass(y)|`.

The approach has its limitations, as demonstrated by `gh_62215_contradicting_nested_right_equivalent`, where we would need to apply the same logic, but on a sub-expression of a direct operand.

Before the patch, line 90, 100, and 112 would be reachable; and become unreachable after this. Line 127 will remain still reachable, but keep in mind that when cross-checking with Z3 (aka. Z3 refutation), then all 4 reports would be eliminated.

The idea comes from
https://github.com/llvm/llvm-project/issues/62215#issuecomment-1792878474

Fixes #62215

>From 92ece501b340c3a2a52b5a4614ddb70bb3e35c93 Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Sat, 4 Nov 2023 13:44:28 +0100
Subject: [PATCH] [analyzer][solver] On SymSym RelOps, check EQClass members
 for contradictions

The idea is that if we see a `X RELOP Y` being constrained to a RangeSet `S`,
then check the eqclasses of X and Y respectively and for `X' RELOP Y'`
SymSymExprs and try to infer their ranges.
If there is no contradiction with any of the equivalent alternatives,
then intersecting all these RangeSets should never be empty - aka. there
should be a value satisfying the constraints we have.

It costs around `|eqclass(X)| + |eqclass(y)|`.

The approach has its limitations, as demonstrated by
`gh_62215_contradicting_nested_right_equivalent`, where we would need to
apply the same logic, but on a sub-expression of a direct operand.

Before the patch, line 90, 100, and 112 would be reachable; and become
unreachable after this. Line 127 will remain still reachable, but keep
in mind that when cross-checking with Z3 (aka. Z3 refutation), then all
4 reports would be eliminated.

The idea comes from
https://github.com/llvm/llvm-project/issues/62215#issuecomment-1792878474

Fixes #62215
---
 .../Core/RangeConstraintManager.cpp           | 53 +++++++++++++++++++
 clang/test/Analysis/constraint-assignor.c     | 48 +++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
index 5de99384449a4c8..d631369e895d3a9 100644
--- a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
@@ -2067,6 +2067,12 @@ class ConstraintAssignor : public ConstraintAssignorBase<ConstraintAssignor> {
     return Assignor.assign(CoS, NewConstraint);
   }
 
+  /// Check if using an equivalent operand alternative would lead to
+  /// contradiction.
+  /// If a contradiction is witnessed, returns false; otherwise returns true.
+  bool handleEquivalentAlternativeSymOperands(const SymSymExpr *SymSym,
+                                              RangeSet Constraint);
+
   /// Handle expressions like: a % b != 0.
   template <typename SymT>
   bool handleRemainderOp(const SymT *Sym, RangeSet Constraint) {
@@ -2218,11 +2224,58 @@ bool ConstraintAssignor::assignSymExprToConst(const SymExpr *Sym,
   return true;
 }
 
+bool ConstraintAssignor::handleEquivalentAlternativeSymOperands(
+    const SymSymExpr *SymSym, RangeSet Constraint) {
+  SymbolRef LHS = SymSym->getLHS();
+  SymbolRef RHS = SymSym->getRHS();
+  EquivalenceClass LHSClass = EquivalenceClass::find(State, LHS);
+  EquivalenceClass RHSClass = EquivalenceClass::find(State, RHS);
+  SymbolSet SymbolsEqWithLHS = LHSClass.getClassMembers(State);
+  SymbolSet SymbolsEqWithRHS = RHSClass.getClassMembers(State);
+  llvm::SmallVector<SymSymExpr, 10> AlternativeSymSyms;
+
+  // Gather left alternatives.
+  for (SymbolRef AlternativeLHS : SymbolsEqWithLHS) {
+    if (AlternativeLHS == LHS)
+      continue;
+    AlternativeSymSyms.emplace_back(AlternativeLHS, SymSym->getOpcode(), RHS,
+                                    SymSym->getType());
+  }
+
+  // Gather right alternatives.
+  for (SymbolRef AlternativeRHS : SymbolsEqWithRHS) {
+    if (AlternativeRHS == RHS)
+      continue;
+    AlternativeSymSyms.emplace_back(LHS, SymSym->getOpcode(), AlternativeRHS,
+                                    SymSym->getType());
+  }
+
+  // Crosscheck the inferred ranges.
+  for (SymSymExpr AltSymSym : AlternativeSymSyms) {
+    RangeSet AltSymSymConstrant =
+        SymbolicRangeInferrer::inferRange(RangeFactory, State, &AltSymSym);
+    Constraint = intersect(RangeFactory, Constraint, AltSymSymConstrant);
+
+    // Check if we witnessed a contradiction with the equivalent alternative
+    // operand.
+    if (Constraint.isEmpty()) {
+      State = nullptr;
+      return false;
+    }
+  }
+  return true;
+}
+
 bool ConstraintAssignor::assignSymSymExprToRangeSet(const SymSymExpr *Sym,
                                                     RangeSet Constraint) {
   if (!handleRemainderOp(Sym, Constraint))
     return false;
 
+  if (const auto *SymSym = dyn_cast<SymSymExpr>(Sym);
+      SymSym && !handleEquivalentAlternativeSymOperands(SymSym, Constraint)) {
+    return false;
+  }
+
   std::optional<bool> ConstraintAsBool = interpreteAsBool(Constraint);
 
   if (!ConstraintAsBool)
diff --git a/clang/test/Analysis/constraint-assignor.c b/clang/test/Analysis/constraint-assignor.c
index 8210e576c98bd21..d5078b406e12601 100644
--- a/clang/test/Analysis/constraint-assignor.c
+++ b/clang/test/Analysis/constraint-assignor.c
@@ -82,3 +82,51 @@ void remainder_with_adjustment_of_composit_lhs(int x, int y) {
   clang_analyzer_eval(x + y != -1);    // expected-warning{{TRUE}}
   (void)(x * y); // keep the constraints alive.
 }
+
+void gh_62215(int x, int y, int z) {
+  if (x != y) return; // x == y
+  if (z <= x) return; // z > x
+  if (z >= y) return; // z < y
+  clang_analyzer_warnIfReached(); // no-warning: This should be dead code.
+  (void)(x + y + z); // keep the constraints alive.
+}
+
+void gh_62215_contradicting_right_equivalent(int x, int y, int z) {
+  if (x == y && z > x) {
+    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
+
+    // `z < y` should mean the same thing as `z < x`, which would contradict with `z > x`
+    if (z < y) {
+      clang_analyzer_warnIfReached(); // no-warning: dead code
+    }
+  }
+  (void)(x + y + z); // keep the constraints alive.
+}
+
+void gh_62215_contradicting_left_equivalent(int x, int y, int z) {
+  if (x == y && z > x) {
+    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
+
+    // `y > z` should mean the same thing as `x > z`, which would contradict with `z > x`
+    if (y > z) {
+      clang_analyzer_warnIfReached(); // no-warning
+    }
+  }
+  (void)(x + y + z); // keep the constraints alive.
+}
+
+void gh_62215_contradicting_nested_right_equivalent(int x, int y, int z) {
+  if (y > 1 && y < 10) { // y: [2,9]
+    if (x == y && z > x) {
+      clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
+
+      // `z < (y - 1)` should mean the same thing as `z < (x - 1)`, which
+      // should contradict with `z > x` (assuming x,y: [2,9])
+      if (z < (y - 1)) {
+        // FIXME: This should be dead code.
+        clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}} Z3 crosscheck eliminate this btw
+      }
+    }
+  }
+  (void)(x + y + z); // keep the constraints alive.
+}



More information about the cfe-commits mailing list