[PATCH] D103314: [Analyzer][solver] Simplify existing constraints when a new constraint is added

Wed Jun 2 07:06:39 PDT 2021

martong marked 3 inline comments as done.
martong added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1559
+  // absolute minimum.
+  LLVM_NODISCARD ProgramStateRef simplifyEquivalenceClass(
+      ProgramStateRef State, EquivalenceClass Class, SymbolSet ClassMembers) {
----------------
vsavchenko wrote:
> Maybe it should be a `simplify` method of the class itself?
Yeah, makes sense.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1568-1573
+        EquivalenceClass ClassOfSimplifiedSym =
+            EquivalenceClass::find(State, SimplifiedMemberSym);
+        // We are about to add the newly simplified symbol to the existing
+        // equivalence class, but they are known to be non-equal. This is a
+        // contradiction.
+        if (DisequalClasses.contains(ClassOfSimplifiedSym))
----------------
vsavchenko wrote:
> I think we can add a method `isDisequalTo` or just use `areEqual` in a this way:
> are equal?
>   [Yes] -> nothing to do here
>   [No]  -> return nullptr
>   [Don't know] -> merge
Good point, I've added a new overload to the static `areEqual` and added a method `isEqualTo` that uses `areEqual`.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1566-1569
+    ConstraintRangeTy Constraints = State->get<ConstraintRange>();
+    for (std::pair<EquivalenceClass, RangeSet> ClassConstraint : Constraints) {
+      EquivalenceClass Class = ClassConstraint.first;
+      SymbolSet ClassMembers = Class.getClassMembers(State);
----------------
vsavchenko wrote:
> martong wrote:
> > vsavchenko wrote:
> > > You don't actually use constraints here, so (let me write it in python) instead of:
> > > ```
> > > [update(classMap[class]) for class, constraint in constraints.items()]
> > > ```
> > > you can use
> > > ```
> > > [update(members) for class, members in classMap.items()]
> > > ```
> > Actually, trivial equivalence classes (those that have only one symbol member) are not stored in the State. Thus, we must skim through the constraints as well in order to be able to simplify symbols in the constraints.
> > In short, we have to iterate both collections.
> > 
> Ah, I see.  Then I would say that your previous solution is more readable (if we keep `simplify`, of course).
My previous solution might be more readable, though, that's not working.
Actually, I think I failed to explain properly why do we have to iterate both collections.

* We have to iterate the ConstraintMap because trivial constraints are not stored in the State but we want to simplify symbols in the constraints. So, if we were to iterate over only the ClassMap then the simplest test-case would fail:
```
int test_rhs_further_constrained(int x, int y) {
  if (x + y != 0)
    return 0;
  if (y != 0)
    return 0;
  clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}}  
  clang_analyzer_eval(y == 0);     // expected-warning{{TRUE}}     FAIL
  return 0;
}
```

* We have to iterate the ClassMap in order to update all equivalence classes that we store in the State. Consider the example you brought up before:
```
void test_equivalence_classes_are_updated(int a, int b, int c, int d) {
  if (a + b != c)
    return;
  if (a != d)
    return;
  if (b != 0)
    return;
  // Keep the symbols and the constraints! alive.
  (void)(a * b * c * d);
  clang_analyzer_eval(c == d); // expected-warning{{TRUE}}
  return;
}
```
Before we start to simulate `b==0`,  we have only these equivalence classes in the State: E1{`a+b`, `c`} and E2{`a`, `d`}. And we have these constraints: SymExpr(`a+b==c`) -> out-of [0, 0], SymExpr(`a==d`) -> out-of [0, 0].
Now, when we evaluate `b==0`in setConstraint when iterating the ConstraintMap then SymExpr(`a+b==c`) becomes SymExpr(`a==c`). But the equality classes are not updated. And we can update them if we scan through the ClassMap. 

Another alternative solution could be to re-trigger the `track` mechanism when we iterate over the ConstraintMap, but `track` seemed to be an exclusive interface towards the higher abstraction Range**d**ConstraintManager. On the other hand, reusing the `track` mechanism could result better performance than doing another iteration on the ClassMap. Do you think it would be a better approach? And how could we reuse the `track` mechanism without getting confused with the `Adjustment` stuff?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103314/new/

https://reviews.llvm.org/D103314