[PATCH] D102696: [Analyzer] Find constraints that are directly attached to a BinOp

Gabor Marton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 27 06:47:50 PDT 2021


martong added a comment.

Alright, I see your point. I agree that solving the problem of "$a +$b +$c
constrained and then $a + $c got constrained" requires canonicalization.
However, canonicalization seems to be not trivial to implement. And there are
some other easier cases that I think we could (should) solve before starting to
implement canonicalization.

Parent map is just an optimization. The functionality should be working even if
we apply brute-force to go through all existing constraints.

Because of the above, I suggest the following steps. These steps could be
implemented as individual patches that depend on each other and once all of
them accepted then we could merge them in one lock-step.

1. Update `setConstraint` to simplify existing constraints (and adding the

simplified constraint) when a new constraint is added. In this step we'd just
simply iterate over all existing constraints and would try to simplfy them with
`simplifySVal`. This would solve the simplest cases where we have two symbols
in the tree. E.g. these cases would pass:

  int test_rhs_further_constrained(int x, int y) {
    if (x + y != 0)
      return 0;
    if (y != 0)
      return 0;
    clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}}
    clang_analyzer_eval(y == 0);     // expected-warning{{TRUE}}
    return 0;
  }
  
  int test_lhs_and_rhs_further_constrained(int x, int y) {
    if (x % y != 1)
      return 0;
    if (x != 1)
      return 0;
    if (y != 2)
      return 0;
    clang_analyzer_eval(x % y == 1); // expected-warning{{TRUE}}
    clang_analyzer_eval(y == 2);     // expected-warning{{TRUE}}
    return 0;
  }



2. Add the capability to simplify more complex constraints, where there are 3

symbols in the tree. In this change I suggest the extension and overhaul of
`simplifySVal`. This change would make the following cases to pass (notice the
3rd check in each test-case).

  int test_left_tree_constrained(int x, int y, int z) {
    if (x + y + z != 0)
      return 0;
    if (x + y != 0)
      return 0;
    clang_analyzer_eval(x + y + z == 0); // expected-warning{{TRUE}}
    clang_analyzer_eval(x + y == 0);     // expected-warning{{TRUE}}
    clang_analyzer_eval(z == 0);         // expected-warning{{TRUE}}
    x = y = z = 1;
    return 0;
  }
  
  int test_right_tree_constrained(int x, int y, int z) {
    if (x + (y + z) != 0)
      return 0;
    if (y + z != 0)
      return 0;
    clang_analyzer_eval(x + y + z == 0); // expected-warning{{TRUE}}
    clang_analyzer_eval(y + z == 0);     // expected-warning{{TRUE}}
    clang_analyzer_eval(x == 0);         // expected-warning{{TRUE}}
    return 0;
  }



3. Add canonical trees to the solver. The way we should do this is to build

"flat" sub-trees of associative operands. E.g. `b + a + c` becomes `+(a, b, c)`,
i.e. a tree with a root and 3 children [1]. The ordering of the children
could be done by their addresses (`Stmt *`) and we could simply store them in an
array. Probably we'll need a new mapping: SymbolRef -> CanonicalTree.

When a new constraint is added then we build up its canonical tree (if not
existing yet) Then, we check each existing constraints' canonical tree to check
whether the newly constrained expression is a sub-expression of that. (The
check could be a simple recursively descending search in the children arrays of
the nodes.) If it is a sub-expression then we simplify that with a customized
canonical tree visitor that will call evalBinOp appropriately.

In this step we should handle only `+` and `*`

This would pass the below tests:

  int test_left_tree_constrained_commutative(int x, int y, int z) {
    if (x + y + z != 0)
      return 0;
    if (y + x != 0)
      return 0;
    clang_analyzer_eval(x + y + z == 0); // expected-warning{{TRUE}}
    clang_analyzer_eval(x + y == 0);     // expected-warning{{TRUE}}
    clang_analyzer_eval(z == 0);         // expected-warning{{TRUE}}
    return 0;
  }
  
  int test_associative(int x, int y, int z) {
    if (x + y + z != 0)
      return 0;
    if (x + z != 0)
      return 0;
    clang_analyzer_printState();
    clang_analyzer_eval(x + y + z == 0); // expected-warning{{TRUE}}
    clang_analyzer_eval(x + z == 0);     // expected-warning{{TRUE}}
    clang_analyzer_eval(y == 0);         // expected-warning{{TRUE}}
    return 0;



4. Extend 3) with `-` and `/`. `lhs - rhs` becomes a `lhs + (-rhs)` and `lhs /rhs`

is substituted wit `lhs * (1/rhs)`.

5. There are still some expressions that the solver will not be able to reason about

(e.g. `<<`), but it may happen that we store a constraint for them. In these
cases it is a good idea to search for the expression in the constraint map, but
without constant folding. This functionality might be solved by step 2). If
not, then https://reviews.llvm.org/D102696 could be a good alternative.

6. Do performance measurements and enhance perhaps with parent maps.

I am planning to upload a patch for steps 1) and 2) very soon.

[1] Hans Vangheluwe et al, An algorithm to implement a canonical representation
of algebraic expressions and equations in AToM3


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102696/new/

https://reviews.llvm.org/D102696



More information about the cfe-commits mailing list