[PATCH] D103440: [WIP][analyzer] Introduce range-based reasoning for addition operator

Fri Jun 4 01:18:42 PDT 2021

vsavchenko added inline comments.

================
Comment at: clang/test/Analysis/constant-folding.c:282
+
+    if (c >= -20 && d >= -40) {
+      clang_analyzer_eval((c + d) < -1); // expected-warning{{TRUE}}
----------------
manas wrote:
> vsavchenko wrote:
> > Great, it's good to check negative numbers.
> > I have one note here.  The fact that we don't have a testing framework and use this approach of inspecting the actual analysis has an interesting implication.
> > 
> > ```
> > if (a == 10) { // here we split path into 2 paths: one where a == 10, and one where a != 10;
> >                // one goes inside of if, one doesn't
> >   . . .
> > }
> > if (a >= 5) { // here we split into 3 paths: a == 10, a < 5, and a in [5, 9] and [11, INT_MAX] (aka a >= 5 and a != 10).
> >               // 1 path was infeasible: a == 10 and a < 5
> >               // Two of these paths go inside of the if, one doesn't
> >   . . .
> >   clang_analyzer_eval(a == 10); // it will produce two results: TRUE and FALSE
> > }
> > clang_analyzer_eval(a == 10); // it will produce three results: TRUE, FALSE, and FALSE
> > ```
> > 
> > We don't want to test how path splitting works in these particular tests (they are about solver and constant folding after all), that's why we try to have only one path going through each `clang_analyzer_eval(...)`
> > 
> > In this example, you still have residual paths where `c != INT_MIN`, `c == INT_MIN and d != INT_MIN`, and `c == INT_MIN and d == INT_MIN`.
> I should add tests for these paths as well so that these can be checked. For further cases, I will enforce single path evaluation in test cases (which will make it easier to handle here).
In these tests we should really avoid having multiple feasible paths going through one `clang_analyzer_eval`.  We can add `// expected-warning` to it multiple times, but it's harder to understand as a reader of these tests because you need to split all the paths in your mind just like the analyzer does.  It can be useful when testing other features of the analyzer, but here it's not essential.

And I don't think that you fully understood me.  When multiple paths go through `clang_analyzer_eval`, each of them produces a separate "warning".
That's why if you want to account for residual paths, you shouldn't add more `if` statements, but add something like:
```
clang_analyzer_eval(a == 10); // expected-warning{{TRUE}}
                              // expected-warning at -1{{TRUE}}
                              // expected-warning at -2{{FALSE}}
```

So, please, instead of writing tests like this, rewrite your test, so that we don't have residual paths.

================
Comment at: clang/test/Analysis/constant-folding.c:304-305
+  if (c > 0 && d > 0) {
+    clang_analyzer_eval((c + d) > 1); // expected-warning{{TRUE}}
+    clang_analyzer_eval((c + d) < -1); // expected-warning{{TRUE}}
+  }
----------------
manas wrote:
> vsavchenko wrote:
> > How can these two statements be TRUE at the same time?
> Right, my bad. They both should be `UNKNOWN`.
> 
> As, `c` and `d` are signed 32-bit positive integers, hence their respective values will be in `[1, INT_MAX]`.
> 
> When `c == d == 1` then `c + d == 2`, and when `c == d == INT_MAX` then `c + d == -2` (overflow). Only possible values which `c + d` **cannot** attain are `{-1, 0, 1}`.
> 
> As a simple proof:
> 
> Dividing this range into 
>   R1 = [1, INT_MAX/2] and R2 = [(INT_MAX/2) + 1, INT_MAX]
> 
> Now, `c + d` will have 4 combination of ranges to be solved:
> - `R1 + R1` : this will never overflow as the maximum value it can attain will be `INT_MAX - 1` (when `c == d  == INT_MAX/2`)
> - while in `R1 + R2`, `R2 + R1`, and `R2 + R2` expressions will overflow, leading value of `c + d` from `INT_MIN` to `-2`, except for case `INT_MAX/2 + (INT_MAX/2 + 1)`, where `c + d == INT_MAX`.
> 
> Hence, only possible values which can never be attained by `c + d` will be `{-1, 0, 1}`. So, the range for our purposes will be `[INT_MIN, -2] U [2, INT_MAX]`.
> 
> I think I should write tests for `c + d != {-1, 0, 1}` which will make more sense here.
I see, that's a good test!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103440/new/

https://reviews.llvm.org/D103440