[PATCH] D103440: [WIP][analyzer] Introduce range-based reasoning for addition operator

Mon Jun 21 01:22:10 PDT 2021

vsavchenko added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1395
+
+  if (Min > Max) {
+    // This implies that an overflow has occured as either boundary would have
----------------
I commented on this part previously, you shouldn't get fixated on `Min > Max` because `Max >= Min` doesn't imply that overflow didn't happen (less important) and that the range `[Min, Max]` is correct (more important).  Recall one of the examples that I had in that email.

There is an interesting pattern about results that we can use:
* no overflows happened -> `[Min, Max]` (should be always true)
* 1 overflow happened -> we need to invert the range, but there are two different cases:
    * `Min > Max`, perfect, that's what we expected -> `[Tmin, Max] ∪ [Min, Tmax]`
    * `Max >= Min`, we overflowed and wrapped the whole range -> `[Tmin, Tmax]`
* 2 overflows happened on one side -> `[Min, Max]`
* 2 overflows happened on both sides -> `[Tmin, Tmax]`

You need to think of this problem in terms of what really happens and not in terms of combating with assertions.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1411
+    if ((LHS.From() < 0 && RHS.From() < 0)) {
+      llvm::APSInt CrucialPoint = Tmin - LHS.From();
+      if (RHS.Includes(CrucialPoint)) {
----------------
What if `LHS.From()` is `Tmin` for signed `T`?
What if `T` is unsigned?  Does `Tmin - LHS.From()` (aka `0 - LHS.From()`) make sense?

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1448-1451
+  if ((LHS.From() > 0 && RHS.From() > 0 && Min < 0) ||
+      (LHS.From() < 0 && RHS.From() < 0 && Min > 0) ||
+      (LHS.To() > 0 && RHS.To() > 0 && Max < 0) ||
+      (LHS.To() < 0 && RHS.To() < 0 && Max > 0)) {
----------------
Speaking of unsigned `T`, does it work for unsigned overflows?  Do we have tests for that?

================
Comment at: clang/test/Analysis/constant-folding.c:280-281
+  if (c < 0 && c != INT_MIN && d < 0) {
+    clang_analyzer_eval((c + d) == -1); // expected-warning{{FALSE}}
+    clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}
+    clang_analyzer_eval((c + d) <= -2); // expected-warning{{UNKNOWN}}
----------------
manas wrote:
> When I pull these cases out to a separate function (say testAdditionRules2) in constant-folding.c then algorithm is able to reason about the range correctly, but inside testAdditionRules, it is unable to figure stuff out. Does it have something to do with constraints being propagated which we discussed below?
> 
> @vsavchenko wrote: 
> > I have one note here.  The fact that we don't have a testing framework and use this approach of inspecting the actual analysis has an interesting implication.
> > 
> > ```
> > if (a == 10) { // here we split path into 2 paths: one where a == 10, and one where a != 10;
> >                // one goes inside of if, one doesn't
> >   . . .
> > }
> > if (a >= 5) { // here we split into 3 paths: a == 10, a < 5, and a in [5, 9] and [11, INT_MAX] (aka a >= 5 and a != 10).
> >               // 1 path was infeasible: a == 10 and a < 5
> >               // Two of these paths go inside of the if, one doesn't
> >   . . .
> >   clang_analyzer_eval(a == 10); // it will produce two results: TRUE and FALSE
> > }
> > clang_analyzer_eval(a == 10); // it will produce three results: TRUE, FALSE, and FALSE
> > ```
> > 
> > We don't want to test how path splitting works in these particular tests (they are about solver and constant folding after all), that's why we try to have only one path going through each `clang_analyzer_eval(...)`
> > 
> 
> 
It might be that or it might be something different.  Just by looking at this example, the previous `if` statement shouldn't add more paths that go inside of this `if`, so it shouldn't be the root cause.
Whenever you encounter problems and you want to tell other people, **please, provide more detail**.  Did you notice it when running the test case?  What was the output?  What did you try?  How did you extract it into a separate function?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103440/new/

https://reviews.llvm.org/D103440