[PATCH] D103440: [WIP][analyzer] Introduce range-based reasoning for addition operator

Manas Gupta via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 21 21:17:52 PDT 2021


manas added inline comments.


================
Comment at: clang/test/Analysis/constant-folding.c:280-281
+  if (c < 0 && c != INT_MIN && d < 0) {
+    clang_analyzer_eval((c + d) == -1); // expected-warning{{FALSE}}
+    clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}
+    clang_analyzer_eval((c + d) <= -2); // expected-warning{{UNKNOWN}}
----------------
vsavchenko wrote:
> manas wrote:
> > vsavchenko wrote:
> > > manas wrote:
> > > > When I pull these cases out to a separate function (say testAdditionRules2) in constant-folding.c then algorithm is able to reason about the range correctly, but inside testAdditionRules, it is unable to figure stuff out. Does it have something to do with constraints being propagated which we discussed below?
> > > > 
> > > > @vsavchenko wrote: 
> > > > > I have one note here.  The fact that we don't have a testing framework and use this approach of inspecting the actual analysis has an interesting implication.
> > > > > 
> > > > > ```
> > > > > if (a == 10) { // here we split path into 2 paths: one where a == 10, and one where a != 10;
> > > > >                // one goes inside of if, one doesn't
> > > > >   . . .
> > > > > }
> > > > > if (a >= 5) { // here we split into 3 paths: a == 10, a < 5, and a in [5, 9] and [11, INT_MAX] (aka a >= 5 and a != 10).
> > > > >               // 1 path was infeasible: a == 10 and a < 5
> > > > >               // Two of these paths go inside of the if, one doesn't
> > > > >   . . .
> > > > >   clang_analyzer_eval(a == 10); // it will produce two results: TRUE and FALSE
> > > > > }
> > > > > clang_analyzer_eval(a == 10); // it will produce three results: TRUE, FALSE, and FALSE
> > > > > ```
> > > > > 
> > > > > We don't want to test how path splitting works in these particular tests (they are about solver and constant folding after all), that's why we try to have only one path going through each `clang_analyzer_eval(...)`
> > > > > 
> > > > 
> > > > 
> > > It might be that or it might be something different.  Just by looking at this example, the previous `if` statement shouldn't add more paths that go inside of this `if`, so it shouldn't be the root cause.
> > > Whenever you encounter problems and you want to tell other people, **please, provide more detail**.  Did you notice it when running the test case?  What was the output?  What did you try?  How did you extract it into a separate function?
> > I put a new test function in `constant-folding.c` as:
> > 
> > ```
> > void testAdditionRulesNew(int c, int d) {
> >   if (c < 0 && c != INT_MIN && d < 0) {
> >     clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}
> >   }
> > }
> > ```
> > I tested this specific function as:
> > 
> >   ./build/bin/clang -cc1 -analyze -analyzer-checker=core,debug.ExprInspection -analyze-function=testAdditionRulesNew constant-folding.c
> > 
> > And I got the following output:
> > 
> >   ../clang/test/Analysis/constant-folding.c:338:5: warning: FALSE [debug.ExprInspection]
> >     clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}
> > 
> > which is correct.
> > 
> > But when I ran the same test inside `testAdditionRules`, using:
> >     ./build/bin/clang -cc1 -analyze -analyzer-checker=core,debug.ExprInspection -analyze-function=testAdditionRules constant-folding.c
> > then I got:
> > 
> >   ../clang/test/Analysis/constant-folding.c:281:5: warning: FALSE [debug.ExprInspection]
> >     clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}                                                                                            
> >     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                          
> >   ../clang/test/Analysis/constant-folding.c:281:5: warning: TRUE [debug.ExprInspection]
> >     clang_analyzer_eval((c + d) == 0); // expected-warning{{FALSE}}                                                                                            
> >     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                      
> > 
> > Here, `c = [INT_MIN + 1, -1]` and `d = [INT_MIN, 0]`, so `c + d = [INT_MIN, -2] U [1, INT_MAX]`. So `c + d == 0` should be false. But in latter case, it is reasoning `c + d == 0` to be `UNKNOWN`.
> > 
> > Also, the error arises in `c + d == 0` case only and not in `c + d == -1` case. I mistakenly highlighted that case while commenting.
> Hmm, I don't know what can be the reason.
> 
> There are three reasonable approaches:
> 1. Straightforward: use debugger.  Find a place when we ask the solver about this assumption and dig in.
> 2. Print-based: use `clang_analyzer_printState` inside of this if.  Among other things it will print you all constraints that we know about, you can check if they match your expectations.  Also, if it is indeed because of residual paths, you will get another print from residual path.
> 3. Reduction: add  the whole `testAdditionRules` into `testAdditionRulesNew` and start removing parts of it.  At some point problem should disappear (based on your previous observations).  That "minimal" reproducer can give you insight into what's going on.
> 
> These are not mutually exclusive, so you can try them in combination.
I tried print the ProgramState as in both the cases like:

  if (c < 0 && c != INT_MIN && d < 0) {
    clang_analyzer_printState();
    clang_analyzer_eval(...);
    ...
  }

1. During `testAdditionRulesNew`, when I added `clang_analyzer_printState` just before doing any `clang_analyzer_eval` calls, I got the following constraints (these are as expected):

```  
  "constraints": [                                                                                                                                             
    { "symbol": "reg_$0<int c>", "range": "{ [-2147483647, -1] }" },                                                                                           
    { "symbol": "reg_$1<int d>", "range": "{ [-2147483648, -1] }" }                                                                                            
  ],
```

2. In `testAdditionRules`, adding `printState` before any evaluation led to:

```
  "constraints": [                                                                                                                                             
    { "symbol": "reg_$2<int c>", "range": "{ [-2147483647, -1] }" },                                                                                           
    { "symbol": "reg_$3<int d>", "range": "{ [-2147483648, -1] }" },                                                                                           
    { "symbol": "(reg_$2<int c>) + (reg_$3<int d>)", "range": "{ [0, 2147483647] }" }                                                                          
  ],

  "constraints": [
    { "symbol": "reg_$2<int c>", "range": "{ [-2147483647, -1] }" },
    { "symbol": "reg_$3<int d>", "range": "{ [-2147483648, -1] }" },
    { "symbol": "(reg_$2<int c>) + (reg_$3<int d>)", "range": "{ [-2147483648, -2] }" }
  ],
```
A point to note is that this ProgramState is **before** any `clang_analyzer_eval` calls have been made. So, the third constraint, in both sets of constraints (for `c + d`) should not exist there, that is,

  a. { "symbol": "(reg_$2<int c>) + (reg_$3<int d>)", "range": "{ [0, 2147483647] }" }
  b. { "symbol": "(reg_$2<int c>) + (reg_$3<int d>)", "range": "{ [-2147483648, -2] }" }

should not be in the constraints set.

And as obvious, the `UNKNOWN` reasoning for `c + d == 0` in `testAdditionRules` is arising from constraint (a) above, which is a wrong reasoning (it ideally should be `[1, INT_MAX]` and not `[0, INT_MAX]`).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103440/new/

https://reviews.llvm.org/D103440



More information about the cfe-commits mailing list