[PATCH] D79432: [analyzer] StdLibraryFunctionsChecker: Add summaries for libc

Thu Sep 17 06:37:14 PDT 2020

martong added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:1123-1124
+      "abs", Summary(ArgTypes{IntTy}, RetType{IntTy}, EvalCallAsPure)
+                 .Case({ArgumentCondition(0, WithinRange, SingleValue(0)),
+                        ReturnValueCondition(WithinRange, SingleValue(0))})
+                 .Case({ArgumentCondition(0, WithinRange, Range(1, IntMax)),
----------------
balazske wrote:
> martong wrote:
> > NoQ wrote:
> > > NoQ wrote:
> > > > The three-way state split is unjustified here. Usage of `abs` is not a sufficient indication that the value may be 0, otherwise:
> > > > ```lang=c++
> > > > int foo(int x, int y) {
> > > >   int z = abs(y);   // Assuming 'abs' has taken branch on which y == 0...
> > > >   return x / z;     // ... we'll be forced to emit a division by zero warning here.
> > > > }
> > > > ```
> > > > 
> > > > Generally, there are very few cases when state splits upon function calls are justified. The common cases are:
> > > > - The function returns bool and finding that bool is the only reason to ever call this function. Eg., `isalpha()` and such.
> > > > - The function can at any time completely unpredictably take any of the branches, in other words, taint is involved. Eg., `scanf()` can always fail simply because the user of the program wrote something special into stdin.
> > > > returns bool
> > > 
> > > Or something that kinda resembles bool (eg., `isalpha()` returns a variety of different ints in practice due to its static lookup table implementation strategy but the user only cares about whether it's zero or non-zero).
> > Alright, I agree, I'll remove these cases. 
> > 
> > Generally speaking I realized that it is hard to create these cases, and it is very rare when we can come up with a meaningful case set for any function. So while we are at it, could you please take a look at another patch, where I add no cases just argument constraints: D82288 ?
> But it could make sense that if the input of the function is known before the call, we can compute the output? For `abs(x)` if we know that `x == 0` is true the checker could assume `abs(x) == 0` too. These "cases" provide data for this functionality.
> For abs(x) if we know that x == 0 is true the checker could assume abs(x) == 0 too. 
Yes, this is absolutely true.

I think, we could approach this with introducing a postcall Kind for the summaries. Currently, we already have `EvalCallAsPure` and `NoEvalCall` for evalCall. We could have let's say 
```
enum PostCallKind { UnconditionedStateSplit, ConstrainReturnIfArgConstraintsAreSatisfied }
```
For the ConstrainReturnIfArgConstraintsAreSatisfied we should check if the preconditions for the Args in the Case are all evaluated to Known and if so only would we apply the constraint on the return value.
@NoQ What do you think?
@steakhal FYI.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79432/new/

https://reviews.llvm.org/D79432