[cfe-dev] [EXTERNAL] Re: Newby SA question

Thu Oct 11 10:12:53 PDT 2018

Artem,

  Спасибо for the detailed explanation. It helps a lot when you understand the thinking behind a feature. I totally agree with your reasoning and the fact that tradeoff space varies form one code base to another. In my case I have to balance between acceptance of SA in a specific industrial environment and theoretical soundness of the analysis. I guess I will have to  exercise some of the mentioned tradeoffs.

  Thank you once more for the good explanation.

Sergei

-----Original Message-----
From: Artem Dergachev <noqnoqneo at gmail.com> 
Sent: Wednesday, October 10, 2018 10:54 PM
To: Sergei Larin <slarin at codeaurora.org>; cfe-dev at lists.llvm.org
Cc: llvm.dev <llvm.dev at quicinc.com>
Subject: [EXTERNAL] Re: [cfe-dev] Newby SA question

Well, that's how it has always been and is expected to work, though it might be not ideal, so it's kinda a little bit of both. I'll just go on explaining how exactly Static Analyzer understands these code snippets, and it should become clear.

In foo(), Static Analyzer says: "If the loop is executed 0 times, then cc is inevitably equal to 0, from which it follows that in test() the third argument is 0". This reasoning is, well, correct. The only problem here is in the first "If": the developer might be sure that the loop always executes at least once, even though there's no indication of it whatsoever in the code under analysis.

It might, of course, turn out that the developer is wrong. But if he's right, he doesn't mind writing "assert(cc);" at the beginning of the function, which would not only suppress the false positive, but also reduce the undefined consequences of calling test() with 0 to a simple assertion failure, and, as a bonus, document the contract of the function for future generations of developers. If such assertion ever fails, Static Analyzer was right. If it never fails, the developer was right.

Another way to describe it is that Static Analyzer proposes a unit test for foo() that consists in calling it with cc equal to 0, because it triggers an interesting corner-case: this is the only case in which the loop is executed 0 times.

In bar(), on the other hand, there's no indication that the branch on which "xx == 0" is of any interest. We would not explore it separately, pretty much like we won't explore "xx == 2018" separately. We'll just think of all these possibilities as a single path on which xx (or, more specifically, the algebraic symbol "reg_$2<unsigned int xx>" that denotes the unknown value of variable xx at the beginning of the analysis, i.e. the argument with which the function was called) "can be anything". That'd be the case until we encounter a line of code like "if (xx == 2018) { ... }". This line of code indicates that the developer expects that both branches of if() can potentially occur because otherwise it would have been dead code that doesn't make any sense and telling the developer to remove it would be considered a good thing.

Such presumption of no dead code is pretty much one of the central ideas behind symbolic execution, the static analysis technique used in Static Analyzer. The Analyzer interprets the program and "forks" the interpreter every time it encounters an "if" in the program, remembering that symbol "reg_$2<unsigned int xx>" satisfies certain equations
("constraints") on each branch (eg., it is equal to 2018 on one branch and not-equal to 2018 on another branch). If the system of all equations collected on the current execution path so far has at least one solution, the path is explored further, otherwise it is discarded as impossible.

Now let's go back to foo(). Is the presence of a for() loop an indication of the possibility that the loop is executed exactly once? 
Not necessarily. If the loop is known to always be executed at least once, code still makes perfect sense, and every statement in the code is still exercised. So the assumption that the loop is sometimes executed exactly 0 times is definitely too eager, there's little difference between that and the assumption that the loop is sometimes executed exactly 2018 times.

In fact, there's a certain amount of discussion about this problem (eg.,
https://bugs.llvm.org/show_bug.cgi?id=32911) in which some advocate for completely skipping paths on which loops are executed 0 times because bugs found on them are too often not interesting. I personally mildly tend to believe that this corner-case is often missed by programmers, so it's actually pretty interesting, but this opinion is not really that strong. I guess nobody went far enough to gather enough data to support one of these opinions - and it's veeery hard to gather enough data in static analysis because a representative selection of codebases is almost impossible to obtain, they're just too different.

Is this a big problem? I think it's quite clear that it isn't, really. 
Because, well, you can always put that assert() and it'd be good in so many ways.

  Is this an easy problem to fix? Probably not. Apart from the easy hack that destroys coverage for 0-iteration paths, the actual proper solution is "loop widening", which is a form of what you described as "flushing" 
state after the analysis. There's an experimental flag "-analyzer-config widen-loops=true" which enables such behavior, and some users are using it, but i won't recommend using it yet because there are known crashes due to how it sometimes flushes information that is impossible to flush (eg., C++ references cannot be reassigned). It's probably not that hard to fix all such problems and have a working implementation of widen-loops=true. But i'm also not quite convinced that such primitive behavior is actually good. If you destroy all information gathered before entering the loop and within the loop, you may lose a lot of important bits of information that are completely obvious by looking at the code. Eg., imagine code in which a variable is always set to 0 before the loop, and the loop definitely doesn't touch it, and then after the loop we assume that it is equal to 2018. That's a false positive that would be much more annoying to the user because the
assert() that says "this variable is not touched within the loop" is not only hard to write in the general case, but also looks ridiculous in the code. So when doing loop widening, it is important to preserve as much information as possible, and that is a much more complicated problem than simply discarding information.

What's specifically hard about loop widening is how would checkers interact with it. If the internal state trait of the program tracked by the checker changes within the loop, the core of the Analyzer won't be able to figure out what should be the final state of the checker's internal trait. In fact, it won't even be able to access the checker's private data or make sense out of it. This means that in order to implement loop widening, we need to force every checker developer to implement additional checker callback, i.e. "checkLoopWidening()", to update its private state traits. And one of the biggest strengths of the Analyzer, with all its naive understanding of control flow, is a very friendly checker API. Well, we could do better, but even in its current shape a lot of people find it very pleasant to work with.

One of the areas where our path split behavior shines is "taint analysis". Which is still experimental, but very promising. For instance, if you knew that cc in foo() or xx in bar() was in fact read from the user, the positive would immediately become definitely-true, even in bar(), and we'll be able to report it. This happens because "tainted" symbols are guaranteed to potentially take any value within the current constraints, i.e. symbolic values obtained from the user can be forged by a malicious hacker to correspond to any concrete value. 
Which is why, say, division by zero checker drops the check for stateNonZero being impossible when the symbol is tainted.

But unless you are dealing with tainted symbols or being really paranoid about verifying that your program is correct, you shouldn't report the error unless you are also sure that the non-buggy state is impossible. 
That'd be super loud.

So that's where we are. As usual: trade-offs, heuristics, budgets, suppressions... Can't really get away from it, and in this case it seems that the current behavior is both very easy to understand and not at all that problematic.

For more info on our attitude towards suppressions please see https://clang-analyzer.llvm.org/faq.html#use_assert and the next few questions of the FAQ and the links in it.

On 10/10/18 12:37 PM, Sergei Larin via cfe-dev wrote:
> Hello,
>
>    I have (unexpectedly) found myself developing a specific SA checker, and during the process I run across an interesting observation.
>
> 1) The checker traces zero value passed to a specific function. That is not that important, but I based my implementation on zero arguments passed to memcpy/memset checker. Simple enough.
> 2) I realized that the path from function arguments to the checked function matters. Here is an example:
>
> void test (unsigned short *, unsigned, unsigned);
>
> void foo(const float aa[], unsigned short bb[], unsigned int cc) {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, cc);
> }
>
>
> void bar(const float aa[], unsigned short bb[], unsigned int cc, 
> unsigned int xx) {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, xx);
> }
>
> In function foo arg cc passed to test() is SVal (0 U32b) (among other values) so my checker is triggered.
> In function bar arg xx is SVal (reg_$2<unsigned int xx>) which does _not_ match zero check.
>
> The reason for that, as far as I can read from the code, is the for() loop. In foo cc is used in (dd<cc) inequality, and value for it is estimated with a range(reg_$0<unsigned int cc> : { [0, 0] }) which at the point my checker run for test() is re-interpreted as known value of zero.
>
> 3) From #2 I can make a conclusion that presence of comparison (dd<cc above) alters analysis in a substantial way.
>
> Now the question - is this expected behavior or a bug? If this is expected behavior is there a way to affect it with any options already available? If this is a bug, is there a mechanism to "flush" state after analysis of the for loop is over, and I want to run value estimator at the point of test() evaluation from scratch? Is there a better way to achieve determinism?
>
>
> ...or I simply do not know what I am talking about 😊
>
> Any feedback is highly appreciated. Thank you.
>
> Sergei
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev