[cfe-dev] [StaticAnalyzer] Can it detect __builtin_trap() ?

Tue Jun 6 03:23:19 PDT 2017

We do not currently warn on those, apart from debug.ExprInspection 
checker that warns whenever void clang_analyzer_warnIfReached() is 
called (we use this for regression tests).

It should be easy to write the checker you want. It'd probably subscribe 
on checkPreCall, see if the builtin is being called, and if there are 
arguments it may check if symbolic values for these arguments can be 
assumed to be true or false on this path, and then throw a report. You 
may look at BuiltinFunctionsChecker and NoReturnFunctionsChecker for 
examples.

--

However, we most likely wouldn't ever be able to warn on violating the 
assumptions by default, because we believe that it'd cause a lot more 
false positives than useful reports. Instead, for now we have to rely on 
these functions to avoid false positives by making the analyzer trust 
the assertions added by developers. This decision is not entirely 
pleasant - i totally see what you're thinking - but for now we have to 
keep a good practical balance between our ability to find bugs and avoid 
false positives.

The whole point of the analyzer is to find paths through the program 
that lead to various bad events, like division by zero or null 
dereference. If a certain syntactic construct, like a trap() call, 
immediately constitutes a failure, then the analyzer would essentially 
always warn on every trap() call unless he's sure it's dead code - and 
the analyzer tries to avoid seeing dead code as much as he can, instead 
believing that bugs indeed happen, because that's essentially how it 
finds most of the bugs, and this is the whole point of symbolic execution.

Consider an example:

void foo(x) {
   if (x == 42)
     trap();
}

If we analyze this function as a top-level function, as opposed to being 
called from another function (as we cannot afford whole-program analysis 
- it's ridiculously hyperexponential), then we'd assume that x is equal 
to 42, because the if-statement is there for a reason, and then we'd 
warn that a trap may be triggered if x is equal to 42. However, that's 
not what we intend to see. Instead, we should use trap() to notify us 
that the branch shouldn't be analyzed.

Consider another example:

   if (x == 1) {
     // ...
   }
   if (y == 2) {
     // ...
   }
   if (y == 2) {
     assume(x == 1);
   }

On the first if(), we can safely assume that both (x == 1) and (x != 1) 
is possible. Otherwise, the if() would be a dead code, and we presume 
that there's no dead code (which is, again, the whole point of symbolic 
execution).

On the second if(), we assume that all four cases are possible: (x == 1, 
y == 2), (x == 1, y != 2), (x != 1, y == 2), (x != 1, y != 2). This 
assumption is not entirely safe anymore: it might be that only the 
second and the third case take place, and there's still no dead code. 
However, we take this risk in order to find more bugs and reduce the 
analyzer's complexity, because making a completely safe assumption that 
is still useful would be very hard.

On the third if(), we are entering from two of the four branches that 
already exist: (x == 1, y == 2), (x != 1, y == 2). When we encounter 
assume(), we throw away (x != 1, y == 2). This compensates for the 
unsafety on the second if(). This way developers can suppress our unsafe 
branches.

In case of loops it gets even worse - we've no idea how many times a 
loop may execute.

So generally the tradeoff is:
- For various reasons, we sometimes allow ourselves to explore more 
paths than we should. This includes eager state splits, conservative 
evaluation of functions, lack of knowledge about the initial state of 
the program, shortcomings of our fast constraint solver that sometimes 
fails to see that certain constraints are incompatibly, etc.
- We then allow the developers to suppress unwanted paths with various 
no-return functions.

--

Regardless of the problems discussed above, i believe that your use case 
has a point. In some cases, we should be able to clearly see that the 
code does indeed violate the assumption. An obvious case would be 
assume(false). Or we should be able to warn on foo(42), where foo(x) is 
the function from the first example. Such warning would be a lot more 
complex than what we've originally expected though. For example, we 
won't be able to easily discriminate between "bool a = false; 
assume(a);" and "assume(x == 1);" in the second example on branch (x != 
1, y == 2), because in both cases we have plain "false" as the symbolic 
value we're assuming. Similarly, if someone passes 42 to foo, it'd be 
very similar to analyzing this function as top-level and assuming x == 
42 at the if-statement, however our checker would have to behave 
differently.

So, one way or another, we don't currently have a good mechanism to 
discriminate between these cases and write a really useful checker the 
way you want it.

05/06/2017 7:57 PM, Andrzej Krzemienski via cfe-dev wrote:
> Hi Everyone,
> Does Clan Static Analyzer recognize `__builtin_trap()` or 
> `__builtin_assume()` or `__builtin_unreachable()` or some similar thin 
> that indicates undesired failure? So that I (library author) could 
> assist the analyzer in reporting bad usages of a library:
>
> ```
> T& optional<T>::value()
> {
>   if (!this->is_initialized)
>     __builtin_trap(); // if analyzer reaches this path, it should 
> report a warnin
>   return reinterpret_cast<T&>(_raw_storage);
> }
> ```
>
> Regards,
> &rzej;
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev