[cfe-dev] RFC clang analyzer false positives (for loop)

Fri Aug 26 08:37:58 PDT 2016

On 8/26/16 5:51 PM, Joerg Sonnenberger via cfe-dev wrote:
> On Fri, Aug 26, 2016 at 02:36:31PM +0000, Daniel Marjamäki wrote:
>> However if most warnings for a particular use case are in practice false positives I say these should be dropped. So I think the question should be if most warnings are false positives? Do you think your true positive ratio in your own code is good? Is it higher than 50%?
>>
>>> But the example you have given is *not* wrong. It is completely correct.
>> Imho it is a false positive if I can easily see with human intelligence that the warning is wrong.
> But you can't without adding additional assumptions that are not part of
> the code. In this case "nr must be positive". That's why it is not a
> false positive -- you have to add external knowledge. If the constraint
> can't be handled by the analyzer, that's again a different situation.
> That's e.g. a problem when the initialisation happens in a different
> function and the analyzer can't or doesn't want to see it. Another case
> is if the constraint depends on a non-trivial computation and the
> analyzer isn't smart enough to figure it out. All those are examples of
> false positives: the information is available, but the system doesn't
> use them. It is not a false positive if information is missing in first
> place.
>
Suppose we analyzed the program from the very main(), scanned all call 
sites in all translation units, and proven that the path is never 
reached in this program. Then the positive would have been gone. So 
perhaps this positive is not unavoidable, and i don't think this 
information is missing - just the analyzer was not taught to use it. If 
it was a library interface function, then yeah, the user is known to be 
able to throw arbitrary stuff into it. But perhaps the library 
documentation will suggest that "if nr is non-positive, the behavior is 
undefined"... no longer sure if it's true or false this time.

But i'd also consider classifying positives as "useful" and "useless", 
rather than "true" or "false". For example, when the code needs to be 
fixed, this positive is good. When it requires documenting (documenting 
contracts through asserts included), it's also probably useful, but 
perhaps less useful - there was no bug in the program, so the user 
wouldn't notice the change. This particular positive, depending on the 
circumstances, may or may not be very useful. Assertion is worth adding, 
yeah. But the software's end user may not benefit from it, so the 
manager may avoid supporting the usage of such tool in the project, if 
most positives need to be suppressed rather than fixed.