[cfe-dev] RFC clang analyzer false positives (for loop)

Fri Aug 26 09:11:19 PDT 2016

On 8/26/16 7:05 PM, Mehdi Amini wrote:
>> On Aug 26, 2016, at 8:28 AM, Artem Dergachev via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> On 8/26/16 3:19 PM, Joerg Sonnenberger via cfe-dev wrote:
>>> If they don't want to see any false positives, they shouldn't even ask
>>> the compiler for warnings. It is a completely absurd constraint to put
>>> on any analysis system. The trick for tools like Coverity and where the
>>> majority of the research budget goes is to develop heuristics on what
>>> false positives should be silently dropped.
>> While false positives are obviously inevitable (there are various well-known reasons for the clang static analyzer's technique to have false positives, even apart from the halting problem), there are reasons why false positives are destructive:
>>
>> (1) If a new user takes the tool, picks 3-4 positives and finds that they're all false, she may never give the tool another chance.
>> (2) If you have 1% false positives on your codebase, it means that there's a pattern that the tool fails upon; but there might be another codebase on which that pattern is popular and you'd get thousands of warnings with 100% false positive rate.
>>
>> So yeah, we inevitably have to treat every false positive as carefully as possible, much more carefully than false negatives. That said, ugly heuristics are rarely the best choice, yeah.
>
> Couldn’t there be a “mode” where the analyzer would only report when it finds a path where the condition happens?
> I.e. instead of considering “I don’t know what is the possible value for nr so I complain about a possible error”, having the alternate behavior of “I know that there is a path where nr can be <= 0 and the loop not executed so I complain”.
>
> That may suppress a large number of “true positives” indeed, but if you start on a “dirty” codebase that may be useful to focus on “real” issues.
>
> —
> Mehdi

That must be possible, and I think I'm also proposing a generalization
of this idea in
http://lists.llvm.org/pipermail/cfe-dev/2016-August/050550.html: paths
on which the condition certainly happens (i.e. because all values are
concrete) are fully "realistic", while branch conditions based on a
SymbolRegionValue of a static function's parameters should be treated as
less realistic than branches based on externally visible function
parameters, and so on.
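
To make the distinction concrete, here is a small hypothetical C sketch. The
function names are invented for illustration, and this is not the code from
the original report. It only shows the shape of the two situations: one where
the loop bound is an unknown symbol, and one where a path with a concrete
value certainly skips the loop.

```c
#include <stddef.h>

/* Hypothetical helper: returns the last element of buf[0..nr-1]. */
static int last_value(const int *buf, int nr)
{
    const int *last = NULL;

    for (int i = 0; i < nr; i++)
        last = &buf[i];

    /* If nr <= 0 the loop body never runs and this dereferences NULL. */
    return *last;
}

/* Case A: nr is an unconstrained symbol here. A report on this path rests
 * only on the assumption that nr <= 0 is possible; no caller is known to
 * pass such a value. */
int from_unknown_caller(const int *buf, int nr)
{
    return last_value(buf, nr);
}

/* Case B: nr is the concrete value 0, so the loop is certainly skipped
 * on this path and the NULL dereference certainly happens. */
int from_concrete_caller(const int *buf)
{
    return last_value(buf, 0);
}
```

Under the proposed mode, I would expect only something like Case B to be
reported, since every value on that path is concrete. Case A would be dropped
or ranked lower, even though it may be a true positive.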