[cfe-dev] RFC clang analyzer false positives?

Fri Aug 26 08:17:50 PDT 2016

Some inline answers:

On 8/25/16 10:23 AM, Daniel Marjamäki via cfe-dev wrote:

> Hello!
>
> We try to avoid false positives in the static analyzer. In my opinion we should try pretty hard. When there is reason for doubt, I would say it's better with false negatives.
>
> I wonder what you think about these.
>
>   1. Dead store
>
>   ...
>       default: // should not happen
>            x = 0;
>            ASSERT(FALSE);  // <- dead store, x is assigned a value that is not read
>        }
>        return x;
>   }
>
> In certain configurations the execution is terminated and then the dead store is correct. But in other configurations the "x" value will be returned.
>
> Do you think that it would be acceptable to skip the warning if the noreturn function is in a macro call? And ideally we would warn if the value is not used after the macro but that might be quite hard to achieve (I did not look but it's my assumption).
I think that's a fundamental issue with all deadcode checkers, and i 
don't think i see a good solution. Unfortunately, the analyzer deals 
with preprocessed code most of the time, and figuring out what was there 
before the preprocessing is hard - after all, when the code is 
preprocessed out, it isn't even compiled; often, it cannot be. So yeah, 
whenever macros are used for turning code on and off, our deadcode 
checkers (or other all-path-based checkers, which aren't many i think) 
are having problems.

For all other checkers, we certainly should not drop asserts. Because 
they sew away the paths that are very unlikely and will be definitely 
treated as false positives.

>   2. Garbage value.
>
>       #define INVALID ~0
>       struct XY { int x; int y; };
>       
>       inline struct XY getXY(int index) {
>         struct XY xy;
>         if (index < 12) {
>           xy.x = 1;
>           xy.y = 2;
>         } else {
>           xy.x = INVALID;
>         }
>         return xy;
>       }
>       
>       void f(int index) {
>         struct XY xy;
>         xy = getXY(index);
>         if (xy.y == 0) {}   // <- garbage value xy.y
>       }
>
> In the real code the "getXY" function is implemented in a header and can be called from many locations.
>
> Clang assumes that the condition "index < 12" in getXY() can be false. Should such assumption really be made since it's a subfunction declared in a header that might be called from many places? If the condition was moved to f() I would agree that we can assume that it can be false - otherwise it would be useless to have it.
>
> I believe we could solve this FP in our code by adding an assertion in f() that makes sure index is less than 12. However such assertion would be misplaced. We want to validate input values as soon as possible. The "index" is validated long before the function f() is called.
I agree that this is a problem. Symbolic execution is very much based on 
psychological assumption that if there's a branch in the program, then 
the author put it here for a reason. Which means, both paths must be 
possible.

For inlined calls, that's certainly not true. The author might have 
never seen this branch.

In my head, the completely untested idea for such cases would be to 
compute a "realism score" of every path, and give the user a lever for 
setting realism threshold. For example, branches in the top-level 
function would score towards "this branch is realistic", branches inside 
callees would score towards "we're not sure this path is possible", 
branches with UnknownVal as condition would have a significantly 
negative realism score (both true and false branch), branches with 
tainted condition are hugely realistic (the attacker may forge any value 
here, so we badly need to scan both true and false branch), loop 
condition branches would probably have lower realism, and so on.

I suspect there'd be a lot of problems with this approach, but that's 
something i wish to try eventually.

> Best regards,
> Daniel Marjamäki
>
> ..................................................................................................................
> Daniel Marjamäki Senior Engineer
> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>
> Mobile:                 +46 (0)709 12 42 62
> E-mail:                 Daniel.Marjamaki at evidente.se
>
> www.evidente.se
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>