[cfe-dev] [GSoC 2018] Using the Z3 SMT Solver to Validate Bugs Reported by the Clang Static Analyzer

Mikhail Ramalho via cfe-dev cfe-dev at lists.llvm.org
Tue Jun 26 10:23:47 PDT 2018


Hi all,

Hi guys,

I'm investigating the constraints being dropped, here's what I got so far.

First of all, I'm using the following program:

void foo(unsigned width)
{
  int base;
  int i = 0;

  if (!(i % width))
    base = 1;

  if(base == 1)
    abort();
}

I started by looking at ExprEngine::processBranch, where

SVal X = PrevState->getSVal(Condition, PredI->getLocationContext());

returns Unknown for the remainder condition. The getSVal ends up looking in
a map for the value result, so I found that bindExpr fills that map.

Going back a bit, into ExprEngine::Visit, when evaluating a
Stmt::BinaryOperatorClass (regardless of the eagerly assume flag)
ExprEngine::VisitBinaryOperator is called, which eventually calls evalBinOp
and, since it doesn't understand remainder, it returns unknown and BindExpr
is never called.

Back to ExprEngine::processBranch, when the symbol is unknown the following
piece of code is called:

    // If the condition is still unknown, give up.
    if (X.isUnknownOrUndef()) {
      builder.generateNode(PrevState, true, PredI);
      builder.generateNode(PrevState, false, PredI);
      continue;
    }

and the condition is simply ignored. When the result is defined, it creates
two nodes assuming the constraints.

~

My idea is when the SVal is undef or unknown, instead of generating two
nodes with no  knowledge about the constraints, we could create two ranged
constraints, like:

i % width: [0,0]

and

i % width: [1,1]

for each path. That way we can keep the constraints with reasonable values.

What do you think? It feels like this will break stuff further down the
line, but I'll know for sure if I implement the change.

~

Artem's response:

Yep, i strongly believe that any UnknownVal should be treated as a
synonym of "not implemented".

In *this* example you might also notice that there's no symbol for 'i',
but it's a concrete integer 0 instead. So you can evaluate the whole
remainder to 0, unless 'width' is also equal to 0 (in which case the
answer would be UndefinedVal). Also note that when enabled, DivZero
checker will refute the 'width == 0' branch.

In the general case you have no choice but to produce an IntSymExpr (if
'i' is a concrete integer other than 0) or a SymSymExpr (if 'i' is a
symbol).


Em qui, 24 de mai de 2018 às 18:15, Mikhail Ramalho <
mikhail.ramalho at gmail.com> escreveu:

>
> Total Time time.analyzer.time.sys (s) (mean) time.analyzer.time.user (s)
> (mean) time.analyzer.time.wall (s) (mean)
>
> Reported bugs
> Tmux 99.2 0.076 27.253 7.656
>
> 32
> Tmux + z3 152.88 0.074 56.251 11.505
>
> 32
> Ratio 154.11% 97.37% 206.40% 150.27%
> Diff 0
>
>
>
>
>
>
>
>
> Redis 173.69 0.057 7.083 7.271
>
> 146
> Redis + z3 193.43 0.057 7.621 7.728
>
> 140
> Ratio 111.37% 100.00% 107.60% 106.29%
> Diff 6
>
>
>
>
>
>
>
>
> OpenSSL 264.93 0.042 3.31 3.412
>
> 204
> OpenSSL + z3 213.53 0.035 3.099 3.152
>
> 204
> Ratio 80.60% 83.33% 93.63% 92.38%
> Diff 0
>
>
>
>
>
>
>
>
> Twin 143.17 0.067 6.593 6.696
>
> 138
> Twin + z3 133.83 0.06 6.79 6.882
>
> 138
> Ratio 93.48% 89.55% 102.99% 102.78%
> Diff 0
>
>
>
>
>
>
>
>
> Git + z3 333.9 0.075 8.52 8.67
>
> 96
> Git + z3 289.59 0.062 7.924 8.023
>
> 90
> Ratio 86.73% 82.67% 93.00% 92.54%
> Diff 6
>
>
>
>
>
>
>
>
> Postgresql 889.35 0.079 8.482 8.631
>
> 676
> Postgresql + z3 902.86 0.077 9.694 9.863
>
> 676
> Ratio 101.52% 97.47% 114.29% 114.27%
> Diff 0
>
>
>
>
>
>
>
>
> Sqlite3 1206.3 0.262 368.446 370.786
>
> 200
> Sqlite3 + z3 1260.85 0.43 407.763 409.688
>
> 199
> Ratio 104.52% 164.12% 110.67% 110.49%
> Diff 1
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Average 104.62% 102.07% 118.37% 109.86%
>
>
>
>
> 2018-05-24 15:11 GMT+01:00 Mikhail Ramalho <mikhail.ramalho at gmail.com>:
>
>> Hi all,
>>
>> This is my first report to the community, comparing the results with and
>> without the Z3 refutation when analyzing a number of projects.
>>
>> ~
>>
>> First of all, I'd like to thank Réka Kovács as the first version of the
>> refutation using Z3 was created by her (https://reviews.llvm.org/D45517)!
>> Thank you very much!
>>
>> After applying patch D45517, you can use the refutation check by
>> using -analyzer-config crosscheck-with-z3=true. Obviously, you need a
>> version of clang built with Z3.
>>
>> ~
>>
>> I'm currently analyzing 7 C projects (unfortunately, there's a bug
>> preventing us from analyzing FFmpeg):
>>
>> 1. Tmux
>> 2. Redis
>> 3. OpenSSL
>> 4. Twin
>> 5. Git
>> 6. Postgresql
>> 7. Sqlite3
>>
>> In short, the refutation check slows down the verification by ~20%. It
>> removed 6 FPs from Redis, 6 FPs from git and 1 FP from Sqlite3 (FP means
>> false positive). Surprisingly enough, some analysis were faster with the
>> crosscheck, but I'm not sure why (maybe ccache?).
>>
>> Attached is a spreadsheet (report1.ods) with some number (total time,
>> average time per check, # of reported bugs) and a txt with all the raw data
>> from the analysis (raw.txt). I'll add these data to google drive for the
>> next report.
>>
>> In order to generate the raw data, you need to use a version of clang
>> with assertions enabled, call scan-build.py with '-analyzer-config
>> serialize-stats=true' and you need to apply patch
>> https://reviews.llvm.org/D43134.
>>
>> Thank you very much,
>>
>>
>> 2018-05-01 15:27 GMT+01:00 Mikhail Ramalho <mikhail.ramalho at gmail.com>:
>>
>>> Hello all,
>>>
>>> My proposal for GSoC 2018 [0] about Bug Validation in the Clang Static
>>> Analyzer using the Z3 SMT Solver was accepted.
>>>
>>> I'll work with George Karpenkov to improve the bug reports that the
>>> static analyzer produces by reducing the number of false bugs.
>>>
>>> Thank you,
>>>
>>> [0]
>>> https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g
>>>
>>> --
>>>
>>> Mikhail Ramalho.
>>>
>>
>>
>>
>> --
>>
>> Mikhail Ramalho.
>>
>
>
>
> --
>
> Mikhail Ramalho.
>


-- 

Mikhail Ramalho.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180626/111f7668/attachment.html>


More information about the cfe-dev mailing list