<div dir="ltr">Hi,<div><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Another solution, since you're using Z3 already, is to implement runtime<br>

support for querying the underlying ConstraintManager's about the types<br>

of constraints that it supports (e.g. canReasonAboutFoo() ). Then, you<br>

can use this to generate the appropriate SVal's at runtime, which could<br>

include support for symbolic extension/truncation, remainder, shifts, etc.<br>

<br></blockquote><div><br></div><div>I see, but for the project I proposed to check why the constraints were being dropped, regardless of having the SMT solver enabled or not.</div><div><br></div><div>~</div><div><br></div><div>Anyway, I removed 4 lines of code and it seems to be working:</div><div><br></div><div><div>diff --git a/lib/StaticAnalyzer/Core/SValBuilder.cpp b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>index 137e9a7..39051a8 100644</div><div>--- a/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>+++ b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>@@ -377,9 +377,6 @@ SVal SValBuilder::makeSymExprValNN(ProgramStateRef State,</div><div>                                    BinaryOperator::Opcode Op,</div><div>                                    NonLoc LHS, NonLoc RHS,</div><div>                                    QualType ResultTy) {</div><div>-  if (!State->isTainted(RHS) && !State->isTainted(LHS))</div><div>-    return UnknownVal();</div><div>-</div><div>   const SymExpr *symLHS = LHS.getAsSymExpr();</div><div>   const SymExpr *symRHS = RHS.getAsSymExpr();</div><div>   // TODO: When the Max Complexity is reached, we should conjure a symbol</div><div>@@ -406,9 +403,6 @@ SVal SValBuilder::evalBinOp(ProgramStateRef state, BinaryOperator::Opcode op,</div><div>   if (lhs.isUndef() || rhs.isUndef())</div><div>     return UndefinedVal();</div><div> </div><div>-  if (lhs.isUnknown() || rhs.isUnknown())</div><div>-    return UnknownVal();</div><div>-</div><div>   if (lhs.getAs<nonloc::LazyCompoundVal>() ||</div><div>       rhs.getAs<nonloc::LazyCompoundVal>()) {</div><div>     return UnknownVal();</div></div><div><br></div><div>The static analyzer now generates the following constraints for each path:</div><div><br></div><div><div>0U % (reg_$0<unsigned int width>): { [1, 4294967295] }</div></div><div><br></div><div>and </div><div><br></div><div><div>0U % (reg_$0<unsigned int width>): { [0, 0] }</div></div><div><br></div><div>And Z3 is able to remove the false positive.</div><div><br></div><div>Now I need to understand the impact of the changes, specially the taint check removal.</div><div><br></div><div>Thank you,</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

Dominic<br>

<br>

On 6/26/2018 1:23 PM, Mikhail Ramalho via cfe-dev wrote:<br>

> Hi all,<br>

><br>

> Hi guys,<br>

><br>

> I'm investigating the constraints being dropped, here's what I got so far.<br>

><br>

> First of all, I'm using the following program:<br>

><br>

> void foo(unsigned width)<br>

> {<br>

>   int base;<br>

>   int i = 0;<br>

><br>

>   if (!(i % width))<br>

>     base = 1;<br>

><br>

>   if(base == 1)<br>

>     abort();<br>

> }<br>

><br>

> I started by looking at ExprEngine::processBranch, where <br>

><br>

> SVal X = PrevState->getSVal(Condition, PredI->getLocationContext());<br>

><br>

> returns Unknown for the remainder condition. The getSVal ends up<br>

> looking in a map for the value result, so I found that bindExpr fills<br>

> that map.<br>

><br>

> Going back a bit, into ExprEngine::Visit, when evaluating a<br>

> Stmt::BinaryOperatorClass (regardless of the eagerly assume flag)<br>

> ExprEngine::VisitBinaryOperator is called, which eventually calls<br>

> evalBinOp and, since it doesn't understand remainder, it returns<br>

> unknown and BindExpr is never called.<br>

><br>

> Back to ExprEngine::processBranch, when the symbol is unknown the<br>

> following piece of code is called:<br>

><br>

>     // If the condition is still unknown, give up.<br>

>     if (X.isUnknownOrUndef()) {<br>

>       builder.generateNode(PrevState, true, PredI);<br>

>       builder.generateNode(PrevState, false, PredI);<br>

>       continue;<br>

>     }<br>

><br>

> and the condition is simply ignored. When the result is defined, it<br>

> creates two nodes assuming the constraints.<br>

><br>

> ~<br>

><br>

> My idea is when the SVal is undef or unknown, instead of generating<br>

> two nodes with no  knowledge about the constraints, we could create<br>

> two ranged constraints, like:<br>

><br>

> i % width: [0,0]<br>

><br>

> and <br>

><br>

> i % width: [1,1]<br>

><br>

> for each path. That way we can keep the constraints with reasonable<br>

> values.<br>

><br>

> What do you think? It feels like this will break stuff further down<br>

> the line, but I'll know for sure if I implement the change.<br>

><br>

> ~<br>

><br>

> Artem's response:<br>

><br>

> Yep, i strongly believe that any UnknownVal should be treated as a <br>

> synonym of "not implemented".<br>

><br>

> In *this* example you might also notice that there's no symbol for 'i', <br>

> but it's a concrete integer 0 instead. So you can evaluate the whole <br>

> remainder to 0, unless 'width' is also equal to 0 (in which case the <br>

> answer would be UndefinedVal). Also note that when enabled, DivZero <br>

> checker will refute the 'width == 0' branch.<br>

><br>

> In the general case you have no choice but to produce an IntSymExpr (if <br>

> 'i' is a concrete integer other than 0) or a SymSymExpr (if 'i' is a <br>

> symbol).<br>

><br>

><br>

> Em qui, 24 de mai de 2018 às 18:15, Mikhail Ramalho<br>

> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>> escreveu:<br>

><br>

><br>

>       Total Time      time.analyzer.time.sys (s) (mean)<br>

>     time.analyzer.time.user (s) (mean)        time.analyzer.time.wall (s)<br>

>     (mean)    <br>

>       <br>

>       Reported bugs<br>

>     Tmux      99.2    0.076   27.253  7.656   <br>

>       <br>

>       32<br>

>     Tmux + z3         152.88  0.074   56.251  11.505  <br>

>       <br>

>       32<br>

>     Ratio     154.11%         97.37%  206.40%         150.27%         <br>

>       Diff    0<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Redis     173.69  0.057   7.083   7.271   <br>

>       <br>

>       146<br>

>     Redis + z3        193.43  0.057   7.621   7.728   <br>

>       <br>

>       140<br>

>     Ratio     111.37%         100.00%         107.60%         106.29%         <br>

>       Diff    6<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     OpenSSL   264.93  0.042   3.31    3.412   <br>

>       <br>

>       204<br>

>     OpenSSL + z3      213.53  0.035   3.099   3.152   <br>

>       <br>

>       204<br>

>     Ratio     80.60%  83.33%  93.63%  92.38%  <br>

>       Diff    0<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Twin      143.17  0.067   6.593   6.696   <br>

>       <br>

>       138<br>

>     Twin + z3         133.83  0.06    6.79    6.882   <br>

>       <br>

>       138<br>

>     Ratio     93.48%  89.55%  102.99%         102.78%         <br>

>       Diff    0<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Git + z3  333.9   0.075   8.52    8.67    <br>

>       <br>

>       96<br>

>     Git + z3  289.59  0.062   7.924   8.023   <br>

>       <br>

>       90<br>

>     Ratio     86.73%  82.67%  93.00%  92.54%  <br>

>       Diff    6<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Postgresql        889.35  0.079   8.482   8.631   <br>

>       <br>

>       676<br>

>     Postgresql + z3   902.86  0.077   9.694   9.863   <br>

>       <br>

>       676<br>

>     Ratio     101.52%         97.47%  114.29%         114.27%         <br>

>       Diff    0<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Sqlite3   1206.3  0.262   368.446         370.786         <br>

>       <br>

>       200<br>

>     Sqlite3 + z3      1260.85         0.43    407.763         409.688         <br>

>       <br>

>       199<br>

>     Ratio     104.52%         164.12%         110.67%         110.49%         <br>

>       Diff    1<br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

><br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>       <br>

>     Average   104.62%         102.07%         118.37%         109.86%         <br>

>       <br>

>       <br>

><br>

><br>

><br>

>     2018-05-24 15:11 GMT+01:00 Mikhail Ramalho<br>

>     <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>>:<br>

><br>

>         Hi all,<br>

><br>

>         This is my first report to the community, comparing the<br>

>         results with and without the Z3 refutation when analyzing a<br>

>         number of projects.<br>

><br>

>         ~<br>

><br>

>         First of all, I'd like to thank Réka Kovács as the first<br>

>         version of the refutation using Z3 was created by her<br>

>         (<a href="https://reviews.llvm.org/D45517" rel="noreferrer" target="_blank">https://reviews.llvm.org/D45517</a>)! Thank you very much!<br>

><br>

>         After applying patch D45517, you can use the refutation check<br>

>         by using -analyzer-config crosscheck-with-z3=true. Obviously,<br>

>         you need a version of clang built with Z3.<br>

><br>

>         ~<br>

><br>

>         I'm currently analyzing 7 C projects (unfortunately, there's a<br>

>         bug preventing us from analyzing FFmpeg):<br>

><br>

>         1. Tmux<br>

>         2. Redis <br>

>         3. OpenSSL<br>

>         4. Twin<br>

>         5. Git<br>

>         6. Postgresql<br>

>         7. Sqlite3<br>

><br>

>         In short, the refutation check slows down the verification by<br>

>         ~20%. It removed 6 FPs from Redis, 6 FPs from git and 1 FP<br>

>         from Sqlite3 (FP means false positive). Surprisingly enough,<br>

>         some analysis were faster with the crosscheck, but I'm not<br>

>         sure why (maybe ccache?).<br>

><br>

>         Attached is a spreadsheet (report1.ods) with some number<br>

>         (total time, average time per check, # of reported bugs) and a<br>

>         txt with all the raw data from the analysis (raw.txt). I'll<br>

>         add these data to google drive for the next report.<br>

><br>

>         In order to generate the raw data, you need to use a version<br>

>         of clang with assertions enabled, call scan-build.py with<br>

>         '-analyzer-config serialize-stats=true' and you need to apply<br>

>         patch <a href="https://reviews.llvm.org/D43134" rel="noreferrer" target="_blank">https://reviews.llvm.org/D43134</a>.<br>

><br>

>         Thank you very much,<br>

><br>

><br>

>         2018-05-01 15:27 GMT+01:00 Mikhail Ramalho<br>

>         <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>>:<br>

><br>

>             Hello all,<br>

><br>

>             My proposal for GSoC 2018 [0] about Bug Validation in the<br>

>             Clang Static Analyzer using the Z3 SMT Solver was accepted.<br>

><br>

>             I'll work with George Karpenkov to improve the bug reports<br>

>             that the static analyzer produces by reducing the number<br>

>             of false bugs.<br>

><br>

>             Thank you,<br>

><br>

>             [0] <a href="https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g" rel="noreferrer" target="_blank">https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g</a><br>

><br>

>             -- <br>

><br>

>             Mikhail Ramalho.<br>

><br>

><br>

><br>

><br>

>         -- <br>

><br>

>         Mikhail Ramalho.<br>

><br>

><br>

><br>

><br>

>     -- <br>

><br>

>     Mikhail Ramalho.<br>

><br>

><br>

><br>

> -- <br>

><br>

> Mikhail Ramalho.<br>

><br>

><br>

> _______________________________________________<br>

> cfe-dev mailing list<br>

> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

<br>

<br>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div><div>Mikhail Ramalho.</div></div></div></div>