<div dir="ltr">Hi,<div><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Another solution, since you're using Z3 already, is to implement runtime<br>
support for querying the underlying ConstraintManager's about the types<br>
of constraints that it supports (e.g. canReasonAboutFoo() ). Then, you<br>
can use this to generate the appropriate SVal's at runtime, which could<br>
include support for symbolic extension/truncation, remainder, shifts, etc.<br>
<br></blockquote><div><br></div><div>I see, but for the project I proposed to check why the constraints were being dropped, regardless of having the SMT solver enabled or not.</div><div><br></div><div>~</div><div><br></div><div>Anyway, I removed 4 lines of code and it seems to be working:</div><div><br></div><div><div>diff --git a/lib/StaticAnalyzer/Core/SValBuilder.cpp b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>index 137e9a7..39051a8 100644</div><div>--- a/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>+++ b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div>@@ -377,9 +377,6 @@ SVal SValBuilder::makeSymExprValNN(ProgramStateRef State,</div><div> BinaryOperator::Opcode Op,</div><div> NonLoc LHS, NonLoc RHS,</div><div> QualType ResultTy) {</div><div>- if (!State->isTainted(RHS) && !State->isTainted(LHS))</div><div>- return UnknownVal();</div><div>-</div><div> const SymExpr *symLHS = LHS.getAsSymExpr();</div><div> const SymExpr *symRHS = RHS.getAsSymExpr();</div><div> // TODO: When the Max Complexity is reached, we should conjure a symbol</div><div>@@ -406,9 +403,6 @@ SVal SValBuilder::evalBinOp(ProgramStateRef state, BinaryOperator::Opcode op,</div><div> if (lhs.isUndef() || rhs.isUndef())</div><div> return UndefinedVal();</div><div> </div><div>- if (lhs.isUnknown() || rhs.isUnknown())</div><div>- return UnknownVal();</div><div>-</div><div> if (lhs.getAs<nonloc::LazyCompoundVal>() ||</div><div> rhs.getAs<nonloc::LazyCompoundVal>()) {</div><div> return UnknownVal();</div></div><div><br></div><div>The static analyzer now generates the following constraints for each path:</div><div><br></div><div><div>0U % (reg_$0<unsigned int width>): { [1, 4294967295] }</div></div><div><br></div><div>and </div><div><br></div><div><div>0U % (reg_$0<unsigned int width>): { [0, 0] }</div></div><div><br></div><div>And Z3 is able to remove the false positive.</div><div><br></div><div>Now I need to understand the impact of the changes, specially the taint check removal.</div><div><br></div><div>Thank you,</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
Dominic<br>
<br>
On 6/26/2018 1:23 PM, Mikhail Ramalho via cfe-dev wrote:<br>
> Hi all,<br>
><br>
> Hi guys,<br>
><br>
> I'm investigating the constraints being dropped, here's what I got so far.<br>
><br>
> First of all, I'm using the following program:<br>
><br>
> void foo(unsigned width)<br>
> {<br>
> int base;<br>
> int i = 0;<br>
><br>
> if (!(i % width))<br>
> base = 1;<br>
><br>
> if(base == 1)<br>
> abort();<br>
> }<br>
><br>
> I started by looking at ExprEngine::processBranch, where <br>
><br>
> SVal X = PrevState->getSVal(Condition, PredI->getLocationContext());<br>
><br>
> returns Unknown for the remainder condition. The getSVal ends up<br>
> looking in a map for the value result, so I found that bindExpr fills<br>
> that map.<br>
><br>
> Going back a bit, into ExprEngine::Visit, when evaluating a<br>
> Stmt::BinaryOperatorClass (regardless of the eagerly assume flag)<br>
> ExprEngine::VisitBinaryOperator is called, which eventually calls<br>
> evalBinOp and, since it doesn't understand remainder, it returns<br>
> unknown and BindExpr is never called.<br>
><br>
> Back to ExprEngine::processBranch, when the symbol is unknown the<br>
> following piece of code is called:<br>
><br>
> // If the condition is still unknown, give up.<br>
> if (X.isUnknownOrUndef()) {<br>
> builder.generateNode(PrevState, true, PredI);<br>
> builder.generateNode(PrevState, false, PredI);<br>
> continue;<br>
> }<br>
><br>
> and the condition is simply ignored. When the result is defined, it<br>
> creates two nodes assuming the constraints.<br>
><br>
> ~<br>
><br>
> My idea is when the SVal is undef or unknown, instead of generating<br>
> two nodes with no knowledge about the constraints, we could create<br>
> two ranged constraints, like:<br>
><br>
> i % width: [0,0]<br>
><br>
> and <br>
><br>
> i % width: [1,1]<br>
><br>
> for each path. That way we can keep the constraints with reasonable<br>
> values.<br>
><br>
> What do you think? It feels like this will break stuff further down<br>
> the line, but I'll know for sure if I implement the change.<br>
><br>
> ~<br>
><br>
> Artem's response:<br>
><br>
> Yep, i strongly believe that any UnknownVal should be treated as a <br>
> synonym of "not implemented".<br>
><br>
> In *this* example you might also notice that there's no symbol for 'i', <br>
> but it's a concrete integer 0 instead. So you can evaluate the whole <br>
> remainder to 0, unless 'width' is also equal to 0 (in which case the <br>
> answer would be UndefinedVal). Also note that when enabled, DivZero <br>
> checker will refute the 'width == 0' branch.<br>
><br>
> In the general case you have no choice but to produce an IntSymExpr (if <br>
> 'i' is a concrete integer other than 0) or a SymSymExpr (if 'i' is a <br>
> symbol).<br>
><br>
><br>
> Em qui, 24 de mai de 2018 às 18:15, Mikhail Ramalho<br>
> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>> escreveu:<br>
><br>
><br>
> Total Time time.analyzer.time.sys (s) (mean)<br>
> time.analyzer.time.user (s) (mean) time.analyzer.time.wall (s)<br>
> (mean) <br>
> <br>
> Reported bugs<br>
> Tmux 99.2 0.076 27.253 7.656 <br>
> <br>
> 32<br>
> Tmux + z3 152.88 0.074 56.251 11.505 <br>
> <br>
> 32<br>
> Ratio 154.11% 97.37% 206.40% 150.27% <br>
> Diff 0<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Redis 173.69 0.057 7.083 7.271 <br>
> <br>
> 146<br>
> Redis + z3 193.43 0.057 7.621 7.728 <br>
> <br>
> 140<br>
> Ratio 111.37% 100.00% 107.60% 106.29% <br>
> Diff 6<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> OpenSSL 264.93 0.042 3.31 3.412 <br>
> <br>
> 204<br>
> OpenSSL + z3 213.53 0.035 3.099 3.152 <br>
> <br>
> 204<br>
> Ratio 80.60% 83.33% 93.63% 92.38% <br>
> Diff 0<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Twin 143.17 0.067 6.593 6.696 <br>
> <br>
> 138<br>
> Twin + z3 133.83 0.06 6.79 6.882 <br>
> <br>
> 138<br>
> Ratio 93.48% 89.55% 102.99% 102.78% <br>
> Diff 0<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Git + z3 333.9 0.075 8.52 8.67 <br>
> <br>
> 96<br>
> Git + z3 289.59 0.062 7.924 8.023 <br>
> <br>
> 90<br>
> Ratio 86.73% 82.67% 93.00% 92.54% <br>
> Diff 6<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Postgresql 889.35 0.079 8.482 8.631 <br>
> <br>
> 676<br>
> Postgresql + z3 902.86 0.077 9.694 9.863 <br>
> <br>
> 676<br>
> Ratio 101.52% 97.47% 114.29% 114.27% <br>
> Diff 0<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Sqlite3 1206.3 0.262 368.446 370.786 <br>
> <br>
> 200<br>
> Sqlite3 + z3 1260.85 0.43 407.763 409.688 <br>
> <br>
> 199<br>
> Ratio 104.52% 164.12% 110.67% 110.49% <br>
> Diff 1<br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
><br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> <br>
> Average 104.62% 102.07% 118.37% 109.86% <br>
> <br>
> <br>
><br>
><br>
><br>
> 2018-05-24 15:11 GMT+01:00 Mikhail Ramalho<br>
> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>>:<br>
><br>
> Hi all,<br>
><br>
> This is my first report to the community, comparing the<br>
> results with and without the Z3 refutation when analyzing a<br>
> number of projects.<br>
><br>
> ~<br>
><br>
> First of all, I'd like to thank Réka Kovács as the first<br>
> version of the refutation using Z3 was created by her<br>
> (<a href="https://reviews.llvm.org/D45517" rel="noreferrer" target="_blank">https://reviews.llvm.org/D45517</a>)! Thank you very much!<br>
><br>
> After applying patch D45517, you can use the refutation check<br>
> by using -analyzer-config crosscheck-with-z3=true. Obviously,<br>
> you need a version of clang built with Z3.<br>
><br>
> ~<br>
><br>
> I'm currently analyzing 7 C projects (unfortunately, there's a<br>
> bug preventing us from analyzing FFmpeg):<br>
><br>
> 1. Tmux<br>
> 2. Redis <br>
> 3. OpenSSL<br>
> 4. Twin<br>
> 5. Git<br>
> 6. Postgresql<br>
> 7. Sqlite3<br>
><br>
> In short, the refutation check slows down the verification by<br>
> ~20%. It removed 6 FPs from Redis, 6 FPs from git and 1 FP<br>
> from Sqlite3 (FP means false positive). Surprisingly enough,<br>
> some analysis were faster with the crosscheck, but I'm not<br>
> sure why (maybe ccache?).<br>
><br>
> Attached is a spreadsheet (report1.ods) with some number<br>
> (total time, average time per check, # of reported bugs) and a<br>
> txt with all the raw data from the analysis (raw.txt). I'll<br>
> add these data to google drive for the next report.<br>
><br>
> In order to generate the raw data, you need to use a version<br>
> of clang with assertions enabled, call scan-build.py with<br>
> '-analyzer-config serialize-stats=true' and you need to apply<br>
> patch <a href="https://reviews.llvm.org/D43134" rel="noreferrer" target="_blank">https://reviews.llvm.org/D43134</a>.<br>
><br>
> Thank you very much,<br>
><br>
><br>
> 2018-05-01 15:27 GMT+01:00 Mikhail Ramalho<br>
> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a> <mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank">mikhail.ramalho@gmail.com</a>>>:<br>
><br>
> Hello all,<br>
><br>
> My proposal for GSoC 2018 [0] about Bug Validation in the<br>
> Clang Static Analyzer using the Z3 SMT Solver was accepted.<br>
><br>
> I'll work with George Karpenkov to improve the bug reports<br>
> that the static analyzer produces by reducing the number<br>
> of false bugs.<br>
><br>
> Thank you,<br>
><br>
> [0] <a href="https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g" rel="noreferrer" target="_blank">https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g</a><br>
><br>
> -- <br>
><br>
> Mikhail Ramalho.<br>
><br>
><br>
><br>
><br>
> -- <br>
><br>
> Mikhail Ramalho.<br>
><br>
><br>
><br>
><br>
> -- <br>
><br>
> Mikhail Ramalho.<br>
><br>
><br>
><br>
> -- <br>
><br>
> Mikhail Ramalho.<br>
><br>
><br>
> _______________________________________________<br>
> cfe-dev mailing list<br>
> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
<br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div><div>Mikhail Ramalho.</div></div></div></div>