<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class="">The reason symbols are getting dropped, is because, currently, CSA doesn’t generate symbols for things that RangedConstraintManager doesn’t support, unless taint tracking is enabled. The querying that I mentioned is just a performance optimization, to check the capabilities of the ConstraintManager at runtime and avoid generating unsupported symbols.</div><div class=""><br class=""></div><div class="">If, as Artem suggests, the new policy is to generate accurate symbols, even if the underlying ConstraintManager doesn’t support reasoning over them, then the solution is to identify all the places where unknown values (over-approximations) are being generated, and replace them with the correct SVal. </div><div class=""><br class=""></div><div class="">In your patch below, removing the first part is correct. In fact, I have the same change in D35450. But, the second isn’t, because at that point you already have an unknown value, and must propagate it to maintain correctness of the analysis. </div><div class=""><br class=""></div><div class="">The reason I mention D28955 and D35450, is because I’ve already started on some of these changes, so you don’t have to do everything. I’ll try to rebase and update them this weekend.<br class=""><div class=""><br class=""></div><div class="">Dominic<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jun 26, 2018, at 2:37 PM, Mikhail Ramalho <<a href="mailto:mikhail.ramalho@gmail.com" class="">mikhail.ramalho@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">Hi,<div class=""><br class=""></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;">Another solution, since you're using Z3 already, is to implement runtime<br class="">support for querying the underlying ConstraintManager's about the types<br class="">of constraints that it supports (e.g. canReasonAboutFoo() ). Then, you<br class="">can use this to generate the appropriate SVal's at runtime, which could<br class="">include support for symbolic extension/truncation, remainder, shifts, etc.<br class=""><br class=""></blockquote><div class=""><br class=""></div><div class="">I see, but for the project I proposed to check why the constraints were being dropped, regardless of having the SMT solver enabled or not.</div><div class=""><br class=""></div><div class="">~</div><div class=""><br class=""></div><div class="">Anyway, I removed 4 lines of code and it seems to be working:</div><div class=""><br class=""></div><div class=""><div class="">diff --git a/lib/StaticAnalyzer/Core/SValBuilder.cpp b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div class="">index 137e9a7..39051a8 100644</div><div class="">--- a/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div class="">+++ b/lib/StaticAnalyzer/Core/SValBuilder.cpp</div><div class="">@@ -377,9 +377,6 @@ SVal SValBuilder::makeSymExprValNN(ProgramStateRef State,</div><div class=""> BinaryOperator::Opcode Op,</div><div class=""> NonLoc LHS, NonLoc RHS,</div><div class=""> QualType ResultTy) {</div><div class="">- if (!State->isTainted(RHS) && !State->isTainted(LHS))</div><div class="">- return UnknownVal();</div><div class="">-</div><div class=""> const SymExpr *symLHS = LHS.getAsSymExpr();</div><div class=""> const SymExpr *symRHS = RHS.getAsSymExpr();</div><div class=""> // TODO: When the Max Complexity is reached, we should conjure a symbol</div><div class="">@@ -406,9 +403,6 @@ SVal SValBuilder::evalBinOp(ProgramStateRef state, BinaryOperator::Opcode op,</div><div class=""> if (lhs.isUndef() || rhs.isUndef())</div><div class=""> return UndefinedVal();</div><div class=""> </div><div class="">- if (lhs.isUnknown() || rhs.isUnknown())</div><div class="">- return UnknownVal();</div><div class="">-</div><div class=""> if (lhs.getAs<nonloc::LazyCompoundVal>() ||</div><div class=""> rhs.getAs<nonloc::LazyCompoundVal>()) {</div><div class=""> return UnknownVal();</div></div><div class=""><br class=""></div><div class="">The static analyzer now generates the following constraints for each path:</div><div class=""><br class=""></div><div class=""><div class="">0U % (reg_$0<unsigned int width>): { [1, 4294967295] }</div></div><div class=""><br class=""></div><div class="">and </div><div class=""><br class=""></div><div class=""><div class="">0U % (reg_$0<unsigned int width>): { [0, 0] }</div></div><div class=""><br class=""></div><div class="">And Z3 is able to remove the false positive.</div><div class=""><br class=""></div><div class="">Now I need to understand the impact of the changes, specially the taint check removal.</div><div class=""><br class=""></div><div class="">Thank you,</div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><br class="">Dominic<br class=""><br class="">On 6/26/2018 1:23 PM, Mikhail Ramalho via cfe-dev wrote:<br class="">> Hi all,<br class="">><br class="">> Hi guys,<br class="">><br class="">> I'm investigating the constraints being dropped, here's what I got so far.<br class="">><br class="">> First of all, I'm using the following program:<br class="">><br class="">> void foo(unsigned width)<br class="">> {<br class="">> int base;<br class="">> int i = 0;<br class="">><br class="">> if (!(i % width))<br class="">> base = 1;<br class="">><br class="">> if(base == 1)<br class="">> abort();<br class="">> }<br class="">><br class="">> I started by looking at ExprEngine::processBranch, where <br class="">><br class="">> SVal X = PrevState->getSVal(Condition, PredI->getLocationContext());<br class="">><br class="">> returns Unknown for the remainder condition. The getSVal ends up<br class="">> looking in a map for the value result, so I found that bindExpr fills<br class="">> that map.<br class="">><br class="">> Going back a bit, into ExprEngine::Visit, when evaluating a<br class="">> Stmt::BinaryOperatorClass (regardless of the eagerly assume flag)<br class="">> ExprEngine::VisitBinaryOperator is called, which eventually calls<br class="">> evalBinOp and, since it doesn't understand remainder, it returns<br class="">> unknown and BindExpr is never called.<br class="">><br class="">> Back to ExprEngine::processBranch, when the symbol is unknown the<br class="">> following piece of code is called:<br class="">><br class="">> // If the condition is still unknown, give up.<br class="">> if (X.isUnknownOrUndef()) {<br class="">> builder.generateNode(PrevState, true, PredI);<br class="">> builder.generateNode(PrevState, false, PredI);<br class="">> continue;<br class="">> }<br class="">><br class="">> and the condition is simply ignored. When the result is defined, it<br class="">> creates two nodes assuming the constraints.<br class="">><br class="">> ~<br class="">><br class="">> My idea is when the SVal is undef or unknown, instead of generating<br class="">> two nodes with no knowledge about the constraints, we could create<br class="">> two ranged constraints, like:<br class="">><br class="">> i % width: [0,0]<br class="">><br class="">> and <br class="">><br class="">> i % width: [1,1]<br class="">><br class="">> for each path. That way we can keep the constraints with reasonable<br class="">> values.<br class="">><br class="">> What do you think? It feels like this will break stuff further down<br class="">> the line, but I'll know for sure if I implement the change.<br class="">><br class="">> ~<br class="">><br class="">> Artem's response:<br class="">><br class="">> Yep, i strongly believe that any UnknownVal should be treated as a <br class="">> synonym of "not implemented".<br class="">><br class="">> In *this* example you might also notice that there's no symbol for 'i', <br class="">> but it's a concrete integer 0 instead. So you can evaluate the whole <br class="">> remainder to 0, unless 'width' is also equal to 0 (in which case the <br class="">> answer would be UndefinedVal). Also note that when enabled, DivZero <br class="">> checker will refute the 'width == 0' branch.<br class="">><br class="">> In the general case you have no choice but to produce an IntSymExpr (if <br class="">> 'i' is a concrete integer other than 0) or a SymSymExpr (if 'i' is a <br class="">> symbol).<br class="">><br class="">><br class="">> Em qui, 24 de mai de 2018 às 18:15, Mikhail Ramalho<br class="">> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a><span class="Apple-converted-space"> </span><mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a>>> escreveu:<br class="">><br class="">><br class="">> Total Time time.analyzer.time.sys (s) (mean)<br class="">> time.analyzer.time.user (s) (mean) time.analyzer.time.wall (s)<br class="">> (mean) <span class="Apple-converted-space"> </span><br class="">> <br class="">> Reported bugs<br class="">> Tmux 99.2 0.076 27.253 7.656 <br class="">> <br class="">> 32<br class="">> Tmux + z3 152.88 0.074 56.251 11.505 <span class="Apple-converted-space"> </span><br class="">> <br class="">> 32<br class="">> Ratio 154.11% 97.37% 206.40% 150.27% <br class="">> Diff 0<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Redis 173.69 0.057 7.083 7.271 <br class="">> <br class="">> 146<br class="">> Redis + z3 193.43 0.057 7.621 7.728 <br class="">> <br class="">> 140<br class="">> Ratio 111.37% 100.00% 107.60% 106.29% <br class="">> Diff 6<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> OpenSSL 264.93 0.042 3.31 3.412 <br class="">> <br class="">> 204<br class="">> OpenSSL + z3 213.53 0.035 3.099 3.152 <br class="">> <br class="">> 204<br class="">> Ratio 80.60% 83.33% 93.63% 92.38% <span class="Apple-converted-space"> </span><br class="">> Diff 0<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Twin 143.17 0.067 6.593 6.696 <br class="">> <br class="">> 138<br class="">> Twin + z3 133.83 0.06 6.79 6.882 <br class="">> <br class="">> 138<br class="">> Ratio 93.48% 89.55% 102.99% 102.78% <br class="">> Diff 0<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Git + z3 333.9 0.075 8.52 8.67 <span class="Apple-converted-space"> </span><br class="">> <br class="">> 96<br class="">> Git + z3 289.59 0.062 7.924 8.023 <br class="">> <br class="">> 90<br class="">> Ratio 86.73% 82.67% 93.00% 92.54% <span class="Apple-converted-space"> </span><br class="">> Diff 6<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Postgresql 889.35 0.079 8.482 8.631 <br class="">> <br class="">> 676<br class="">> Postgresql + z3 902.86 0.077 9.694 9.863 <br class="">> <br class="">> 676<br class="">> Ratio 101.52% 97.47% 114.29% 114.27% <br class="">> Diff 0<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Sqlite3 1206.3 0.262 368.446 370.786 <br class="">> <br class="">> 200<br class="">> Sqlite3 + z3 1260.85 0.43 407.763 409.688 <br class="">> <br class="">> 199<br class="">> Ratio 104.52% 164.12% 110.67% 110.49% <br class="">> Diff 1<br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">><br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> <br class="">> Average 104.62% 102.07% 118.37% 109.86% <br class="">> <br class="">> <br class="">><br class="">><br class="">><br class="">> 2018-05-24 15:11 GMT+01:00 Mikhail Ramalho<br class="">> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a><span class="Apple-converted-space"> </span><mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a>>>:<br class="">><br class="">> Hi all,<br class="">><br class="">> This is my first report to the community, comparing the<br class="">> results with and without the Z3 refutation when analyzing a<br class="">> number of projects.<br class="">><br class="">> ~<br class="">><br class="">> First of all, I'd like to thank Réka Kovács as the first<br class="">> version of the refutation using Z3 was created by her<br class="">> (<a href="https://reviews.llvm.org/D45517" rel="noreferrer" target="_blank" class="">https://reviews.llvm.org/D45517</a>)! Thank you very much!<br class="">><br class="">> After applying patch D45517, you can use the refutation check<br class="">> by using -analyzer-config crosscheck-with-z3=true. Obviously,<br class="">> you need a version of clang built with Z3.<br class="">><br class="">> ~<br class="">><br class="">> I'm currently analyzing 7 C projects (unfortunately, there's a<br class="">> bug preventing us from analyzing FFmpeg):<br class="">><br class="">> 1. Tmux<br class="">> 2. Redis <br class="">> 3. OpenSSL<br class="">> 4. Twin<br class="">> 5. Git<br class="">> 6. Postgresql<br class="">> 7. Sqlite3<br class="">><br class="">> In short, the refutation check slows down the verification by<br class="">> ~20%. It removed 6 FPs from Redis, 6 FPs from git and 1 FP<br class="">> from Sqlite3 (FP means false positive). Surprisingly enough,<br class="">> some analysis were faster with the crosscheck, but I'm not<br class="">> sure why (maybe ccache?).<br class="">><br class="">> Attached is a spreadsheet (report1.ods) with some number<br class="">> (total time, average time per check, # of reported bugs) and a<br class="">> txt with all the raw data from the analysis (raw.txt). I'll<br class="">> add these data to google drive for the next report.<br class="">><br class="">> In order to generate the raw data, you need to use a version<br class="">> of clang with assertions enabled, call scan-build.py with<br class="">> '-analyzer-config serialize-stats=true' and you need to apply<br class="">> patch <a href="https://reviews.llvm.org/D43134" rel="noreferrer" target="_blank" class="">https://reviews.llvm.org/D43134</a>.<br class="">><br class="">> Thank you very much,<br class="">><br class="">><br class="">> 2018-05-01 15:27 GMT+01:00 Mikhail Ramalho<br class="">> <<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a><span class="Apple-converted-space"> </span><mailto:<a href="mailto:mikhail.ramalho@gmail.com" target="_blank" class="">mikhail.ramalho@gmail.com</a>>>:<br class="">><br class="">> Hello all,<br class="">><br class="">> My proposal for GSoC 2018 [0] about Bug Validation in the<br class="">> Clang Static Analyzer using the Z3 SMT Solver was accepted.<br class="">><br class="">> I'll work with George Karpenkov to improve the bug reports<br class="">> that the static analyzer produces by reducing the number<br class="">> of false bugs.<br class="">><br class="">> Thank you,<br class="">><br class="">> [0] <a href="https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g" rel="noreferrer" target="_blank" class="">https://docs.google.com/document/d/1-zNSv0l4WyoxYpJUAw8LFnQq_TY4AGjIpPu1VPkmO-g</a><br class="">><br class="">> --<span class="Apple-converted-space"> </span><br class="">><br class="">> Mikhail Ramalho.<br class="">><br class="">><br class="">><br class="">><br class="">> --<span class="Apple-converted-space"> </span><br class="">><br class="">> Mikhail Ramalho.<br class="">><br class="">><br class="">><br class="">><br class="">> --<span class="Apple-converted-space"> </span><br class="">><br class="">> Mikhail Ramalho.<br class="">><br class="">><br class="">><br class="">> --<span class="Apple-converted-space"> </span><br class="">><br class="">> Mikhail Ramalho.<br class="">><br class="">><br class="">> _______________________________________________<br class="">> cfe-dev mailing list<br class="">><span class="Apple-converted-space"> </span><a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a><br class="">><span class="Apple-converted-space"> </span><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br class=""><br class=""><br class=""></blockquote></div><br clear="all" class=""><div class=""><br class=""></div>--<span class="Apple-converted-space"> </span><br class=""><div dir="ltr" class="gmail_signature"><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Mikhail Ramalho.</div></div></div></div></div></blockquote></div><br class=""></div></div></body></html>