<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class="">Preliminary evaluation of a patch which prefers exploring nodes associated with statements which weren’t seen before first:</div><div class=""><br class=""></div>On openssl:<div class=""><br class=""></div><div class=""> - Adds four reports</div><div class=""> - Removes four reports </div><div class=""> - Path lengths before: 317, 75, 75, 72, 70, 58, 50, 50, 44, 36, 23, 23, 21, 21, 20, 20, 19, 19, 19, 19, 18, 18, 18, 16, 15, 15, 15, 14, 13, 13, 12, 11, 11, 9, 7, 7, 6, 4</div><div class=""> - Path lengths after: 72, 60, 59, 53, 53, 52, 46, 38, 37, 30, 29, 28, 23, 21, 20, 19, 19, 19, 19, 19, 18, 16, 15, 15, 15, 15, 13, 13, 12, 12, 11, 9, 8, 7, 7, 7, 6, 4</div><div class=""><br class=""></div><div class="">The quality of the added reports seems higher, mainly due to the fact that report length is shorter.</div><div class=""><br class=""></div><div class="">On postgresql:</div><div class=""><br class=""></div><div class=""> - Added 80 reports</div><div class=""> - Removed 154 reports</div><div class=""> -> Out of those, 72 are reports on the yacc/bison autogenerated files, so whatever the cause is, good thing they are gone</div><div class=""> - The overall number of reports is 1188</div><div class=""> - Path lengths are lower on overall, but not in such a dramatic way</div><div class=""> - For many reports, I am quite confused as to why they got removed</div><div class=""><br class=""></div><div class="">On sqlite:</div><div class=""><br class=""></div><div class=""> - 7 inserted, 7 removed</div><div class=""><br class=""></div><div class=""><div class=""><div><blockquote type="cite" class=""><div class="">On Jan 30, 2018, at 1:10 PM, Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" class="">noqnoqneo@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">On 30/01/2018 12:40 PM, Gábor Horváth via cfe-dev wrote:</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">Hi George, Artem,<br class=""><br class="">I am glad that you are looking into this problem!<br class=""><br class="">On 30 January 2018 at 01:12, George Karpenkov via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><span class="Apple-converted-space"> </span><<a href="mailto:cfe-dev@lists.llvm.org" class="">mailto:cfe-dev@lists.llvm.org</a>>> wrote:<br class=""><br class=""> Hi All,<br class=""><br class=""> I was investigating recently bug reports with very long analyzer<br class=""> paths (more than a few hundred nodes).<br class=""> In many of such cases the path is long for no good reason: namely,<br class=""> the analyzer would go 3 times around the loop before<br class=""> going further.<br class=""> The issue is surprisingly common, and it was exacerbated with a<br class=""> recent bump of analyzer thresholds.<br class=""><br class=""> The problem is reproduced on the following file:<br class=""><br class=""> ```<br class=""> extern int coin();<br class=""><br class=""> int foo() {<br class=""> int *x = 0;<br class=""> while (coin()) {<br class=""> if (coin())<br class=""> return *x;<br class=""> }<br class=""> return 0;<br class=""> }<br class=""><br class=""> void bar() {<br class=""> while(coin())<br class=""> if (coin())<br class=""> foo();<br class=""> }<br class=""> ```<br class=""><br class=""> While a shortest path to the error does not loop around, the<br class=""> current version of the analyzer<br class=""> will go around the loop three times before going further.<br class=""> (and we are quite fortunate that the unrolling limit for loops is<br class=""> three, otherwise it would keep going<br class=""> until the unrolling limit is reached).<br class=""><br class=""> Multiple issues were discovered during the investigation.<br class=""><br class=""> 1. Analyzer queue does not have a concept of priority, and<br class=""> performs a simple DFS by default.<br class=""> Thus if the successor of the if-branch under the loop in “bar"<br class=""> containing the desired destination is generated second,<br class=""> it will never be evaluated until the loop exploration limit is<br class=""> exhausted.<br class=""><br class=""> 2. The previous issue slows down the exploration, but is not<br class=""> enough to get a pathological behavior of ultra-long paths.<br class=""> The second problem is a combination of:<br class=""> a) Block counter is not a part of a node's identity, and node A<br class=""> with a small block counter can be merged into a node B with a<br class=""> large block counter,<br class=""> and the resulting node will have a block counter associated with B.<br class=""><br class=""><br class="">Sorry for the questions, just wanted to clarify some things. You mean ExplodedNodes? By merge, you mean the same thing as "caching-out"?<br class=""></blockquote><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">Yeah, George notices that if we construct the same ExplodedNode on two different paths that have different block counts, we'd cache-out on the latter path, while the worklist element of the first path would still possess the original block count.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">Which happens a lot when we're evaluating foo() conservatively in this example.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">This isn't directly related to our problem though, as i noticed in<span class="Apple-converted-space"> </span></span><a href="http://lists.llvm.org/pipermail/cfe-dev/2018-January/056719.html" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">http://lists.llvm.org/pipermail/cfe-dev/2018-January/056719.html</a><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class=""><span class="Apple-converted-space"> </span>.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""> b) The issue in (a) is triggered due to our heuristic to abandon<br class=""> the function’s exploration and switch to conservative evaluation<br class=""> if we are already *inside* the function and the block limit has<br class=""> been reached.<br class=""><br class=""> Issue (1) combined with (2-b) causes the problematic behavior: the<br class=""> issue is discovered on the longest path first,<br class=""> and by the time the shortest path gets to “bar”, the block limit<br class=""> is already reached, and the switch to conservative evaluation is<br class=""> performed.<br class=""><br class=""> Thus there are two mitigation strategies currently being evaluated:<br class=""><br class=""> i) Remove the heuristic in (2-b)<br class=""> ii) Use a priority queue to hold nodes which should be explored;<br class=""> prefer nodes which give new source code coverage over others<br class=""> (or alternatively prefer nodes with least depth of loop stack)<br class=""><br class=""> Me and Artem have evaluated the option (i) and the results were<br class=""> surprisingly good: some reports disappear, and slightly more<br class=""> reports reappear.<br class=""> The quality of the new reports seems to be slightly better, and I<br class=""> am still trying to figure out exact reasons.<br class=""> I suspect merges resulting from heuristic (2-b) cause us to lose<br class=""> some actually valid reports.<br class=""><br class=""><br class="">I also find the results surprising. If you have more information about the reasons please do not forget to follow up this thread. We are curious :)<br class=""><br class=""><br class=""> Option (ii) has not been evaluated fully yet, but current<br class=""> experiments show slightly more reports (5-10%), and a radical<br class=""> decline in report lengths<br class=""> (e.g. from 400+ to <100 for largest reports)<br class=""><br class=""> Are there any thoughts on the matter?<br class=""><br class=""> Personally I think we should do both (i) and (ii), even if they<br class=""> would shake up the results.<br class=""> - The original idea for heuristics (2-b) was to be able to produce<br class=""> a report even if we are out of budget, but since it actually<br class=""> results in less reports,<br class=""> I think the data does not validate the approach.<br class=""><br class=""> - Option (ii) is AFAIK how most similar engines work, and should<br class=""> get us much larger coverage (and shorter paths) for the same node<br class=""> budget,<br class=""> even at the cost of log(N) overhead of the priority queue.<br class=""> Moreover, not having the priority queue will bite us later if we<br class=""> ever decide to further<br class=""> increase the analyzer budget or to increase the unroll limit.<br class=""><br class=""><br class="">I wonder what will the performance implication be. But I also like the idea of having a priority queue. If we find that we get more and better report<br class="">but also have worse performance, we can also consider reducing the analysis budget slightly.<br class=""><br class="">Regards,<br class="">Gábor<br class=""><br class=""><br class=""> George<br class=""><br class=""><br class=""><br class=""> _______________________________________________<br class=""> cfe-dev mailing list<br class=""> <a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><span class="Apple-converted-space"> </span><<a href="mailto:cfe-dev@lists.llvm.org" class="">mailto:cfe-dev@lists.llvm.org</a>><br class=""> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br class=""> <<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>><br class=""><br class=""><br class=""><br class=""><br class="">_______________________________________________<br class="">cfe-dev mailing list<br class=""><a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a></blockquote></div></blockquote></div><br class=""></div></div></body></html>