[cfe-dev] [analyzer] exploration strategies and paths
George Karpenkov via cfe-dev
cfe-dev at lists.llvm.org
Wed Jan 31 17:55:24 PST 2018
Patch available at https://reviews.llvm.org/D42775
> On Jan 31, 2018, at 5:00 PM, George Karpenkov <ekarpenkov at apple.com> wrote:
>
> The list did not like a posting with many images,
> so I have posted an evaluation to phabricator: https://reviews.llvm.org/M1 <https://reviews.llvm.org/M1>
>
> The text part was:
>
> After fixing a few bugs, another evaluation of the approach shows considerably better results.
>
> On openssl:
>
> 9 reports added
> 1 report removed
> On postgresql:
>
> 377 reports added
> 43 reports removed
> On sqlite3 + a few other misc files:
>
> 239 reports added
> 1 report removed
> Note on histograms (here and below)
>
> -> Histograms only show the ratio for same bugs (compared using issue hash),
> that is, if the histogram says “decrease by a factor of three”, it means the new approach finds the *same* bug
> with a path size 1/3d of the original
> -> Histograms omit data points where the path length has remained the same
> (as otherwise they completely dominate the histogram)
> -> Relative histograms are provided as both ratio and logarithm of the ratio.
> Logarithms of the ratio are convenient as they are symmetric in case changes balance out
> (e.g. log(1/2) = -log(2/1))
>
>> On Jan 30, 2018, at 4:23 PM, George Karpenkov <ekarpenkov at apple.com <mailto:ekarpenkov at apple.com>> wrote:
>>
>> Preliminary evaluation of a patch which prefers exploring nodes associated with statements which weren’t seen before first:
>>
>> On openssl:
>>
>> - Adds four reports
>> - Removes four reports
>> - Path lengths before: 317, 75, 75, 72, 70, 58, 50, 50, 44, 36, 23, 23, 21, 21, 20, 20, 19, 19, 19, 19, 18, 18, 18, 16, 15, 15, 15, 14, 13, 13, 12, 11, 11, 9, 7, 7, 6, 4
>> - Path lengths after: 72, 60, 59, 53, 53, 52, 46, 38, 37, 30, 29, 28, 23, 21, 20, 19, 19, 19, 19, 19, 18, 16, 15, 15, 15, 15, 13, 13, 12, 12, 11, 9, 8, 7, 7, 7, 6, 4
>>
>> The quality of the added reports seems higher, mainly due to the fact that report length is shorter.
>>
>> On postgresql:
>>
>> - Added 80 reports
>> - Removed 154 reports
>> -> Out of those, 72 are reports on the yacc/bison autogenerated files, so whatever the cause is, good thing they are gone
>> - The overall number of reports is 1188
>> - Path lengths are lower on overall, but not in such a dramatic way
>> - For many reports, I am quite confused as to why they got removed
>>
>> On sqlite:
>>
>> - 7 inserted, 7 removed
>>
>>> On Jan 30, 2018, at 1:10 PM, Artem Dergachev <noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>> wrote:
>>>
>>>
>>>
>>> On 30/01/2018 12:40 PM, Gábor Horváth via cfe-dev wrote:
>>>> Hi George, Artem,
>>>>
>>>> I am glad that you are looking into this problem!
>>>>
>>>> On 30 January 2018 at 01:12, George Karpenkov via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org> <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I was investigating recently bug reports with very long analyzer
>>>> paths (more than a few hundred nodes).
>>>> In many of such cases the path is long for no good reason: namely,
>>>> the analyzer would go 3 times around the loop before
>>>> going further.
>>>> The issue is surprisingly common, and it was exacerbated with a
>>>> recent bump of analyzer thresholds.
>>>>
>>>> The problem is reproduced on the following file:
>>>>
>>>> ```
>>>> extern int coin();
>>>>
>>>> int foo() {
>>>> int *x = 0;
>>>> while (coin()) {
>>>> if (coin())
>>>> return *x;
>>>> }
>>>> return 0;
>>>> }
>>>>
>>>> void bar() {
>>>> while(coin())
>>>> if (coin())
>>>> foo();
>>>> }
>>>> ```
>>>>
>>>> While a shortest path to the error does not loop around, the
>>>> current version of the analyzer
>>>> will go around the loop three times before going further.
>>>> (and we are quite fortunate that the unrolling limit for loops is
>>>> three, otherwise it would keep going
>>>> until the unrolling limit is reached).
>>>>
>>>> Multiple issues were discovered during the investigation.
>>>>
>>>> 1. Analyzer queue does not have a concept of priority, and
>>>> performs a simple DFS by default.
>>>> Thus if the successor of the if-branch under the loop in “bar"
>>>> containing the desired destination is generated second,
>>>> it will never be evaluated until the loop exploration limit is
>>>> exhausted.
>>>>
>>>> 2. The previous issue slows down the exploration, but is not
>>>> enough to get a pathological behavior of ultra-long paths.
>>>> The second problem is a combination of:
>>>> a) Block counter is not a part of a node's identity, and node A
>>>> with a small block counter can be merged into a node B with a
>>>> large block counter,
>>>> and the resulting node will have a block counter associated with B.
>>>>
>>>>
>>>> Sorry for the questions, just wanted to clarify some things. You mean ExplodedNodes? By merge, you mean the same thing as "caching-out"?
>>>
>>> Yeah, George notices that if we construct the same ExplodedNode on two different paths that have different block counts, we'd cache-out on the latter path, while the worklist element of the first path would still possess the original block count.
>>>
>>> Which happens a lot when we're evaluating foo() conservatively in this example.
>>>
>>> This isn't directly related to our problem though, as i noticed in http://lists.llvm.org/pipermail/cfe-dev/2018-January/056719.html <http://lists.llvm.org/pipermail/cfe-dev/2018-January/056719.html> .
>>>
>>>
>>>> b) The issue in (a) is triggered due to our heuristic to abandon
>>>> the function’s exploration and switch to conservative evaluation
>>>> if we are already *inside* the function and the block limit has
>>>> been reached.
>>>>
>>>> Issue (1) combined with (2-b) causes the problematic behavior: the
>>>> issue is discovered on the longest path first,
>>>> and by the time the shortest path gets to “bar”, the block limit
>>>> is already reached, and the switch to conservative evaluation is
>>>> performed.
>>>>
>>>> Thus there are two mitigation strategies currently being evaluated:
>>>>
>>>> i) Remove the heuristic in (2-b)
>>>> ii) Use a priority queue to hold nodes which should be explored;
>>>> prefer nodes which give new source code coverage over others
>>>> (or alternatively prefer nodes with least depth of loop stack)
>>>>
>>>> Me and Artem have evaluated the option (i) and the results were
>>>> surprisingly good: some reports disappear, and slightly more
>>>> reports reappear.
>>>> The quality of the new reports seems to be slightly better, and I
>>>> am still trying to figure out exact reasons.
>>>> I suspect merges resulting from heuristic (2-b) cause us to lose
>>>> some actually valid reports.
>>>>
>>>>
>>>> I also find the results surprising. If you have more information about the reasons please do not forget to follow up this thread. We are curious :)
>>>>
>>>>
>>>> Option (ii) has not been evaluated fully yet, but current
>>>> experiments show slightly more reports (5-10%), and a radical
>>>> decline in report lengths
>>>> (e.g. from 400+ to <100 for largest reports)
>>>>
>>>> Are there any thoughts on the matter?
>>>>
>>>> Personally I think we should do both (i) and (ii), even if they
>>>> would shake up the results.
>>>> - The original idea for heuristics (2-b) was to be able to produce
>>>> a report even if we are out of budget, but since it actually
>>>> results in less reports,
>>>> I think the data does not validate the approach.
>>>>
>>>> - Option (ii) is AFAIK how most similar engines work, and should
>>>> get us much larger coverage (and shorter paths) for the same node
>>>> budget,
>>>> even at the cost of log(N) overhead of the priority queue.
>>>> Moreover, not having the priority queue will bite us later if we
>>>> ever decide to further
>>>> increase the analyzer budget or to increase the unroll limit.
>>>>
>>>>
>>>> I wonder what will the performance implication be. But I also like the idea of having a priority queue. If we find that we get more and better report
>>>> but also have worse performance, we can also consider reducing the analysis budget slightly.
>>>>
>>>> Regards,
>>>> Gábor
>>>>
>>>>
>>>> George
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org> <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180131/761bb912/attachment.html>
More information about the cfe-dev
mailing list