[cfe-dev] [analyzer] exploration strategies and paths

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Jan 29 18:31:41 PST 2018



On 29/01/2018 6:24 PM, Péter Szécsi wrote:
>
>
> 2018-01-30 2:43 GMT+01:00 Artem Dergachev <noqnoqneo at gmail.com 
> <mailto:noqnoqneo at gmail.com>>:
>
>
>
>     On 29/01/2018 5:36 PM, Péter Szécsi wrote:
>
>         Hi George and Artem,
>
>         2018-01-30 1:44 GMT+01:00 Artem Dergachev via cfe-dev
>         <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>         <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>>:
>
>
>
>             On 29/01/2018 4:12 PM, George Karpenkov via cfe-dev wrote:
>
>                 Hi All,
>
>                 I was investigating recently bug reports with very long
>                 analyzer paths (more than a few hundred nodes).
>                 In many of such cases the path is long for no good reason:
>                 namely, the analyzer would go 3 times around the loop
>         before
>                 going further.
>                 The issue is surprisingly common, and it was
>         exacerbated with
>                 a recent bump of analyzer thresholds.
>
>
>             Yeah, i guess everybody who used the analyzer has seen some of
>             those nasty reports with iterating over loops 4 times.
>         It's like
>             why does it find the issue on the last iteration rather
>         than on
>             the first iteration, given that we use a depth-first
>         strategy? So
>             it's a great long-overdue thing to fix.
>
>             George, do you have any non-internal before/after html
>         reports to
>             attach?
>
>
>                 The problem is reproduced on the following file:
>
>                 ```
>                 extern int coin();
>
>                 int foo() {
>                      int *x = 0;
>                      while (coin()) {
>                          if (coin())
>                              return *x;
>                      }
>                      return 0;
>                 }
>
>                 void bar() {
>                      while(coin())
>                          if (coin())
>                              foo();
>                 }
>                 ```
>
>                 While a shortest path to the error does not loop
>         around, the
>                 current version of the analyzer
>                 will go around the loop three times before going further.
>                 (and we are quite fortunate that the unrolling limit
>         for loops
>                 is three, otherwise it would keep going
>                 until the unrolling limit is reached).
>
>                 Multiple issues were discovered during the investigation.
>
>                 1. Analyzer queue does not have a concept of priority, and
>                 performs a simple DFS by default.
>                 Thus if the successor of the if-branch under the loop
>         in “bar"
>                 containing the desired destination is generated second,
>                 it will never be evaluated until the loop exploration
>         limit is
>                 exhausted.
>
>                 2. The previous issue slows down the exploration, but
>         is not
>                 enough to get a pathological behavior of ultra-long paths.
>                 The second problem is a combination of:
>                 a) Block counter is not a part of a node's identity,
>         and node
>                 A with a small block counter can be merged into a node
>         B with
>                 a large block counter,
>                 and the resulting node will have a block counter
>         associated
>                 with B.
>                 b) The issue in (a) is triggered due to our heuristic to
>                 abandon the function’s exploration and switch to
>         conservative
>                 evaluation
>                 if we are already *inside* the function and the block
>         limit
>                 has been reached.
>
>                 Issue (1) combined with (2-b) causes the problematic
>         behavior:
>                 the issue is discovered on the longest path first,
>                 and by the time the shortest path gets to “bar”, the block
>                 limit is already reached, and the switch to conservative
>                 evaluation is performed.
>
>
>             2-a is not even required here.
>
>             With our DFS exploration order, on every iteration of the
>             while-loop within bar(), we'd take the false-branch of
>         if() within
>             bar() from the worklist, see that it goes back to loop,
>         and end up
>             with new true-branch and false-branch nodes of the next
>         iteration
>             on the top of the worklist. Then we pop the false-branch
>         again,
>             etc., until we run out of block count limit while having 4
>             true-branches in the worklist. Those would therefore
>         evaluate in
>             the opposite order, and the first time we enter foo() we'd
>         be on
>             the 4th iteration.
>
>             This situation can happen regardless of in which order we
>         evaluate
>             if()-branches, by slightly modifying the example. So if
>         the idea
>             in the previous paragraph is unclear, it should still be
>         obvious
>             that sometimes we'd run into a function call on the longer
>         path
>             earlier than on a shorter path.
>
>             Now, once we enter foo() and immediately find the bug, we
>         also run
>             out of block count limit within foo(). Recall that we are
>         on the
>             4th iteration of the while-loop in bar(), and here is
>         where the
>             bug is found. Now, once evaluation of foo() is over, we record
>             that we failed to fully inline it, so it's probably too
>         complex,
>             so let's evaluate it conservatively.
>
>             It means that on 3th, 2nd, 1st iteration we won't be able
>         to find
>             the bug, because foo() is evaluated conservatively. So
>         we're stuck
>             with the long report forever.
>
>         I do not exactly see it why. If I'm not mistaken you described
>         the replay-without-inlining heuristics. However, I believe
>         this information is stored in the state which means that this
>         only affects one path (in this example the 4th iteration bug
>         finding path). But whenever we simulate the path of the
>         3rd/2nd/1st iteration where the if(coin()) is true, that is
>         another path. Maybe I just do not see something trivial os
>         just misunderstood something but could you explain me, why
>         does it affect other paths?
>
>
>     Nope, it's not part of the program state. It's in
>     FunctionSummaries, which is a field in CoreEngine. So whenever we
>     find a function we were unable to inline even once during
>     analysis, we'd never inline it anymore on any branch. See stuff
>     around markReachedMaxBlockCount().
>
> Hmm I've missed that but checked it now. Thanks! (Im a little bit 
> confused then, why ReplayWithoutInlining stuff is stored in the state 
> as well. It seems it is used only for sanity checks but I havent gone 
> deep into that in the previous minutes.)

I guess there are two parts of the heuristic - the 
replay-without-inlining-like-i-mean-right-now part and the 
dont-ever-try-to-inline-this-thing-again-im-sick-of-it-already part. The 
former is what we need to immediately do on the current path, so it 
needs support on the program state side to remember that we need to 
re-inline it. I'm not sure it's *actually* needed though. The latter is 
path-insensitive.


> In this case I would suggest this alternative idea that - instead of 
> removing the heuristic - it could go into the ProgramState. I mean, it 
> does make sense to me, not to reinline functions which will probably 
> exceed the block count (again). However, this would still solve the 
> above mentioned problem since we would find the bug on the shorter 
> path as well. (If they are marked as same, the analyzer reports it 
> with the shorter bugpath, I believe). What do you think?

This is indeed a more careful middle-ground change than cutting out the 
heuristic entirely, but i guess this is an intentional part of the 
heuristic - we are unlikely to be able to fully explore the function on 
the other path anyway, so why bother. We'd still need to understand why 
do positives skew in a particular direction. But that's definitely an 
option.

>
>                 Thus there are two mitigation strategies currently being
>                 evaluated:
>
>                 i) Remove the heuristic in (2-b)
>                 ii) Use a priority queue to hold nodes which should be
>                 explored; prefer nodes which give new source code coverage
>                 over others
>                 (or alternatively prefer nodes with least depth of
>         loop stack)
>
>                 Me and Artem have evaluated the option (i) and the results
>                 were surprisingly good: some reports disappear, and
>         slightly
>                 more reports reappear.
>                 The quality of the new reports seems to be slightly
>         better,
>                 and I am still trying to figure out exact reasons.
>
>
>             Yeah, i guess some explanation is necessary here. The skew of
>             results is pretty huge, and it's surprising that the number of
>             reports actually increases.
>
>         Just to make sure: Does the number of actually *different
>         *reports increases? In case of a missing uniquing location a
>         checker could generate a lot of "spam", so find and report the
>         same bug on different paths. (And turning off these heuristics
>         could lead into that. Probably you already checked this, but
>         seems really suspicious.)
>
>
>     Yeah, it's like +600/-300 unique reports on my nightly internal
>     codebase run, which is a huge skew.
>
>
>
>         Another interesting stuff about (i) could be that how many
>         times we reached a max size ExplodedGraph (max number of
>         steps) and how many more steps we do because of turning of
>         these heuristics? I feel like this change should result higher
>         analysis time but less coverage overall. However, I am not
>         sure how significant these changes are. So, if you manage to
>         find more valuable bugs then these changes worth it, I guess.
>
>             Just to be clear, both replay-without-inlining and
>             dont-inline-again-after-bailout heuristics were disabled
>         in this test.
>
>                 I suspect merges resulting from heuristic (2-b) cause
>         us to
>                 lose some actually valid reports.
>
>
>             Because replay-without-inlining is disabled, there should
>         not be
>             many merges.
>
>                 Option (ii) has not been evaluated fully yet, but current
>                 experiments show slightly more reports (5-10%), and a
>         radical
>                 decline in report lengths
>                 (e.g. from 400+ to <100 for largest reports)
>
>                 Are there any thoughts on the matter?
>
>                 Personally I think we should do both (i) and (ii), even if
>                 they would shake up the results.
>                 - The original idea for heuristics (2-b) was to be able to
>                 produce a report even if we are out of budget, but
>         since it
>                 actually results in less reports,
>                 I think the data does not validate the approach.
>
>                 - Option (ii) is AFAIK how most similar engines work, and
>                 should get us much larger coverage (and shorter paths)
>         for the
>                 same node budget,
>                 even at the cost of log(N) overhead of the priority queue.
>                 Moreover, not having the priority queue will bite us
>         later if
>                 we ever decide to further
>                 increase the analyzer budget or to increase the unroll
>         limit.
>
>
>             In the example above, (ii) means evaluating the first
>         true-branch
>             of the if() in bar() before the second false-branch of the
>         if() in
>             bar(), simply because it's *on an earlier loop iteration*.
>         This,
>             indeed, sounds like the right thing to do, like, logically,
>             hopefully we'd be able to confirm this with a more careful
>         evaluation.
>
>         Option (ii) is something I personally really support and would
>         like to see implemented in the analyzer. I was already
>         thinking on this change earlier but did not seem easy to come
>         up with a reliable heuristic for that. The aim is clear and
>         great but how would you do it? Would you rate the
>         possibilities based on the number of visits of the possible
>         next analyzed blocks?
>
>
>         Thanks,
>         Peter
>
>
>




More information about the cfe-dev mailing list