[cfe-dev] [StaticAnalyzer] LoopUnrolling measurements
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Mon Aug 28 08:24:34 PDT 2017
This sounds good to me to be enabled by default. You've overcome a huge
bunch of troubles to get that far, and these results are quite
impressive. The analyzer no longer gives up on straightforward paths
that take just a bit longer than expected, and even if unrolling the
same loop more times doesn't bring an immediate benefit, getting past
the loop to cover the blocks behind it seems to pay off very well indeed.
On 8/28/17 5:35 AM, Péter Szécsi via cfe-dev wrote:
> Some further measurements with the limit of 128 resulted in better
> statistics!
>
> Time measurements (average of 4 or 5 runs, depending on the project):
>
>                  curl    libpng  vim       bitcoin  FFmpeg    xerces    LLVM
> Normal           52.63   62.872  191.9225  208.25   526.215   229.9425  5044.62
> Unroll_limit     54.402  59.946  198.5075  212.94   646.3325  234.045   5125.96
> Unroll_limit_v2  51.952  58.908  193.2825  200.792  543.86    225.8675  4838.26
>
>
> Coverage measurements (% of reachable basic blocks statistics):
>
>                  curl   libpng  vim    bitcoin  FFmpeg  xerces  LLVM
> Normal           58.05  55.91   51.13  68.58    49.78   74.86   71.15
> Unroll_limit     69.23  56.04   51.46  68.78    62.07   74.91   71.14
> Unroll_limit_v2  69.14  56.04   51.53  68.7     52.46   74.91   71.13
>
>
> Summary table showing the deviation (in percent) from the normal run:
>
>                         curl   libpng  vim   bitcoin  FFmpeg  xerces  LLVM
> Unroll_limit - time     3.37   -4.7    3.43  2.25     22.8    1.78    1.61
> Unroll_limit - cov      11.18  0.13    0.33  0.2      12.29   0.05    -0.01
> Unroll_limit_v2 - time  -1.29  -6.31   0.71  -3.58    3.35    -1.78   -4.1
> Unroll_limit_v2 - cov   11.09  0.13    0.4   0.12     2.68    0.05    -0.02
>
>
> So according to these numbers the coverage has increased for (almost)
> every project while the time of the analysis has not changed
> drastically. In some cases it even decreased.
>
> The number of findings:
>
>            curl  libpng  vim  bitcoin  FFmpeg  xerces  LLVM
> Normal     35    32      81   10       375     52      181
> Unroll     27    33      81   10       368     52      184
> Unroll v2  27    32      81   10       369     52      184
>
>
> Most of the lost findings (curl, FFmpeg) come from the fact that those
> bugs were found in complex functions where the analysis reached the
> maximum node number of the ExplodedGraph. LoopUnrolling created some
> additional paths (more precisely, some paths were no longer aborted as
> before), so the analyzer exhausted its budget before reaching the
> remaining bugs. Some other bugs (FFmpeg) could disappear because we
> inlined more functions (or at least different functions), so we missed
> bugs that were only found when the containing function was analyzed as
> a top-level function. On the other hand, this also means that other
> functions happened to be analyzed as top-level, and these produced some
> new findings beyond the ones that come from the increased coverage
> caused by the unrolling.
>
> In view of these measurements I would suggest enabling this feature by
> default, or at least considering it. (The main counter-argument could
> be that the concrete implementation can still be improved, since it
> stores its bookkeeping data in the State, while it could be stored in
> the LocationContext instead. But I think that, implementation details
> aside, the functionality itself and the measurement results are great.)
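>
> To illustrate what "stores data in the State" means, here is a minimal,
> hypothetical sketch (not the exact patch) of a program-state set trait
> that remembers which loops are currently being completely unrolled; the
> names UnrolledLoops, markAsUnrolled and isUnrolledLoop are made up for
> the example:
>
> #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
> #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
>
> using namespace clang;
> using namespace ento;
>
> // Set of loop statements that were decided to be completely unrolled.
> REGISTER_SET_WITH_PROGRAMSTATE(UnrolledLoops, const Stmt *)
>
> // Mark a loop as unrolled on the current path.
> static ProgramStateRef markAsUnrolled(ProgramStateRef State,
>                                       const Stmt *Loop) {
>   return State->add<UnrolledLoops>(Loop);
> }
>
> // Check whether a loop is being unrolled on the current path.
> static bool isUnrolledLoop(ProgramStateRef State, const Stmt *Loop) {
>   return State->contains<UnrolledLoops>(Loop);
> }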
>
> What do you think?
>
> Peter
>
>
> 2017-08-24 17:32 GMT+02:00 Péter Szécsi <peterszecsi95 at gmail.com>:
>
> Hello everyone,
>
> I am working on improving the loop modeling strategy of the Clang
> Static Analyzer.
> In the current state of the analyzer, loops are handled quite
> simply: each loop is unrolled 4 times by default, and then the
> analysis of any path on which the loop would be unrolled further is cut.
>
> One motivation, if nothing else, is that there have already been
> questions about this behaviour on the cfe-dev mailing list; it was
> even thought to be a bug.
> If we want a goal that can be expressed more objectively, then the
> coverage of the analysis is the closest metric that shows the impact
> of the loop modeling. Why?
> Because the naive loop handling (described above) can easily leave
> code lines unsimulated/unchecked. A small example of that:
>
> void f() {
>   int a[6];
>   for (int i = 0; i < 6; i++)
>     a[i] = i + 2;
>   // complex body that the analyzer never reaches,
>   // because the path is cut at the 4th iteration
> }
>
> In this example the analysis is stopped at the 4th step of the loop
> and is not continued after the loop, which causes a lot of coverage
> loss in this case. To address this problem I experimented with
> identifying specific known-bound loops which are worth unrolling
> completely (e.g. the above loop in f()).
>
> At the moment a loop has to fulfill the following requirements to
> be worth unrolling:
> - Currently only forStmts are considered.
> - The bound has to be an integer literal (e.g. i < 5, i >=
> MACRO_TO_LITERAL).
> - The counter variable (i) has not escaped before/in the body of
> the loop and is changed only in the increment statement of the
> loop. It also has to be initialized by a literal in the
> corresponding initStmt.
> - The body does not contain a goto, switch or returnStmt.
> This version is marked as *Unroll* in the statistics. (A rough
> matcher-style sketch of this loop shape follows below.)
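>
> As a rough illustration of the loop shape described above (not the
> exact patch), the structural part of the check could be written with
> AST matchers along these lines; the escape analysis and the
> goto/switch/returnStmt check are left out, and simpleKnownBoundLoop is
> just a made-up name:
>
> #include "clang/ASTMatchers/ASTMatchers.h"
>
> using namespace clang::ast_matchers;
>
> // Matches loops like `for (int i = 0; i < 6; i++)`.
> static auto simpleKnownBoundLoop() {
>   // Counter initialized by an integer literal in the init statement.
>   auto CounterInit = hasLoopInit(declStmt(hasSingleDecl(
>       varDecl(hasInitializer(integerLiteral())).bind("counter"))));
>
>   // Condition compares the counter against an integer literal
>   // (only `<` is shown here for brevity).
>   auto LiteralBound = hasCondition(binaryOperator(
>       hasOperatorName("<"),
>       hasLHS(ignoringParenImpCasts(
>           declRefExpr(to(varDecl(equalsBoundNode("counter")))))),
>       hasRHS(ignoringParenImpCasts(integerLiteral()))));
>
>   // The increment changes only the counter variable.
>   auto CounterIncrement = hasIncrement(unaryOperator(hasUnaryOperand(
>       ignoringParenImpCasts(
>           declRefExpr(to(varDecl(equalsBoundNode("counter"))))))));
>
>   return forStmt(CounterInit, LiteralBound, CounterIncrement);
> }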
>
> In addition to that I ran the measurements with a twist: whenever an
> unrolled loop creates new branches, we fall back to treating it as a
> normal (non-unrolled) loop for the rest of the analysis. This is done
> because of the exponential behaviour of the ExplodedGraph (symbolic
> execution): full unrolling could create a lot of new paths which would
> all be analysed but differ only slightly from the paths explored one
> step earlier. This version is marked as *Unroll_v2* in the statistics.
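>
> For illustration (a made-up example, not code from the measured
> projects): in the following known-bound loop every iteration can split
> the path on the unknown return value of cond(), so complete unrolling
> could multiply the number of explored paths up to 2^6. Unroll_v2
> reverts to the normal loop handling once such a split appears:
>
> int g(int (*cond)(int)) {
>   int sum = 0;
>   for (int i = 0; i < 6; i++) {
>     if (cond(i))  // each iteration may split the analysis path in two
>       sum += i;
>   }
>   return sum;
> }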
>
> Time measurements (average of 5 runs):
>
>            curl    libpng  vim      bitcoin   ffmpeg    xerces
> Normal     50.168  62.128  180.404  197.385   501.14    227.6075
> Unroll     54.562  64.186  205.468  1767.475  1127.076  248.5375
> Unroll v2  52.892  63.866  203.476  198.025   693.486   237.87
>
>
> Coverage measurements (% of reachable basic blocks statistics)
>
>            curl   libpng  vim    bitcoin  ffmpeg  xerces
> Normal     58.05  55.91   51.13  68.58    49.78   74.86
> Unroll     69.1   56.04   51.41  68.7     62.18   74.79
> Unroll v2  69.19  56.04   51.61  68.83    52.41   74.8
>
>
> Summary table showing the deviation (in percent) from the normal run:
>
>                    curl   libpng  vim    bitcoin  ffmpeg  xerces
> Unroll - time      8.76   3.31    13.89  795.445  124.9   9.19
> Unroll - coverage  11.05  0.13    0.28   0.12     12.4    -0.07
> Unroll v2 - time   5.43   2.8     12.79  0.324    38.38   4.5
> Unroll v2 - cov    11.14  0.13    0.48   0.25     2.63    -0.06
>
> The most outstanding (negative) difference is that the analysis
> time of the bitcoin project increased and took ~8x more time than in
> the normal case. That was single-handedly caused by one benchmark
> file which contains a lot of for loops that make ~10,000,000 steps.
> (An improvement could be to not unroll loops which take N or more
> steps, where N is a well-chosen number; I am running some
> measurements at the moment with N = 128.)
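>
> As a made-up illustration of the problematic pattern (not the actual
> benchmark code): a loop like the one below satisfies the unrolling
> criteria, so the plain Unroll heuristic would simulate all ~10,000,000
> iterations, while a step limit of N = 128 would exclude it and keep
> the normal handling:
>
> long bench(void) {
>   long acc = 0;
>   // Known-bound loop, but far too many iterations to unroll completely.
>   for (int i = 0; i < 10000000; i++)
>     acc += i;
>   return acc;
> }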
>
>
> The number of findings:
>
>            curl  libpng  vim  bitcoin  ffmpeg  xerces
> Normal     35    32      81   10       375     52
> Unroll     27    33      81   10       367     48
> Unroll v2  27    32      81   10       363     52
>
>
> Most of the time the set of found bugs did not change. On FFmpeg
> there are new findings which come from the fact that we are able to
> unroll loops (and analyze the code lines after them). However, the
> unrolling resulted in the loss of some findings as well. Most of the
> time (curl, ffmpeg) the loss happened because these bugs were found
> in complex functions on long paths, whose analysis exhausted the
> maximum number of nodes in the ExplodedGraph even in the normal
> analysis. With the unroll feature this limit was hit "faster", so
> these paths were not analyzed and thus the bugs were not found.
>
> In conclusion, unrolling the specific loops defined above can have a
> positive impact on the coverage (% of reachable basic blocks);
> however, it comes at the price of affecting the running time. In most
> projects (all except curl) the time increase in percent was higher
> than the coverage increase in percent.
>
> So, all in all, I think this is a feature which can be useful for a
> more thorough analysis but should not be enabled by default at this
> time.
>
> What do you think?
>
> Cheers,
> Peter
>
> Note: if you are interested in the detailed results they are
> available on the following site:
> http://cc.elte.hu/szepet/loop_modeling/unrolling/
>
>
>
>