[cfe-dev] [StaticAnalyzer] LoopUnrolling measurements
Anna Zaks via cfe-dev
cfe-dev at lists.llvm.org
Mon Aug 28 09:01:16 PDT 2017
> On Aug 27, 2017, at 7:35 PM, Péter Szécsi <peterszecsi95 at gmail.com> wrote:
>
> Some further measurements with the limit of 128 resulted better statistics!
>
> Time measurements: (avg of 4 or 5 run depending on the project)
>
> curl libpng vim bitcoin FFmpeg xerces LLVM
> Normal 52.63 62.872 191.9225 208.25 526.215 229.9425 5044.62
> Unroll_limit 54.402 59.946 198.5075 212.94 646.3325 234.045 5125.96
> Unroll_limit_v2 51.952 58.908 193.2825 200.792 543.86 225.8675 4838.26
>
> Coverage measurements (% of reachable basic blocks statistics):
>
> curl libpng vim bitcoin FFmpeg xerces LLVM
> Normal 58.05 55.91 51.13 68.58 49.78 74.86 71.15
> Unroll_limit 69.23 56.04 51.46 68.78 62.07 74.91 71.14
> Unroll_limit_v2 69.14 56.04 51.53 68.7 52.46 74.91 71.13
>
> Summary table which contains the deviation in percent to the normal run.
>
> curl libpng vim bitcoin FFmpeg xerces LLVM
> Unroll_limit – time 3.37 -4.7 3.43 2.25 22.8 1.78 1.61
> Unroll_limit – cov 11.18 0.13 0.33 0.2 12.29 0.05 -0.01
> Unroll_limit_v2 - time -1.29 -6.31 0.71 -3.58 3.35 -1.78 -4.1
> Unroll_limit_v2 – cov 11.09 0.13 0.4 0.12 2.68 0.05 -0.02
>
> So according to these numbers the coverage has increased for (almost) every project while the time of the analysis has not changed drastically. In some cases it even decreased.
>
> The number of findings:
>
> curl libpng vim bitcoin FFmpeg xerces LLVM
> Normal 35 32 81 10 375 52 181
> Unroll 27 33 81 10 368 52 184
> Unroll v2 27 32 81 10 369 52 184
>
> Most of the lost findings (Curl, FFmpeg) come from the fact that those bugs were found in complex function where the analysis reached the maximum node number of the ExplodedGraph. LoopUnrolling caused some more paths (more precisely some path was not aborted as before) so the analyzer could not find the remaining ones. Some other bugs (FFmpeg) could disappear because we inlined more functions (or at least other functions) so we did not find the bug which was only found by analyzing it as a top level function. However, this means that other functions happened to be analyzed as top level and these resulted some new findings other than the ones which come from the increased coverage caused by the unrolling.
>
> In view of these measurements I would suggest enabling this feature by default or at least considering it.
What mode would you recommend turning on by default? Would you suggest setting the loop iteration limit to 128 and using unroll 2 (or something else)?
> (The main counter-argument to this could be that the concrete implementation could be enhanced since it stores data in the State which could be stored in the LocationContext.
How much data are you storing in the state (info about each loop or just cases where the bound is known)? Storing data in the state might effect memory consumption, so it would be important to measure that it did not regress, especially, if you store additional information in most nodes.
> But, I think that other than the implementation details, the functionality itself and the measurement results are great.)
>
> What do you think?
>
You’ve made a lot of progress during the summer. Good work!
Anna
> Peter
>
>
> 2017-08-24 17:32 GMT+02:00 Péter Szécsi <peterszecsi95 at gmail.com <mailto:peterszecsi95 at gmail.com>>:
> Hello everyone,
>
> I am working on improving the loop modeling strategy of the Clang Static Analyzer.
> In the current state of the analyzer it handles loops quite simple. It unrolls it 4 time by default and than cut the analysis of that path where the loop would have been unrolled more times.
>
> One motivation, if not anything else, can be that there were already questions on this behaviour in the cfe-dev mailing list, even it was thought as a bug.
> If we want to have a motivation goal which can be expressed in a more objective way than the coverage of the analysis is the closest what can show the impact of the modeling. Why?
> Because the naive loop handling (described above) can easily lead to unsimulated/unchecked code lines. A small example for that:
>
> int f(){
> int a[6];
> for(int i = 0; i < 6; i++)
> a[i] = i+2;
> //complex body
> }
>
> In this example the analysis will be stopped at the 4th step of the loop and it would not be continued after the loop. So this cause a lot of coverage loss in this case. For this problem I experimented with identifying specific known bound loops which would be worth to completely unroll. (eg. the above loop in f())
>
> At the moment a loop has to fulfill the following requirements to be worth unrolled:
> - Currently only forStmts can be considered.
> - The bound has to be an integer literal. (eg. i < 5, i >= MACRO_TO_LITERAL)
> - The counter variable (i) has not escaped before/in the body of the loop and
> changed only in the increment statement corresponding to the loop. It also
> has to be initialized by a literal in the corresponding initStmt.
> - Does not contain goto, switch and returnStmt.
> These version is marked as Unroll in the statistics.
>
> In addition to that I run the measurements with a twist. So whenever an unrolled loop is creates new branches then we consider it as a normal (non-unroll) loop in the further analysis. It is done because of the exponential behaviour of the ExplodedGraph (symbolic execution) it could create a lot of new paths which will be analysed but has only a few difference to the one step ahead analyzed path. This is marked as Unroll_v2 in the statistics.
>
> Time measurements: (avg of 5 run)
>
> curl libpng vim bitcoin ffmpeg xerces
> Normal 50.168 62.128 180.404 197.385 501.14 227.6075
> Unroll 54.562 64.186 205.468 1767.475 1127.076 248.5375
> Unroll v2 52.892 63.866 203.476 198.025 693.486 237.87
>
> Coverage measurements (% of reachable basic blocks statistics)
>
> curl libpng vim bitcoin ffmpeg xerces
> Normal 58.05 55.91 51.13 68.58 49.78 74.86
> Unroll 69.1 56.04 51.41 68.7 62.18 74.79
> Unroll v2 69.19 56.04 51.61 68.83 52.41 74.8
>
> Summary table which contains the deviation (percentage) to the normal run.
>
> curl libpng vim bitcoin ffmpeg xerces
> Unroll – time 8.76 3.31 13.89 795.445 124.9 9.19
> Unroll – coverage 11.05 0.13 0.28 0.12 12.4 -0.07
> Unroll v2 - time 5.43 2.8 12.79 0.324 38.38 4.5
> Unroll v2 – cov 11.14 0.13 0.48 0.25 2.63 -0.06
> The most outstanding (negative) difference is that the analysis time of project bitcoin increased and took 8x more time than in the normal case. That was single handedly caused by one benchmark file which contains a lot of for loop which makes ~10,000,000 step.
> (An improvement can be to not allow unrolling loops which takes N or more steps where N could be a well chosen number - I am running some measurements at the moment with the version N = 128).
>
>
> The number of findings:
>
> curl libpng vim bitcoin ffmpeg xerces
> Normal 35 32 81 10 375 52
> Unroll 27 33 81 10 367 48
> Unroll v2 27 32 81 10 363 52
>
> Most of the time it hasn't changed the founded bugs. On FFmpeg there are new finding which come from the fact that we are able to unroll loops (and analyze the codelines after them). However the unrolling resulted some loss of the findings as well. Most of the time (curl, ffmpeg) these loss was because these bugs were founded in complex functions on long paths which analysis exhausted the maximum limit of the nodes in the ExplodedGraph even in the normal analysis. It happened "faster" with the unroll feature since we so these paths were not analyzed thus the bugs were not found.
>
> In conclusion the unrolling of the above defined specific loops can have a positive impact on the coverage (% of reachable basic blocks), however, it comes with the price that it affects the running time. In most projects (all except curl) the time increase % was higher than the coverage increase %.
>
> So all in all, I think it is a feature which can be useful for a more thoroughgoing analysis but should not be used by default at the time.
>
> What do you think?
>
> Cheers,
> Peter
>
> Note: if you are interested in the detailed results they are available on the following site: http://cc.elte.hu/szepet/loop_modeling/unrolling/ <http://cc.elte.hu/szepet/loop_modeling/unrolling/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170828/70a74a75/attachment.html>
More information about the cfe-dev
mailing list