[PATCH] D12358: [Analyzer] Handling constant bound loops

Thu Aug 27 10:15:26 PDT 2015

> On Aug 27, 2015, at 8:57 AM, Sean Eveson <eveson.sean at gmail.com> wrote:
> 
> I accept that my current patch is not a comprehensive solution to the problem and that it may introduce false positives, however I do think it is an improvement, where it is preferable to have false positives over doing no analysis after the loop.

Hi Sean,

I'll take another closer look at your patch tonight and come back with more feedback. I completely understand your eagerness to push this forward.  I don't think that anybody disagrees with you that we want to do analysis of more code, especially the code that's currently not being analyzed all because of our current handling of loops. That said, if the solution is not precise enough it may cause a flurry of false positives, which then renders the efficacy of the tool impotent.  It doesn't matter if the tool has more coverage if it produces a high-volume of false positives.  I'm not saying that your patch will result in that happening all the time, but we should not just land the patch without understanding its implications.  I also think that with some relatively small enhancements most of our concerns can be addressed.

Because of the way the analyzer works, false positives often happen because of "correlated conditions" that are improperly truck by the analyzer.  Essentially, if an event that happened earlier in a path would have had an anti-correlation with something happening later in the path then reporting an issue along that path related to those two conditions would be a false positive.  We go to great lengths in the analyzer to handle correlated conditions; that is why it is a path sensitive analysis, which is very expensive.  When we drop information on the floor, we lose precision and for some kinds of code can greatly increase the chance of producing false positives.

We do drop information on the floor all the time, but when we do we try to make those decisions quite deliberate and understand their implications.  Some checkers also are written with the mindset that if there is insufficient information to report an issue with high confidence than no issue is reported at all.  But the problem here by not invalidating values possibly touched by the loop is that there is a incorrect perception that the information is precise when indeed it is not.

Thanks again for pushing on this; handling loops better is something I have wanted us to tackle for a very long time but never found the time to do it.  I'll circle back with you tonight with more feedback.

Cheers,
Ted