[PATCH] D32592: [Analyzer] Iterator Checker - Part 1: Minimal Checker for a Simple Test Case

Tue May 9 06:32:52 PDT 2017

baloghadamsoftware added a comment.

In https://reviews.llvm.org/D32592#747132, @NoQ wrote:

> Then, in methods that deal with iterator `SVal`s directly, i wish we had hints explaining what's going on in these ~7 cases. In my opinion, that'd greatly help people understand the code later, and it'd help us understand how to avoid this variety and provide checker authors with a better API as soon as we get to this, so it's the biggest concern for me about this checker.

I am not sure which method do you mean? I think here the crucial functions are the setIteratorPosition() and the getIteratorPosition() functions which provide a common way to handle all these SVals.

================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:166-167
+
+static llvm::APSInt Zero = llvm::APSInt::get(0);
+static llvm::APSInt One = llvm::APSInt::get(1);
+
----------------
NoQ wrote:
> I've a bit of doubt about those. Would they call their constructors every time clang starts, regardless of whether the analyzer or the checker is enabled? Maybe having them as private variables inside the checker class would be better?
> 
> As in http://llvm.org/docs/CodingStandards.html#do-not-use-static-constructors
Not needed in this patch. I delete them from here.

================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:219
+  OutOfRangeBugType.reset(
+      new BugType(this, "Iterator of out Range", "Misuse of STL APIs"));
+  OutOfRangeBugType->setSuppressOnSink(true);
----------------
baloghadamsoftware wrote:
> NoQ wrote:
> > Before we forget: Range -> range. As we've noticed recently in D32702 :), we don't capitalize every word in bug types and categories.
> Agree.
Also "out of" instead of "of out" :-)

================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:294-297
+    // Assumption: if return value is an iterator which is not yet bound to a
+    //             container, then look for the first iterator argument, and
+    //             bind the return value to the same container. This approach
+    //             works for STL algorithms.
----------------
NoQ wrote:
> baloghadamsoftware wrote:
> > NoQ wrote:
> > > I guess this deserves a test case (we could split this out as a separate feature as well).
> > > 
> > > I'm also afraid that we can encounter false positives on functions that are not STL algorithms. I suggest doing this by default only for STL functions (or maybe for other specific classes of functions for which we know it works this way) and do this for other functions under a checker option (i.e. something like `-analyzer-config IteratorChecker:AggressiveAssumptions=true`, similar to `MallocChecker`'s "Optimistic" option).
> > I will check whether this piece of code could be moved in a later part of the checker. However, I suggest to first wait for the first false positives before we introduce such an option. This far the false positives in my initial tests had different reasons, not this one.
> Unfortunately, we've had a poor experience with this approach in other checkers. You never know, and it seems that it's always better to have a safe fallback mode available under a flag, because if a few classes of false positives are found, and we are forced to reduce the checker to a safer behavior, it'd be hard to remember all the places where unsafe heuristics were used.
I think this heuristic is well marked by the comment, easy to find if it causes false positives. When I started working on Clang (Tidy first) reviewers discouraged me to add options before experiencing false positives.

================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:323-324
+
+void IteratorChecker::checkLiveSymbols(ProgramStateRef State,
+                                       SymbolReaper &SR) const {
+  // Keep symbolic expressions of iterator positions, container begins and ends
----------------
NoQ wrote:
> This callback is currently untested as well. I'm doing this bit of manual "mutation testing" by removing pieces of code and re-running tests because not only we'd rather keep every commit self-sufficient, but also we'd rather make sure we didn't forget anything, which is much easier to do patch-by-patch.
OK, removed from the first patch.

================
Comment at: test/Analysis/iterator-range.cpp:1-2
+// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorRange -analyzer-eagerly-assume -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorRange -analyzer-eagerly-assume -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+
----------------
NoQ wrote:
> Could we add run-lines without `-analyzer-eagerly-assume`? Currently all variants are passing, but if new tests will fail, we could `#ifndef` them out.
In my first checker Anna suggested to always use this option. She also wrote that she plans to remove possibility to execute the Analyzer without eagerly assume.

https://reviews.llvm.org/D32592