[PATCH] D32592: [Analyzer] Iterator Checker - Part1: Minimal Checker for a Simple Test Case
Artem Dergachev via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 4 03:18:12 PDT 2017
NoQ added a comment.
I see the checker going towards what we call "metadata symbols", which seems logical.
I still feel the checker deserves a lot more comments around the code. In particular, i'd definitely love to see a human-readable explanation of what kinds of iterators are supported (from plain pointers or integers to opaque structures): you did great job abstracting away these differences in the high-level logic, but i'd like to demonstrate how does the low-level logic (eg. offsets and comparisons, i.e. these moments when you're operating on various `SVal` objects directly) works in all these cases, because this part of the checker requires the most ingenuity to handle, while the rest is more or less straightforward.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:211-214
+std::pair<SymbolRef, llvm::APSInt>
+createDifference(SymbolManager &SymMgr, SymbolRef Sym1, SymbolRef Sym2);
+const llvm::APSInt *getDifference(ProgramStateRef State, SymbolRef Sym1,
+ SymbolRef Sym2);
----------------
These functions are currently unused; my compiler warns about that, and buildbots would probably yell at us when we commit it, so i guess it's better to add them when they are actually used; they'd also be tested as well.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:219
+ OutOfRangeBugType.reset(
+ new BugType(this, "Iterator of out Range", "Misuse of STL APIs"));
+ OutOfRangeBugType->setSuppressOnSink(true);
----------------
Before we forget: Range -> range. As we've noticed recently in D32702 :), we don't capitalize every word in bug types and categories.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:294-297
+ // Assumption: if return value is an iterator which is not yet bound to a
+ // container, then look for the first iterator argument, and
+ // bind the return value to the same container. This approach
+ // works for STL algorithms.
----------------
I guess this deserves a test case (we could split this out as a separate feature as well).
I'm also afraid that we can encounter false positives on functions that are not STL algorithms. I suggest doing this by default only for STL functions (or maybe for other specific classes of functions for which we know it works this way) and do this for other functions under a checker option (i.e. something like `-analyzer-config IteratorChecker:AggressiveAssumptions=true`, similar to `MallocChecker`'s "Optimistic" option).
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:323-324
+
+void IteratorChecker::checkLiveSymbols(ProgramStateRef State,
+ SymbolReaper &SR) const {
+ // Keep symbolic expressions of iterator positions, container begins and ends
----------------
This callback is currently untested as well. I'm doing this bit of manual "mutation testing" by removing pieces of code and re-running tests because not only we'd rather keep every commit self-sufficient, but also we'd rather make sure we didn't forget anything, which is much easier to do patch-by-patch.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:478-479
+ auto &SymMgr = C.getSymbolManager();
+ EndSym = SymMgr.conjureSymbol(CE, C.getLocationContext(),
+ C.getASTContext().LongTy, C.blockCount());
+ State = createContainerEnd(State, ContReg, EndSym);
----------------
I see what you did here! And i suggest considering `SymbolMetadata` here instead of `SymbolConjured`, because it was designed for this purpose: the checker for some reason knows there's a special property of an object he wants to track, the checker doesn't have a ready-made symbolic value to represent this property (because the engine didn't know this property even exists, so it didn't denote its value), so the checker comes up with its own notation. `SymbolMetadata` is used, for example, in `CStringChecker` to denote an unknown string length for a given null-terminated string region.
This symbol carries the connection to the region (you may use it to reverse-lookup the region if necessary), and in particular it is considered to be live as long as the base region is live (so you don't have to deal with liveness manually; see also comments around `SymbolReaper::MarkInUse`). It's also easier to debug, because we can track the symbol back to the checker. Generally, symbol class hierarchy is cool because we know a lot about the symbol by just looking at it, and `SymbolConjured` is a sad fallback we use when we didn't care or manage to come up with a better symbol.
I'm not sure if `LongTy` is superior to `IntTy` here, since we don't know what to expect from the container anyway.
Also, please de-duplicate the symbol creation code. Birth of a symbol is something so rare it deserves a drum roll :)
I'd take a pause to figure out if the same logic should be applied to the map from containers to end-iterators.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:513
+
+bool isLess(ProgramStateRef State, SymbolRef Sym1, SymbolRef Sym2);
+bool isGreaterOrEqual(ProgramStateRef State, SymbolRef Sym1, SymbolRef Sym2);
----------------
One more unused function.
================
Comment at: lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:515-516
+bool isGreaterOrEqual(ProgramStateRef State, SymbolRef Sym1, SymbolRef Sym2);
+bool compareToZero(ProgramStateRef State, const NonLoc &Val,
+ BinaryOperator::Opcode Opc);
+bool compare(ProgramStateRef State, SymbolRef Sym1, SymbolRef Sym2,
----------------
One more unused function.
https://reviews.llvm.org/D32592
More information about the cfe-commits
mailing list