[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

Artem Dergachev via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 18 01:57:44 PDT 2016


NoQ added a subscriber: a.sidorin.
NoQ added a comment.

Wow, you managed to check something that could be checked without going through a hell of modeling dozens of STL methods, and probably even without stepping on poor C++ temporary object modeling in the analyzer, which sounds great.

These comments are incomplete because i didn't yet take my time to understand how your program state traits work; hope to come back to this a bit later.

Adding Alexey because he's fond of iterators.



================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:115
+
+typedef std::pair<const MemRegion *, SymbolRef> RegionOrSymbol;
+
----------------
Maybe `llvm::PointerUnion`?


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:175
+  const auto *LCtx = C.getLocationContext();
+  if (LCtx->getKind() != LocationContext::StackFrame)
+    return;
----------------
I think functions should always begin with a stack frame context, not sure, does this ever get violated? Do we have `checkBeginBlock`? Sorry if i'm wrong.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:183
+  const auto *Site =
+      static_cast<const StackFrameContext *>(LCtx)->getCallSite();
+  if (!Site)
----------------
LLVM `cast<>` should be used, because it asserts cast correctness through LLVM's custom RTTI (and `LocationContext` child classes do support that).


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:195
+    auto Param = State->getLValue(P, LCtx);
+    auto Arg = State->getSVal(CE->getArg(idx++), LCtx->getParent());
+    const auto *Pos = getIteratorPosition(State, Arg);
----------------
I think this trick needs more comments/explaining. It is very unusual. Are you trying to model effects of passing an iterator by value into a function? What part of these effects are not modeled magically by the core?


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:358
+
+bool IteratorPastEndChecker::evalCall(const CallExpr *CE,
+                                      CheckerContext &C) const {
----------------
So the thing about `evalCall` is that every call can only be eval'ed by only one checker. So if you're doing this, you should be sure that your checker is modelling //all// effects of the call on //everything// in the program state, //manually//, and any checker that relies on that modelling should make sure that your checker is turned on.

Because the functions you are modelling are pure, i think it's, in general, a good idea to `evalCall()` them. Other checkers should be able to rely on PreCall/PostCall events to model their state changes.

So the question is, in what checker do we want this modelling to happen. Because your checker is looking for very specific errors, it might be a good idea to eventually split it into a separate checker. I think, at least, a FIXME for this task should be left around. I'm also currently tackling with a single checker to model all standard library functions (D20811), maybe i'd come up with a way to merge it there.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:443
+  if (Pos && Pos->isOutofRange()) {
+    auto *N = C.generateNonFatalErrorNode(State);
+    if (!N)
----------------
Accessing end() is a UB, we should probably generate a fatal error node here.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:446
+      return;
+    reportPastEndBug("Iterator possibly accessed past its end.", Val, C, N);
+  }
----------------
I think path-sensitive checkers should present their findings proudly. After all, they did their best to find a single execution path on which the problem //certainly// manifests.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:521
+  auto RetVal = svalBuilder.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount());
+  auto SecondParam = state->getSVal(CE->getArg(1), C.getLocationContext());
+
----------------
Number of arguments of `CE` should be checked beforehand. Yes, it is UB to modify namespace `std::` to introduce functions with same names but less arguments, but we still should not crash when we see such code.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:568
+  // We also should check for copy assignment operator, but for some reason
+  // it is not provided implicitly in Clang Static Analyzer
+  for (const auto *M : CRD->methods()) {
----------------
It's not analyzer's fault :) We're inspecting the AST here.

Anyway, does `CXXRecordDecl::needsImplicitCopyAssignment()` look useful?


================
Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template <typename T, typename Ptr, typename Ref> struct __iterator {
+  typedef __iterator<T, T *, T &> iterator;
----------------
We should probably separate this into an #include-able header in `test/Analysis/Inputs/`.

Also, there's always a bit of concern that it wasn't copy-pasted from a standard library implementation with an incompatible license such as (L)GPL. Which often happens when you do your best to emulate the normal way of defining things as closely as possible.


https://reviews.llvm.org/D25660





More information about the cfe-commits mailing list