[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

Artem Dergachev via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 1 13:07:10 PDT 2016


NoQ added a comment.

I think i managed to understand the reasoning behind your solutions! Right now i definitely approve all the high-level logic apart from the handling of left/right `SVal`s for `evalAssume`, which i think could be easily improved upon without significant drawbacks. See the inline comment.

As for inlining STL headers - ouch. It was supposed to be working (i.e. never inlining), it'd probably be great to know why it gets inlined. STL headers are confusing much more often than helpful to the analyzer in most cases. That said, if we're going to ever revert this decision, i think it's great to have more stuff already working, so i'd not worry about that. If moving stuff to a header defeats the purpose of some of your tests (eg. tests that specifically test what happens if the function is inlined), then probably it'd be a good idea to duplicate the tests, eg:

  // RUN: ... -DUSE_HEADER=0 ...
  // RUN: ... -DUSE_HEADER=1 ...
  
  #if USE_HEADER
  #include "Inputs/..."
  #else
  // Paste header here.
  #endif



================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:195
+    auto Param = State->getLValue(P, LCtx);
+    auto Arg = State->getSVal(CE->getArg(idx++), LCtx->getParent());
+    const auto *Pos = getIteratorPosition(State, Arg);
----------------
baloghadamsoftware wrote:
> NoQ wrote:
> > I think this trick needs more comments/explaining. It is very unusual. Are you trying to model effects of passing an iterator by value into a function? What part of these effects are not modeled magically by the core?
> If I pass an iterator by value (the most usual case) I have to assign its position (in or out of range) to the formal parameter from the actual one.
Had a look. So, essentially, the core copies argument values to parameter regions in `enterStackFrame()` without ever notifying checkers about it in any way. Okaay. Yep, let's stick to that for now, as i've no better approach in mind.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext &C,
+                                              const SVal &LVal,
----------------
a.sidorin wrote:
> What will happen if we write something like this:
> ```
> bool Eq1 = it1 == it2;
> bool Eq2 = it3 == it4;
> if (Eq1) {...}?
> ```
> 
> As I understand, we'll assume the second condition instead of first.
Had a look. So the problem is, we obtain the result of the comparison as a symbol, from which it is too hard to recover the operands in order to move iterator position data from one value to another.

Normally we obtain a simple SymbolConjured for the return value of the `operator==()` call (or, similarly, `operator!=()`). For plain-value iterators (eg. `typedef T *iterator`) we might be obtaining an actual binary symbolic expression, but even then it's not quite clear how to obtain operands (the structure of the expression might have changed due to algebraic simplifications). Additionally, LHS and RHS aren't necessarily symbols (they might be semi-concrete), so composing symbolic expressions from them in general is problematic with our symbol hierarchy, which is rarely a problem for numbers but for structural symbols it'd be a mess.

For now i suggest, instead of storing only the last LHS and RHS, to save a map from symbols (which are results of comparisons) to (LHS value, RHS value) pairs. This map should, apart from the obvious, be cleaned up whenever one of the iterators in the pair gets mutated (incremented or decremented). This should take care of the problem Alexey points out, and should work with semi-concrete stuff.

For the future i suggest to let users construct their own symbols and symbolic expressions more easily. In fact, if only we had all iterators as regions, we should have probably used SymbolMetadata for this purpose: it's easy to both recover the parent region from it and use it in symbolic expressions. We could also deprecate the confusing structural symbols (provide default-bound lazy compound values for conjured structures instead), and then it'd be possible to transition to SymbolMetadata entirely.


https://reviews.llvm.org/D25660





More information about the cfe-commits mailing list