[PATCH] D28771: [Analyzer] Various fixes for the IteratorPastEnd checker

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Feb 23 08:16:29 PST 2017


NoQ added inline comments.


================
Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:530
+  auto value = RVal;
+  if (auto loc = value.getAs<Loc>()) {
+    value = State->getRawSVal(*loc);
----------------
baloghadamsoftware wrote:
> NoQ wrote:
> > baloghadamsoftware wrote:
> > > NoQ wrote:
> > > > baloghadamsoftware wrote:
> > > > > NoQ wrote:
> > > > > > Is there a test case for this hack?
> > > > > > 
> > > > > > I'd also consider inspecting the AST (probably before passing the values to `handleRandomIncrOrDecr()`) and making the decision based on that. Because even though this pattern ("if a value is a loc and we expect a nonloc, do an extra dereference") is present in many places in the analyzer, in most of these places it doesn't work correctly (what if we try to discriminate between `int*` and `int*&`?).
> > > > > I just want to get the sign of the integer value (if it is available). It turned out that I cannot do comparison between loc and nonloc. (Strange, because I can do anything else). After I created this hack, the Analyzer did not crash anymore on the llvm/clang code.
> > > > > 
> > > > > I do not fully understand what I should fix here and how? In this particular place we expect some integer, thus no int* or int*&.
> > > > Loc value, essentially, *is* a pointer or reference value. If you're getting a Loc, then your expectations of an integer are not met in the actual code. In this case you *want* to know why they are not met, otherwise you may avoid the crash, but do incorrect things and run into false positives. So i'd rather have this investigated carefully.
> > > > 
> > > > You say that you are crashing otherwise - and then it should be trivial for you to attach a debugger and `dump()` the expression for which you expect to take the integer value, and see why it suddenly has a pointer type in a particular case. From that you'd easily see what to do.
> > > > 
> > > > Also, crashes are often easy to auto-reduce using tools like `creduce`. Unlike false positives, which may turn into true positives during reduction.
> > > > 
> > > > If you still don't see the reason why your workaround is necessary and what exactly it does, could you attach a preprocessed file and an analyzer runline for the crash, so that we could have a look together?
> > > Just to be clear: I know why it crashes without the hack: I simply cannot compare loc and nonloc. Since concrete 0 is nonloc I need another nonloc. I suppose this happens if an integer reference is passed to the operator +, +=, - or -=. So I thought that dereferencing it by getting the raw SVal is the correct thing to do.
> > Yep, in this case the correct thing to do would be to check AST types rather than SVal types. Eg.,
> > ```
> > if (Arg->getType()->isReferenceType())
> >  value = State->getRawSVal(*loc);
> > ```
> > 
> > (you might need to do it in the caller function, which still has access to the expressions)
> > 
> > It is better this way because expectations are explicitly stated, and the assertion would still catch the situation when expectations are not met.
> > 
> > Also, please still add a test case to cover this branch :)
> I tried it and failed in std::vector::back(). It seems that the problem is not the reference, but loc::ConcreteInt. I added a test case, but in our mocked vector the integer 1 in *(end()-1) is nonloc::ConcreteInt, but in the real one it is loc::ConcreteInt. I do not see why is there a difference, neither do I know how could something be a location and a concrete integer at once. What is loc::ConcreteInt and what to do with it?
> What is loc::ConcreteInt and what to do with it?

It is a concrete memory address. The null pointer, for example, or maybe a fixed magic pointer in some embedded driver code.

Could you post an AST dump for the real `(end()-1)`on which you are failing? It might be that we end up looking at the other `operator-()` as in `(end() - begin())`, while iterators are implemented as pointers; no idea how that could be, but i'm suspecting something like that.


https://reviews.llvm.org/D28771





More information about the cfe-commits mailing list