[PATCH] D53701: [Analyzer] Record and process comparison of symbols instead of iterator positions in interator checkers

Tue Nov 6 06:31:30 PST 2018

baloghadamsoftware added a comment.

In https://reviews.llvm.org/D53701#1288258, @NoQ wrote:

> Mmm, is it possible to detect adapters and inline them as an exception from the rule? You can foresee a pretty complicated system of rules and exceptions if we go down this path, but i believe that it is still much easier and more reliable than the system that tries to synchronize two different models of iterators, so i really encourage you to at least give it a try somehow.

I am almost sure it is implossible. It is not only about iterator-adapters in the classical sense but also any iterator that internally deals with other iterators.

For example I created an iterator in the past that iterates over the elements of a list of lists. This iterator contained two iterators, one for the outer and one for the inner list. In this case determining whether two iterators are the same is non-trivial: for all in-range iterators both the outer and the inner iterators must be the same. However if the outer iterator is the past-end iterator of the outer list then the inner iterator is irrelevant. Thus the comparison operator of such iterator must check first whether any of the outer iterators is the past-end iterator and only compare the inner iterator none of them is. If both outer iterators are the past-end iterator then they are equal regardless of the inner iterators.

Another example could be an iterator that somehow merges two lists internally using two different iterators. In this case only one of the inner iterators is relevant when comparing two merging iterators.

So by dropping the inlining we lose some intrinsic knowledge of the analyzed code which leads the checker to wrong assumptions.

>> I  wonder whether the current "mixed" approach introduces additional paths because we do not do explicit state-splits and function `processComparison()` removes contradicting branches.
> 
> I believe that the effect of eager state splits is going to be roughly similar to the actual -analyzer-eagerly-assume:
> 
> 1. Split paths earlier than it is absolutely necessary, which may slow down the analysis and duplicate some work, but most of the time it'll be just a few nodes before the actual check in the code would have caused a split anyway.
> 2. Simplify constraint solving on each of the new paths, which will indeed help with refuting some invalid paths, especially those in which new constraints over the symbols are introduced after the check, but that's due to pecularities of our constraint solver, i.e. rather accidentally than intentionally.

When I first started working on Clang Static Analyzer Anna told be the `-analyzer-eagerly-assume` should be the default. In the iterator checkers the refutation happens intentionally in the functions `relateSymbols()` and `processComparison()`.

Repository:
  rC Clang

https://reviews.llvm.org/D53701