[cfe-dev] Obtain symbolic values of some variables backward from some path-sensitive callback

Tue Oct 29 06:33:29 PDT 2019

Hi!

This is a very interesting problem, and I'm excited to see that there is
real desire for such a feature.

On Thu, 3 Oct 2019 at 06:15, Алексеев Кирилл via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

>
>
> I need to find loop at first and then to find all definitions of variables
> in loop control expression, and definitions of some related variables used
> in rhs of previous definitions.
> For example program:
>
> foo2(int start, int end){
>   for (int i=start; i<end; i++){
>    …//memory copy loop
>   }
> }
>
> foo1(int b)
>  if(…){
>    c = 5;
>  }
> else c = 10;
>  a = b + c;
>  foo2(a, b)
> }
>
> I need to find loop at first and then (backward) definitions:
> a) start = foo2 param (a)
> b) a = b + c
> c) b = …
> d1) c = 5
> d2) c = 10
>
>  I.e. is it exist a pointer to previous node(s) of current node of
> ExplodedGraph?
>

Sure, but mind that nodes in the ExplodedGraph may have several
predecessors. After the analysis is concluded (and the entire graph is
built), the analyzer will create linear bugpaths from the ExplodedGraph,
which describe a specific path from the root to a bug node. At this stage,
nodes in the bugpath only have a single predecessor (unless its the root),
but the bugpath isn't available to the checkers, only visitors and
BugReporter itself.

> Also is it possible to find with Clang SA checkers' API input data that
> allow to reach some basic block? I want to find constraints on input data
> that allow to reach particular definitions of variables. Yes, it is over
> all paths, but not using simple meet operator.
>

To me it seems like the algorithm known as "Reaching definitions" may be
the answer to your problem. The drawback is that we have not implemented
this algorithm for C++ (but its in the works!
https://reviews.llvm.org/D64991). Once it is complete however, it still
won't be interprocedural, so arguing across function boundaries will
require some trickery.

The class described in this patch will calculate reaching definitions to a
variable within a function, and after some discussion on the mailing list
we think that it would be possible to combine the information gathered by
the analyzer (most importantly, the resolution of parameter passing) to
make it semi-interprocedural.

The answer to your question is "no", but its not that far from being a part
of the API.

> So it seemed for me that ExplodedGraph is the best choice.
>

It is very difficult to argue across the entire ExplodedGraph. The thing
that is most challanging to overcome is proving that one path of execution
is related to another.

Suppose you have to analyze the following program:

void f(int a, int b) {
  if (coin())
    blackmagic(&a, &b);

  for (int i = a; i < b; ++b) { ... }
}

Suppose that the function blackmagic() halts symbolic execution somewhere
due to the analyzer's budget depleting or containing code that the analyzer
can't model (like inline assembly). There is no guarantee that if coin()
returns true, symbolic execution would ever reach the loop, and even if it
did, it might just be the case that we don't know the internals of
blackmagic().

That said, it would amazing if we had the infrastructure to traverse and
argue about the entirety of the ExplodedGraph, but I fear its far in the
distance for now.

I hope this helped, please follow up if you still have questions!

Cheers,
Husi

>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20191029/18ce14ab/attachment.html>