[cfe-dev] Some problems about function summary

Sun May 1 05:17:03 PDT 2011

Hi,

Let me just suggest this paper to you:
Precise Interprocedural Analysis in the Presence of Pointers to the Stack
Pascal Sotin and Bertrand Jeannet  (INRIA)
http://pop-art.inrialpes.fr/people/sotin/papers/SotinJeannet2011-ExactPointerAnalysis.pdf

The classic Reps (POPL'01?) and SLAM (polymorphic predicates) papers may 
also give you some ideas.

Nuno

----- Original Message -----
> Hi, Ted,
>
> I am trying to extend clang static analyzer to support inter-procedural
> summary-based analysis. I have implemented a simple method for detecting
> inter-procedural memory leaks which still has some problems preventing
> doing a precise or even correct analysis.The main idea is recording the
> side effects of each function, called SE for short. The structure of SE
> is maps from statements to SVals.
> For example,
> void f(int *x) {
>  x[0] = 1;
> }
> void g() {
>  int a = 0;
>  f(&a);
> }
> The SE of function f is stmt(x[0])->ConcreteInt(1) (Here, x[0] must meet
> a condition that the base region of stmt must be a symbolic Region
> accessed from formal argument.).
>
> When analyzing function g, we binding the actual argument &a to the
> formal argument x(call the method EnterStackFrame() in GRState class.)
> and then iterator all of SEs in function f to evaluate the side
> effects(Use method ExprEngine::Visit(x[0]) to get the actual SVal or
> memory region of x[0] and assign it the Value Concrete(1)).
>
> There are many problems in this method.
> 1. If the index of x is a local variable ,the analysis is incorrect.For
> instance, void f(int *x) { int i = 0; x[i] = 1; }.The actual memory
> region of x[i] is unknown.Because the value i in function f is missed in
> current summary.
> 2. If there exists assignments between arguments, the analysis is also
> incorrect. For instance, void f(int *x, int *y){ x = y; x[0] = 1; }.
> ...
> ...
>
> Obviously, the summary info is insufficient.So I try to explore more
> summary info by collecting each function's access paths(AP).
> Access paths are symbolic names for memory locations accessed in a
> procedural.The structure of access paths is maps from statements to
> memory region. We use access paths to handle the transfer from the
> called function's statements with formal arguments to the caller's
> actual statements. For example,
> void f(int *x, int *y) {
>  x = y;
>  x[0] = 1;
> }
> void g(int *a, int *b) {
>  f(a, b);
> }
> The access paths of function f are stmt(x) -> symbolicRegion($x),
> stmt(y)->symbolicRegion($y).In order to use access paths, the structure
> of SE is adopted as maps from memory region to SVal.In this figure , f's
> SE is ElemRegion{SymbolicRegion($y), 0} -> ConcreteInt(1).
>
> While applying the summary info at the function's call site, we should
> translate from the called function's memory region in SE to actual
> memory region.ElemRegion{SymbolicRegion($y), 0} is translated as
> ElemRegion{SymbolicRegion($a), 0},and then assign ConcreteInt(1) to
> ElemRegion{SymbolicRegion($a), 0}.
>
> This method is difficult or even infeasible to implement.Because the
> translation is not always reasonable and the state data such as memory
> region during intra-procedural analysis is not released which causes a
> large memory consuming.
>
> So we should define a data structure for summary which is independent of
> GRState(or memory region).The most difficult thing is how to obtain the
> actual value from the called function's analysis data.Can you give me
> some suggestions about the design of function summary?Hope for your
> reply!
>
>
>
> -- 
> Zhenbo Xu