[cfe-dev] MemRegions - how to (re)use them right?

Wed Sep 30 18:27:22 PDT 2009

Hi Olaf,

The short answer is "yes this can be done," and others have voiced  
interest in doing taint analysis.

At a high level, do you want to do basic flow-sensitive dataflow
analysis or path-sensitive analysis?

MemRegions are used by the path-sensitive dataflow engine  
(GRExprEngine) to do a lazy abstraction of memory.  They are used in  
combination with SVals (see SVals.h) to reason about the values of  
expressions along a given path.  They aren't used just for  
diagnostics, but are used to precisely reason about memory and memory  
bindings (e.g., what value binds to a variable).
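
To make the idea of regions and bindings concrete, here is a minimal toy
sketch.  It does not use the analyzer's API: ToyStore, Region, direct and
through are hypothetical names invented for this illustration, and values
are plain strings rather than SVals.  It only shows why naming memory by
region lets `temp.b` and `ptr->b` from the example quoted below resolve to
the same binding.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model only, not Clang's MemRegion hierarchy.
// A region is (variable, field).
using Region = std::pair<std::string, std::string>;

struct ToyStore {
  std::map<Region, std::string> bindings;      // region -> symbolic value
  std::map<std::string, std::string> pointsTo; // pointer variable -> pointee variable

  // Region named by `var.field`.
  Region direct(const std::string &var, const std::string &field) const {
    return {var, field};
  }

  // Region named by `ptr->field`; the pointee of `ptr` must be known.
  Region through(const std::string &ptr, const std::string &field) const {
    auto it = pointsTo.find(ptr);
    assert(it != pointsTo.end() && "pointer has no known pointee");
    return {it->second, field};
  }
};
```

With `pointsTo["ptr"] = "temp"`, both `direct("temp", "b")` and
`through("ptr", "b")` name the same region, so a lookup through either
spelling finds whatever value was bound to `temp.b`.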

If you want path-sensitive analysis, one way to do this is to walk the
ExplodedGraph produced by GRExprEngine.  The
ExplodedGraph represents the possible paths traced within a function,  
and each node consists of a program location (e.g., an expression) and  
a program state.  Specific ExplodedNodes have locations that represent  
loads/stores.  You could "simply" walk the graph (using DFS) looking  
for loads, and then trace where individual expression values are  
passed to stores.  Each store consists of the location where the value
will be stored and the value to be stored.  After a store, you keep traversing
the ExplodedGraph until you see a load from the same location, then  
trace that value, etc.  There are many details to get this right, but  
it could be done, and you wouldn't need to modify the analyzer at all.
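
The walk described above can be sketched on a toy graph.  This is only an
illustration under stated assumptions: Node, Kind, Tracked, walk,
storesReachedBy and exampleGraph are hypothetical names, not ExplodedGraph
or ExplodedNode, and locations and expressions are plain strings.  The
sketch carries a per-path set of tracked expressions and locations, marks a
location as tracked when a tracked expression is stored to it, and marks a
load's result as tracked when it reads a tracked location.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration; none of these are Clang types.
enum class Kind { Other, Load, Store };

struct Node {
  Kind kind;
  std::string location;   // memory location read or written
  std::string expr;       // Store: expression stored; Load: expression receiving the value
  std::vector<int> succs; // successor node indices
};

// Per-path tracking state: (tracked expressions, tracked locations).
using Tracked = std::pair<std::set<std::string>, std::set<std::string>>;

void walk(const std::vector<Node> &g, int n, Tracked t,
          std::set<std::pair<int, Tracked>> &visited, std::set<int> &hits) {
  assert(n >= 0 && n < static_cast<int>(g.size()));
  if (!visited.emplace(n, t).second)
    return; // this node was already explored with this state
  const Node &node = g[n];
  if (node.kind == Kind::Store) {
    if (t.first.count(node.expr)) {
      t.second.insert(node.location); // tracked value now lives here
      hits.insert(n);
    } else {
      t.second.erase(node.location); // overwritten by an untracked value
    }
  } else if (node.kind == Kind::Load && t.second.count(node.location)) {
    t.first.insert(node.expr); // the loaded value is the tracked one
  }
  for (int s : node.succs)
    walk(g, s, t, visited, hits);
}

// Indices of the Store nodes that write a value derived from `expr`.
std::set<int> storesReachedBy(const std::vector<Node> &g, int start,
                              const std::string &expr) {
  std::set<std::pair<int, Tracked>> visited;
  std::set<int> hits;
  Tracked t;
  t.first.insert(expr);
  walk(g, start, t, visited, hits);
  return hits;
}

// Straight-line model of foo() from the quoted message; `ptr->b` is
// assumed to have been resolved to the location "temp.b" already.
std::vector<Node> exampleGraph() {
  return {
      {Kind::Store, "temp.a", "expr", {1}},
      {Kind::Store, "temp.b", "another_expr", {2}},
      {Kind::Load, "temp.a", "load1", {3}},
      {Kind::Store, "*sink", "load1", {4}}, // (1)
      {Kind::Load, "temp.b", "load2", {5}},
      {Kind::Store, "*sink", "load2", {}},  // (2)
  };
}
```

On this toy graph, tracing "expr" from node 0 reports the stores at nodes 0
and 3, and tracing "another_expr" reports nodes 1 and 5, which corresponds
to the writes at (1) and (2).  A real walk would also have to handle
aliasing, function calls, and symbolic values, which this sketch ignores.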

This option could be generally useful for several clients interested  
in "taint" analysis.  Down the line we could build a simpler interface  
to this information that layers on top of the ExplodedGraph, and have  
the underlying machinery do all the hard work.  For example, the
interface could answer this query: for a given Expr evaluated at an
ExplodedNode, what is the immediate set of preceding ExplodedNodes
(and corresponding Exprs) from which the value of that Expr is
derived?  This relation is itself a graph that could be walked, but
at a slightly higher level than walking the ExplodedGraph
directly.
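
One possible shape for such an interface, purely as a sketch: no such class
exists in the analyzer, and DerivationGraph, ValueAt, addDerivation,
immediateSources and origins are hypothetical names.  Node ids are plain
integers and expressions are strings, standing in for ExplodedNodes and
Exprs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: a value is identified by (node id, expression text).
using ValueAt = std::pair<int, std::string>;

class DerivationGraph {
  std::map<ValueAt, std::set<ValueAt>> preds_;

public:
  // Record that the value `to` is derived directly from the value `from`.
  void addDerivation(const ValueAt &to, const ValueAt &from) {
    assert(to != from && "a value cannot derive from itself");
    preds_[to].insert(from);
  }

  // The query described above: immediate predecessors of a value.
  std::set<ValueAt> immediateSources(const ValueAt &v) const {
    auto it = preds_.find(v);
    return it == preds_.end() ? std::set<ValueAt>{} : it->second;
  }

  // Walk the relation backwards to the values that have no predecessors.
  std::set<ValueAt> origins(const ValueAt &v) const {
    std::set<ValueAt> seen, result;
    std::vector<ValueAt> work{v};
    while (!work.empty()) {
      ValueAt cur = work.back();
      work.pop_back();
      if (!seen.insert(cur).second)
        continue; // already visited
      std::set<ValueAt> srcs = immediateSources(cur);
      if (srcs.empty())
        result.insert(cur);
      for (const ValueAt &s : srcs)
        work.push_back(s);
    }
    return result;
  }
};
```

A client would ask for `immediateSources` one step at a time, or use
something like `origins` to jump straight to where a value first came from,
without touching the ExplodedGraph itself.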

Another option is for us to build support for general "taint" tracking  
into GRExprEngine and friends, with core transfer function logic doing  
some of the taint propagation.  This might be useful anyway, but would  
add more complexity not required by all clients.  It also might reduce  
the amount of path caching done by the analyzer.

On Sep 29, 2009, at 7:50 AM, Olaf Krzikalla wrote:

> Hi @clang,
>
> I need to analyze some (rather simple) data flow at AST level. While I
> could do it on my own, I have the strange feeling that the
> MemRegionManager already provides a lot of the means I need. However, I
> couldn't figure out how to use it. At the moment it seems to be used
> for diagnostics only.
> Here is a code example:
>
> struct A { int a, b; };
>
> void foo(int* sink)
> {
>  A temp;
>  A* ptr = &temp;
>
>  temp.a = /*expr*/;
>  temp.b = /* another_expr */;
>  *sink = temp.a;   // (1)
>  *sink = ptr->b;   // (2)
> }
>
> I want to know that at (1) actually the result of expr is written to
> sink and that at (2) the result of another_expr is written to sink. Can
> I somehow compute this using clang's static analyzer?
>
>
> Best regards
> Olaf Krzikalla