[cfe-dev] DataFlowAnalysis in Clang

Ted Kremenek kremenek at apple.com
Fri Jan 23 12:39:00 PST 2009


On Jan 23, 2009, at 2:26 AM, Simone Pellegrini wrote:

> Dear all,
> I am currently evaluating tools to implement some high
> level-code-transformations directly in the source code. Clang seems to
> be a very nice platform for doing that, the only thing which is  
> lacking
> is Data Dependency Analysis. I really need to have stuff like DefUse
> chains... and so on! As I cannot see anything like that in the  
> Analysis
> module I wanted to start to implement them from scratch by myself.
>
> Before start this adventure I want to be sure such structures aren't
> already implemented in clang. I guess that this kind of analysis is  
> not
> inline with the philosophy of the clang project, I saw that most of  
> the
> transformations (loop transformations/constants propagation) are meant
> to be applied at LLVM bytecode level. However for the transformation I
> am interested in I really need to work at source code level; lot of
> information is lost when it's converted (like openmp pragmas... and so
> on) and that's not good for us! :) If you have any kind of pointer  
> about
> the implementation of dataflow analysis in clang... you are really  
> welcome!
>
> many thanks for the awesome work you are doing with clang, Simone

Clang contains both flow-sensitive and path-sensitive dataflow  
analysis engines.  They can be found in include/clang/Analysis.  The  
flow-sensitive engine is used by LiveVariables.cpp and  
UninitializedValues.cpp respectively.

Both dataflow analysis engines use source-level CFGs that model both  
the explicit and implicit control-flow of C.

The flow-sensitive engine associates dataflow values with "block-level  
expressions" (which are expressions that appear as a "top-level"  
expression in the basic block of the source-level CFG).  Using -cfg- 
dump and -cfg-view on a source file will give you a better idea of  
what these concepts mean.  The dataflow solver uses a merge operate to  
merge values at confluence points and uses a worklist algorithm to  
compute dataflow values.  The flow-sensitive solver is functional; it  
definitely works and has been tested, but it has been highly optimized  
for performance (there simply hasn't been a need yet).

The path-sensitive engine is used by the static analyzer, is fairly  
complicated. It uses the notion of transfer functions to update  
dataflow state, but the program analysis task is modeled as a graph  
reachability problem in the state space graph of a program rather than  
associating dataflow values with sections of the CFG.



More information about the cfe-dev mailing list