[cfe-dev] Dataflow analysis with LLVM/Clang

Ted Kremenek kremenek at apple.com
Fri Oct 17 14:38:39 PDT 2008


On Oct 17, 2008, at 3:21 AM, João Paulo Rechi Vita wrote:

> I was looking at Clang's source code, and I found out there are two
> ways of calling my analysis:
>
> A) Write a stand-alone program, which would link to clang's libs;
> B) Add a command-line option to clang driver, which would call my  
> analysis code.

It's easy to do B using the AnalysisConsumer interface (see below).   
Doing 'A' is only as hard as coming up with the ASTs to feed to the  
analysis routines.

> AFAIK, my analysis code will have to do something like this:
>
> 1) Parse the source file(s), and generate the AST;
> 2) Generate a CFG from the AST;
> 3) Do a dataflow analysis over the CFG, and insert some statements on
> it (which I think, in turn, modifies the AST also);

The CFG is just a "view" of the control flow relationships of the  
statements in the AST.  Inserting statements into the CFG (which you  
cannot do) does not modify the original AST.  You can insert  
statements into the original AST, but those changes will not be  
reflected in the CFG until you rebuild the CFG.
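
To make the "view" relationship concrete, here is a minimal toy sketch.
It does not use Clang's classes: Stmt, FunctionBody, CFGView and
buildCFG are hypothetical stand-ins invented for illustration.  The
"CFG" only stores pointers to statements owned by the "AST", so an
insertion into the AST is invisible to it until it is rebuilt.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; these are NOT Clang's Stmt or CFG classes.
struct Stmt {
  std::string text;
};

// Toy "AST": owns the statements of one function body, in source order.
struct FunctionBody {
  std::vector<std::unique_ptr<Stmt>> stmts;

  // Insert a new statement before position `pos` (pos == size appends).
  void insert(std::size_t pos, std::string text) {
    assert(pos <= stmts.size());
    stmts.insert(stmts.begin() + static_cast<std::ptrdiff_t>(pos),
                 std::make_unique<Stmt>(Stmt{std::move(text)}));
  }
};

// Toy "CFG": one basic block that merely points at AST-owned statements.
struct CFGView {
  std::vector<const Stmt*> block;
};

// Build a fresh view from the current state of the AST.
CFGView buildCFG(const FunctionBody& body) {
  CFGView cfg;
  for (const auto& s : body.stmts)
    cfg.block.push_back(s.get());
  return cfg;
}
```

In this toy model, a view built before a call to insert() still lists
only the old statements; calling buildCFG() again picks up the new one.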

>
> 4) Re-generate C code from the modified AST;
>
> Since steps 1, 2 and 4 are already done on clang,

3 is also done.  The "DataflowSolver" class implements a  
flow-sensitive, intra-procedural dataflow analysis engine.  Check out  
the LiveVariables and UninitializedValues examples.

Alternatively, GRExprEngine is a path-sensitive dataflow engine.
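
For readers unfamiliar with the flow-sensitive, intra-procedural kind
of analysis mentioned above, here is a rough, self-contained sketch of
a backward liveness computation over a toy CFG.  It is not the
DataflowSolver API and is not how Clang implements its analyses; Block,
Liveness and computeLiveness are hypothetical names, and the use/def
sets are assumed to be precomputed per block.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy CFG node; none of these names come from Clang.
struct Block {
  std::set<std::string> use;       // variables read before any write here
  std::set<std::string> def;       // variables written in this block
  std::vector<std::size_t> succs;  // indices of successor blocks
};

struct Liveness {
  std::vector<std::set<std::string>> liveIn, liveOut;
};

// Iterate the standard liveness equations to a fixed point:
//   liveOut[b] = union of liveIn[s] over all successors s of b
//   liveIn[b]  = use[b] union (liveOut[b] minus def[b])
Liveness computeLiveness(const std::vector<Block>& cfg) {
  Liveness r;
  r.liveIn.resize(cfg.size());
  r.liveOut.resize(cfg.size());
  bool changed = true;
  while (changed) {
    changed = false;
    // Visiting blocks in reverse index order tends to converge faster
    // for a backward problem, but any order reaches the same result.
    for (std::size_t i = cfg.size(); i-- > 0;) {
      std::set<std::string> out;
      for (std::size_t s : cfg[i].succs) {
        assert(s < cfg.size());
        out.insert(r.liveIn[s].begin(), r.liveIn[s].end());
      }
      std::set<std::string> in = cfg[i].use;
      for (const std::string& v : out)
        if (cfg[i].def.count(v) == 0)
          in.insert(v);
      if (in != r.liveIn[i] || out != r.liveOut[i]) {
        r.liveIn[i] = std::move(in);
        r.liveOut[i] = std::move(out);
        changed = true;
      }
    }
  }
  return r;
}
```

The result is flow-sensitive in that each block gets its own live-in
and live-out sets, and intra-procedural in that only one function's CFG
is considered.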

> it seems to make
> sense to go with choice B, but I couldn't find how difficult it would
> be to add one command line option to the driver. On the other side,
> going with A will make my code more independent of clang's driver
> changes, and it seems I can copy most of steps 1 and 2 from the
> driver. What do you guys think would be the best option? And also,
> is there any other example of use of Clang's Analysis engine,
> besides the driver?

It's fairly easy to write an analysis that uses the "AnalysisConsumer"  
interface.  Look at AnalysisConsumer.cpp and Analyses.def in the  
"Driver/" directory.  The Analyses.def file allows you to  
declaratively define your analysis option for the driver, the scope of  
the analysis (e.g., does it run on functions, does it require the  
entire translation unit, etc.), and the "Action" function called by  
AnalysisConsumer.

For example, here is one action function defined in  
AnalysisConsumer.cpp:

static void ActionWarnObjCDealloc(AnalysisManager& mgr) {
   if (mgr.getLangOptions().getGCMode() == LangOptions::GCOnly)
     return;

   BugReporter BR(mgr);

   CheckObjCDealloc(cast<ObjCImplementationDecl>(mgr.getCodeDecl()),
                    mgr.getLangOptions(), BR);
}

and here is the corresponding line in Analyses.def:

ANALYSIS(WarnObjCDealloc, "warn-objc-missing-dealloc",
  "Warn about Objective-C classes that lack a correct implementation of -dealloc",
  ObjCImplementation)
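
As a rough illustration of how a declarative entry like this can drive
both an enum and a dispatch table, here is a self-contained sketch of
the general "X-macro" technique.  It is not the actual
AnalysisConsumer.cpp code: ToyManager, the Toy* analyses, their flags
and runAnalysis are all hypothetical, and the list lives in a macro
only so the sketch fits in one file (a real .def file would be pulled
in with #include instead).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are Clang's types or analyses.
enum class Scope { Code, ObjCImplementation, TranslationUnit };

struct ToyManager {
  std::vector<std::string> log;  // records which actions ran
};

void ActionToyCountStmts(ToyManager& mgr) {
  mgr.log.push_back("toy-count-stmts ran");
}
void ActionToyWarnDealloc(ToyManager& mgr) {
  mgr.log.push_back("toy-warn-dealloc ran");
}

// Stand-in for the contents of a .def file: one entry per analysis.
#define TOY_ANALYSES(X)                                                \
  X(ToyCountStmts, "toy-count-stmts", "Count statements", Code)        \
  X(ToyWarnDealloc, "toy-warn-dealloc", "Toy dealloc check",           \
    ObjCImplementation)

// Expansion 1: an enumerator per analysis.
enum Analyses {
#define AS_ENUM(NAME, FLAG, DESC, SCOPE) NAME,
  TOY_ANALYSES(AS_ENUM)
#undef AS_ENUM
  NumAnalyses
};

struct AnalysisInfo {
  const char* flag;
  const char* desc;
  Scope scope;
  void (*action)(ToyManager&);
};

// Expansion 2: a table row per analysis, wired to Action<NAME>.
const AnalysisInfo kTable[] = {
#define AS_ROW(NAME, FLAG, DESC, SCOPE) \
  {FLAG, DESC, Scope::SCOPE, &Action##NAME},
  TOY_ANALYSES(AS_ROW)
#undef AS_ROW
};

// Run the action registered under `flag`; returns false if unknown.
bool runAnalysis(const std::string& flag, ToyManager& mgr) {
  for (const AnalysisInfo& info : kTable) {
    assert(info.action != nullptr);
    if (flag == info.flag) {
      info.action(mgr);
      return true;
    }
  }
  return false;
}
```

With this arrangement, adding an analysis to the toy amounts to writing
one Action function and one list entry, which mirrors the two-file
change described here.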

In the future we may move to a more dynamic, plug-in model for  
defining new analyses for the analyzer.  Because modifying these two  
files involves only very localized changes, I don't think you'll have  
many issues with merging, and it's fairly easy to plug a new analysis  
into the driver within a couple of minutes.





