[cfe-dev] More static analysis...

Thu Feb 26 22:02:13 PST 2009

On Feb 26, 2009, at 2:18 AM, Ben Laurie wrote:

> On Wed, Feb 25, 2009 at 5:27 PM, Ted Kremenek <kremenek at apple.com>  
> wrote:
>>
>> On Feb 25, 2009, at 8:56 AM, Ben Laurie wrote:
>>
>>> I'm interested in looking at detecting "known bad" patterns, for  
>>> example:
>>>
>>> (<expr> & 0) == 0 (this example is stolen from FindBugs)

Quick question.  Did you literally want to look for this syntax in the  
AST (i.e., the integer literal '0') or did you want to use the  
dataflow analysis to see cases when an expression evaluates to '0' and  
then is bitwise-ANDed with <expr>?  Is the motivation of this check is  
that it is trivially true?

>>> if(<non-boolean value>) (cause of recent OpenSSL vuln)

This seems a little too vague to me.  People check non-boolean values  
all the time in 'if' conditions.  Were you thinking about cases when  
<non boolean value> was negative?  Positive numbers seem completely  
legit to me (e.g., some flag was set in a bitmask, etc.).

In either case, both of these seem like "auditor" checks.  For the  
first check if all you wanted to do is grep the syntax, we can easily  
add another check that gets called by AnalysisConsumer.  Check out the  
functions "ActionXXX" in AnalysisConsumer.cpp to see all the places  
where the various checks are called.  An ActionXXX can get called for  
each function, an entire Objective-C class, or the entire translation  
unit.  For example:

static void ActionWarnDeadStores(AnalysisManager& mgr) {
   if (LiveVariables* L = mgr.getLiveVariables()) {
     BugReporter BR(mgr);
     CheckDeadStores(*L, BR);
   }
}

This action gets called for each function/method declaration.  The  
action queries the AnalysisManager (the thing that runs the checks)  
for the live variables analysis result on the current function (which  
gets computed on-the-fly).  The function 'CheckDeadStores' is then  
called with (a) the results of the live variables analysis and (b) and  
BugReporter object which is used to issue bug diagnostics to the  
user.  Analyses get registered in the Analyses.def file, where the  
"scope" of the analysis is declaratively specified.  For example:

ANALYSIS(WarnDeadStores, "warn-dead-stores",
          "Warn about stores to dead variables", Code)

Here "Code" means "code declaration"; that is a FunctionDecl or an  
ObjCMethodDecl.

A syntactic check can just walk the AST.  A good example of a  
syntactic check is CheckObjCDealloc.cpp.  Although this check will one  
day be revamped to do more real analysis, you can see how it walks the  
AST using iterators and does various pattern matching.  Syntactic  
checks are fairly easy to implement in this way and hook up to  
AnalysisConsumer.  The down-side is that they can be fairly dumb if  
the property you are checking requires any real smarts.

Another possible way to implement an "auditor" is to use the GRAuditor  
interface defined in GRAuditor.h:

template <typename STATE>
class GRAuditor {
public:
   typedef ExplodedNode<STATE>       NodeTy;
   typedef typename STATE::ManagerTy ManagerTy;

   virtual ~GRAuditor() {}
   virtual bool Audit(NodeTy* N, ManagerTy& M) = 0;
};

Checks that subclass GRAuditor<const GRState*> can be registered with  
a GRExprEngine and have their "Audit" method called after the path  
engine evaluates all expressions of a given type.  The Auditor is then  
free to inspect the analysis state at that point and flag errors.  A  
good example of this kind of check is in  
BasicObjCFoundationChecks.cpp.  The class in question is  
AuditCFNumberCreate, which flags misuses of the function  
CFNumberCreate (http://developer.apple.com/documentation/CoreFOundation/Reference/CFNumberRef/Reference/reference.html 
).  That function just interposes on the path engine at every CallExpr  
and checks the arguments in the function call for illegal values.

This is probably enough for one email.  I'm more than happy to dig  
into more details depending on what direction you decide to go in.   
Also, these interfaces are obviously evolving.  We haven't written  
many checks, so any idea of how to refactor the interface, make it  
better, etc., are welcome.