[cfe-dev] More static analysis...

Mon Mar 30 10:53:48 PDT 2009

On Mar 30, 2009, at 7:32 AM, Ben Laurie wrote:

> On Mon, Mar 30, 2009 at 1:14 PM, Zhongxing Xu  
> <xuzhongxing at gmail.com> wrote:
>>
>>
>> 2009/3/30 Ben Laurie <benl at google.com>
>>>
>>> On Fri, Feb 27, 2009 at 6:02 AM, Ted Kremenek <kremenek at apple.com>  
>>> wrote:
>>>>
>>>> On Feb 26, 2009, at 2:18 AM, Ben Laurie wrote:
>>>>
>>>>> On Wed, Feb 25, 2009 at 5:27 PM, Ted Kremenek <kremenek at apple.com>
>>>>> wrote:
>>>>>>
>>>>>> On Feb 25, 2009, at 8:56 AM, Ben Laurie wrote:
>>>>>>
>>>>>>> I'm interested in looking at detecting "known bad" patterns, for
>>>>>>> example:
>>>>>>>
>>>>>>> (<expr> & 0) == 0 (this example is stolen from FindBugs)
>>>>
>>>> Quick question.  Did you literally want to look for this syntax  
>>>> in the
>>>> AST
>>>> (i.e., the integer literal '0') or did you want to use the dataflow
>>>> analysis
>>>> to see cases when an expression evaluates to '0' and then is
>>>> bitwise-ANDed
>>>> with <expr>?  Is the motivation of this check is that it is  
>>>> trivially
>>>> true?
>>>>
>>>>>>> if(<non-boolean value>) (cause of recent OpenSSL vuln)
>>>>
>>>> This seems a little too vague to me.  People check non-boolean  
>>>> values
>>>> all
>>>> the time in 'if' conditions.  Were you thinking about cases when  
>>>> <non
>>>> boolean value> was negative?  Positive numbers seem completely  
>>>> legit to
>>>> me
>>>> (e.g., some flag was set in a bitmask, etc.).
>>>>
>>>> In either case, both of these seem like "auditor" checks.  For  
>>>> the first
>>>> check if all you wanted to do is grep the syntax, we can easily add
>>>> another
>>>> check that gets called by AnalysisConsumer.  Check out the  
>>>> functions
>>>> "ActionXXX" in AnalysisConsumer.cpp to see all the places where the
>>>> various
>>>> checks are called.  An ActionXXX can get called for each  
>>>> function, an
>>>> entire
>>>> Objective-C class, or the entire translation unit.  For example:
>>>>
>>>> static void ActionWarnDeadStores(AnalysisManager& mgr) {
>>>>  if (LiveVariables* L = mgr.getLiveVariables()) {
>>>>    BugReporter BR(mgr);
>>>>    CheckDeadStores(*L, BR);
>>>>  }
>>>> }
>>>>
>>>> This action gets called for each function/method declaration.  The
>>>> action
>>>> queries the AnalysisManager (the thing that runs the checks) for  
>>>> the
>>>> live
>>>> variables analysis result on the current function (which gets  
>>>> computed
>>>> on-the-fly).  The function 'CheckDeadStores' is then called with  
>>>> (a) the
>>>> results of the live variables analysis and (b) and BugReporter  
>>>> object
>>>> which
>>>> is used to issue bug diagnostics to the user.  Analyses get  
>>>> registered
>>>> in
>>>> the Analyses.def file, where the "scope" of the analysis is
>>>> declaratively
>>>> specified.  For example:
>>>>
>>>> ANALYSIS(WarnDeadStores, "warn-dead-stores",
>>>>         "Warn about stores to dead variables", Code)
>>>>
>>>> Here "Code" means "code declaration"; that is a FunctionDecl or an
>>>> ObjCMethodDecl.
>>>>
>>>> A syntactic check can just walk the AST.  A good example of a  
>>>> syntactic
>>>> check is CheckObjCDealloc.cpp.  Although this check will one day be
>>>> revamped
>>>> to do more real analysis, you can see how it walks the AST using
>>>> iterators
>>>> and does various pattern matching.  Syntactic checks are fairly  
>>>> easy to
>>>> implement in this way and hook up to AnalysisConsumer.  The down- 
>>>> side is
>>>> that they can be fairly dumb if the property you are checking  
>>>> requires
>>>> any
>>>> real smarts.
>>>>
>>>> Another possible way to implement an "auditor" is to use the  
>>>> GRAuditor
>>>> interface defined in GRAuditor.h:
>>>>
>>>> template <typename STATE>
>>>> class GRAuditor {
>>>> public:
>>>>  typedef ExplodedNode<STATE>       NodeTy;
>>>>  typedef typename STATE::ManagerTy ManagerTy;
>>>>
>>>>  virtual ~GRAuditor() {}
>>>>  virtual bool Audit(NodeTy* N, ManagerTy& M) = 0;
>>>> };
>>>>
>>>> Checks that subclass GRAuditor<const GRState*> can be registered  
>>>> with a
>>>> GRExprEngine and have their "Audit" method called after the path  
>>>> engine
>>>> evaluates all expressions of a given type.  The Auditor is then  
>>>> free to
>>>> inspect the analysis state at that point and flag errors.  A good
>>>> example of
>>>> this kind of check is in BasicObjCFoundationChecks.cpp.  The  
>>>> class in
>>>> question is AuditCFNumberCreate, which flags misuses of the  
>>>> function
>>>> CFNumberCreate
>>>>
>>>> (http://developer.apple.com/documentation/CoreFOundation/Reference/CFNumberRef/Reference/reference.html 
>>>> ).
>>>>  That function just interposes on the path engine at every  
>>>> CallExpr and
>>>> checks the arguments in the function call for illegal values.
>>>>
>>>> This is probably enough for one email.  I'm more than happy to  
>>>> dig into
>>>> more
>>>> details depending on what direction you decide to go in.  Also,  
>>>> these
>>>> interfaces are obviously evolving.  We haven't written many  
>>>> checks, so
>>>> any
>>>> idea of how to refactor the interface, make it better, etc., are
>>>> welcome.
>>>
>>> I found some time to take a look at this today, and I'm puzzled: I
>>> can't see where auditors choose what type they are interested in?
>>
>> Is this what you are looking for:
>>
>> GRExprEngineInternalChecks.cpp:470:
>>     AddCheck(new CheckAttrNonNull(BR), Stmt::CallExprClass);
>>
>
> Ah, yes, thanks!

Hi Ben,

I just remembered that the auditor interface doesn't work for "branch"  
statements like IfStmt, ForStmt, WhileStmt, etc., since that uses a  
different "builder" interface.  Also, what you are interested in  
auditing is the branch condition, not the branch itself.  This can be  
done using the current auditor interface, with the auditor checking if  
a given expression is the condition for an IfStmt by using a  
ParentMap.  I suggest doing this for an initial implementation.  The  
problem is that "AddCheck" expects a StmtClass value, and there is no  
StmtClass value for "all expressions".

To me both these points indicate that the auditor interface needs to  
be evolved to make it more useful.  I think Auditors needs a  
declarative way where they indicate what expressions they are  
interested in auditing.  AddCheck would just take an auditor (and no  
StmtClass value) and then figure out when the auditor should be called  
by interrogating the auditor.  This interface should potentially be  
rich enough to say "only audit expressions used as conditions for  
branches", etc.  As I mentioned in my earlier email, the Auditor  
interface isn't extensively used yet, so it certainly can and should  
be evolved.

I just added a version of "AddCheck" that just takes an auditor an no  
StmtClass (this should help you Ben with implementing your check):

   http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20090330/014758.html

Auditors registered using this version of AddCheck will be called for  
every non-terminator statement/expression that appears within a basic  
block.  As I said before, branches are not directly audited (because  
that part of the analysis engine literally just handles the branching  
of the paths *after* the branch condition has already been evaluated),  
but the branch conditions themselves will still get passed to the  
auditor.