[cfe-dev] More static analysis...

Fri Feb 27 08:27:39 PST 2009

On Fri, Feb 27, 2009 at 6:02 AM, Ted Kremenek <kremenek at apple.com> wrote:
>
> On Feb 26, 2009, at 2:18 AM, Ben Laurie wrote:
>
>> On Wed, Feb 25, 2009 at 5:27 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>>
>>> On Feb 25, 2009, at 8:56 AM, Ben Laurie wrote:
>>>
>>>> I'm interested in looking at detecting "known bad" patterns, for
>>>> example:
>>>>
>>>> (<expr> & 0) == 0 (this example is stolen from FindBugs)
>
> Quick question.  Did you literally want to look for this syntax in the AST
> (i.e., the integer literal '0') or did you want to use the dataflow analysis
> to see cases when an expression evaluates to '0' and then is bitwise-ANDed
> with <expr>?

Not sure I'd care either way.

>  Is the motivation of this check is that it is trivially true?

Right - and probably incorrect.

>
>>>> if(<non-boolean value>) (cause of recent OpenSSL vuln)
>
> This seems a little too vague to me.  People check non-boolean values all
> the time in 'if' conditions.  Were you thinking about cases when <non
> boolean value> was negative?  Positive numbers seem completely legit to me
> (e.g., some flag was set in a bitmask, etc.).

I'd be interested to see how often this actually comes up (as opposed
to <non-boolean value>&<bitmask>). The bug in OpenSSL was a function
that returned 1 for success, 0 for failure and -1 for error.
if(func(...)) obviously is trying to test for success but includes
error. Which is bad.

My claim is that if(x) is usually one of:

x is a pointer, and we are testing for non-NULL (which is fine).
x is boolean.
x is something else, and this is a bug.

I agree that people _may_ use if(x) as shorthand for if(x != 0), but I
am from the school of thought that says that asking people to make
their code clearer is not a sin, even if you are a static analyzer.

> In either case, both of these seem like "auditor" checks.  For the first
> check if all you wanted to do is grep the syntax, we can easily add another
> check that gets called by AnalysisConsumer.  Check out the functions
> "ActionXXX" in AnalysisConsumer.cpp to see all the places where the various
> checks are called.  An ActionXXX can get called for each function, an entire
> Objective-C class, or the entire translation unit.  For example:
>
> static void ActionWarnDeadStores(AnalysisManager& mgr) {
>  if (LiveVariables* L = mgr.getLiveVariables()) {
>    BugReporter BR(mgr);
>    CheckDeadStores(*L, BR);
>  }
> }
>
> This action gets called for each function/method declaration.  The action
> queries the AnalysisManager (the thing that runs the checks) for the live
> variables analysis result on the current function (which gets computed
> on-the-fly).  The function 'CheckDeadStores' is then called with (a) the
> results of the live variables analysis and (b) and BugReporter object which
> is used to issue bug diagnostics to the user.  Analyses get registered in
> the Analyses.def file, where the "scope" of the analysis is declaratively
> specified.  For example:
>
> ANALYSIS(WarnDeadStores, "warn-dead-stores",
>         "Warn about stores to dead variables", Code)
>
> Here "Code" means "code declaration"; that is a FunctionDecl or an
> ObjCMethodDecl.
>
> A syntactic check can just walk the AST.  A good example of a syntactic
> check is CheckObjCDealloc.cpp.  Although this check will one day be revamped
> to do more real analysis, you can see how it walks the AST using iterators
> and does various pattern matching.  Syntactic checks are fairly easy to
> implement in this way and hook up to AnalysisConsumer.  The down-side is
> that they can be fairly dumb if the property you are checking requires any
> real smarts.
>
> Another possible way to implement an "auditor" is to use the GRAuditor
> interface defined in GRAuditor.h:
>
> template <typename STATE>
> class GRAuditor {
> public:
>  typedef ExplodedNode<STATE>       NodeTy;
>  typedef typename STATE::ManagerTy ManagerTy;
>
>  virtual ~GRAuditor() {}
>  virtual bool Audit(NodeTy* N, ManagerTy& M) = 0;
> };
>
> Checks that subclass GRAuditor<const GRState*> can be registered with a
> GRExprEngine and have their "Audit" method called after the path engine
> evaluates all expressions of a given type.  The Auditor is then free to
> inspect the analysis state at that point and flag errors.  A good example of
> this kind of check is in BasicObjCFoundationChecks.cpp.  The class in
> question is AuditCFNumberCreate, which flags misuses of the function
> CFNumberCreate
> (http://developer.apple.com/documentation/CoreFOundation/Reference/CFNumberRef/Reference/reference.html).
>  That function just interposes on the path engine at every CallExpr and
> checks the arguments in the function call for illegal values.
>
> This is probably enough for one email.  I'm more than happy to dig into more
> details depending on what direction you decide to go in.  Also, these
> interfaces are obviously evolving.  We haven't written many checks, so any
> idea of how to refactor the interface, make it better, etc., are welcome.

Wow. OK, I'm going to have to play with code before I can comment, but
thanks for the detailed info!